Inconsistency-aware quantification, query answering and ranking in relational databases

Abstract : Inconsistency problems in databases and knowledge bases have been widely studied and discussed over the last forty years. Inconsistency is one of the main dimensions of data quality. In our era, data is the new gold, but data without quality, or without quality measures, is a burden that leads to erroneous and uninformative analysis results. The inconsistency problem arises when a set of constraints that have to be satisfied by the database instance are violated by this instance. All the previous works that deal with the problem of inconsistency focus on either the repair of the inconsistent database to obtain a new, consistent database (i.e., one with no constraint violations), or the quantification of inconsistency in the entire database. In this thesis, we propose a new approach to handling inconsistency in relational databases: we quantify it at the level of tuples, and then rank tuples/answers according to their inconsistency, so that the most consistent/inconsistent query answers can be chosen. To this end, we define several new measures of inconsistency degrees that are based either on tuple violations (tuple-based approach) or on constraint violations (constraint-based approach). We consider denial constraints as the class of constraints and conjunctive queries as the class of queries. We leverage why-provenance to identify inconsistent tuples and polynomial provenance to compute the inconsistency degrees of query answers. We convert each denial constraint into a boolean conjunctive query and evaluate it on the database to compute the why-provenance of the answer true. Using why-provenance, each tuple in the database is annotated, in monomial form, with the set of constraints that it violates and its identifiers (a tuple that is not involved in the violation of any constraint is annotated with the monomial 1). The result is an annotated database.
Given a conjunctive query Q, Q is evaluated on the annotated database and each answer is computed together with a polynomial provenance that encodes, in a polynomial formula, the set of constraints violated by the answer as well as the set of tuples that are used to compute the answer and are involved in the violation of these constraints. We then define twelve measures of inconsistency degrees using the polynomial provenance of answers. Once measures of inconsistency are defined, it becomes interesting to rank answers (tuples in the database) according to their inconsistency degrees. We design a set of top-k algorithms, including TopINC, on whose idea the other algorithms are based, that rank query answers according to their inconsistency degrees. We introduce a new class of algorithms with a new cost model and show the optimality of these top-k algorithms under some specific conditions. For each top-k algorithm, we also give its theoretical complexity. We conducted a large experimental study to show the feasibility of our approach in practice and the efficiency of the top-k algorithms we developed.
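The pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy relation EMP, the two denial constraints c1 and c2, the projection query Q(name) :- Emp(name, city), and the measure (the number of distinct constraints that appear in an answer's provenance), which is only a stand-in for one of the twelve measures. The ranking step uses a plain heap-based selection, not TopINC.

```python
import heapq
from itertools import combinations

# Hypothetical toy relation: tuple id -> (name, city).
EMP = {
    "t1": ("alice", "Paris"),
    "t2": ("alice", "Lyon"),
    "t3": ("bob", "Paris"),
    "t4": ("carol", "Nice"),
    "t5": ("carol", "Nice"),
}

# Hypothetical denial constraints, written as the Boolean conjunctive query
# that detects a violation. Both predicates are symmetric, so checking
# unordered pairs of tuples is enough.
CONSTRAINTS = {
    # c1: no two tuples with the same name and different cities.
    "c1": lambda a, b: a[0] == b[0] and a[1] != b[1],
    # c2: no two tuples with different names located in Paris.
    "c2": lambda a, b: a[1] == b[1] == "Paris" and a[0] != b[0],
}


def why_provenance(pred, rel):
    """Return the witness sets (pairs of tuple ids) that make the query true."""
    return [
        frozenset((i, j))
        for i, j in combinations(sorted(rel), 2)
        if pred(rel[i], rel[j])
    ]


def annotate(rel, constraints):
    """Map each tuple id to the set of constraints it is involved in violating.

    An empty set plays the role of the monomial 1.
    """
    ann = {tid: set() for tid in rel}
    for name, pred in constraints.items():
        for witness in why_provenance(pred, rel):
            for tid in witness:
                ann[tid].add(name)
    return ann


def answer_provenance(rel, ann):
    """Evaluate Q(name) :- Emp(name, city); one monomial per deriving tuple."""
    prov = {}
    for tid, (name, _city) in rel.items():
        prov.setdefault(name, []).append((tid, frozenset(ann[tid])))
    return prov


def degree(monomials):
    """Illustrative measure: distinct constraints occurring in the provenance."""
    return len(set().union(*(cs for _tid, cs in monomials)))


def top_k_consistent(prov, k):
    """Return the k answers with the lowest degree (ties broken by answer)."""
    return heapq.nsmallest(k, prov.items(), key=lambda kv: (degree(kv[1]), kv[0]))


annotations = annotate(EMP, CONSTRAINTS)
provenance = answer_provenance(EMP, annotations)
ranking = [(name, degree(m)) for name, m in top_k_consistent(provenance, 3)]
print(ranking)  # [('carol', 0), ('bob', 1), ('alice', 2)]
```

In this toy instance, t1 is involved in violations of both constraints, so the answer alice, which is derived from t1 or t2, ranks as the least consistent, while carol is derived only from tuples annotated with the monomial 1.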
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-03859885
Contributor : ABES STAR
Submitted on : Friday, November 18, 2022 - 2:00:12 PM
Last modification on : Saturday, November 19, 2022 - 4:04:20 AM

File

2022UCFAC010_ISSA.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-03859885, version 1

Citation

Ousmane Issa. Inconsistency-aware quantification, query answering and ranking in relational databases. Databases [cs.DB]. Université Clermont Auvergne, 2022. English. ⟨NNT : 2022UCFAC010⟩. ⟨tel-03859885⟩
