Discovering and exploiting analogical proportions in a relational database context

Abstract : In this thesis, we are interested in the notion of analogical proportions in a relational database context. An analogical proportion is a statement of the form “A is to B as C is to D”, expressing that the relation between A and B is the same as the relation between C and D. For instance, one may say that “Paris is to France as Rome is to Italy”. In the first part, we studied the problem of imputing missing values in a relational database by means of analogical proportions. To that end, we modified a classification algorithm based on analogical proportions so that it imputes missing values. We then studied how analogical classifiers work, in order to see whether their processing could be simplified. We showed that some types of analogical proportions are more useful than others when performing classification. Using this information, we proposed an algorithm that considerably reduces the size of the training set used by the analogical classification algorithm, and hence its execution time. In the second part of this thesis, we paid particular attention to the mining of combinations of four tuples bound by an analogical relationship. To do so, we used several clustering algorithms and proposed modifications to them so that each resulting cluster represents a set of analogical proportions. Using the results of the clustering algorithms, we studied how to efficiently retrieve the analogical proportions in a database by means of queries. For this purpose, we proposed an extension of the SQL query language that retrieves from a database the quadruples of tuples satisfying an analogical proportion. We proposed several query evaluation strategies and experimentally compared their performance.
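
To make the notion of “A is to B as C is to D” concrete, here is a minimal Python sketch. It assumes the standard Boolean reading of an analogical proportion, applied attribute by attribute to tuples of 0/1 values; the abstract does not say which formal model the thesis uses, so this is an illustration and not the author's algorithm. All function names (`is_proportion`, `tuples_in_proportion`, `solve`, `impute`) are hypothetical.

```python
from itertools import product


def is_proportion(a, b, c, d):
    """Boolean analogical proportion a : b :: c : d (assumed model).

    It holds when a differs from b exactly as c differs from d.
    """
    a, b, c, d = bool(a), bool(b), bool(c), bool(d)
    return ((a and not b) == (c and not d)) and ((not a and b) == (not c and d))


def tuples_in_proportion(t1, t2, t3, t4):
    """Four tuples are in proportion if every attribute is."""
    return all(is_proportion(a, b, c, d) for a, b, c, d in zip(t1, t2, t3, t4))


def solve(a, b, c):
    """Return x such that a : b :: c : x, or None if no such x exists."""
    if a == b:
        return c
    if a == c:
        return b
    return None


def impute(t1, t2, t3):
    """Guess a fourth tuple attribute by attribute (None where unsolvable)."""
    return tuple(solve(a, b, c) for a, b, c in zip(t1, t2, t3))


# The Boolean patterns (a, b, c, d) that form a valid proportion.
valid_patterns = [p for p in product((0, 1), repeat=4) if is_proportion(*p)]

# Example: complete a fourth tuple from three known tuples.
candidate = impute((1, 0, 1), (1, 1, 1), (0, 0, 1))
```

Under this model, imputing a missing value amounts to solving the equation a : b :: c : x for one attribute, and mining proportions amounts to finding quadruples of tuples for which `tuples_in_proportion` holds. A naive search over all quadruples grows with the fourth power of the table size, which is consistent with the abstract's interest in clustering and in query evaluation strategies.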
Document type :
Theses
Cited literature [105 references]

https://tel.archives-ouvertes.fr/tel-01508503
Contributor : Abes Star
Submitted on : Friday, April 14, 2017 - 11:48:10 AM
Last modification on : Friday, January 11, 2019 - 2:28:05 PM
Long-term archiving on : Saturday, July 15, 2017 - 1:09:41 PM

File

CORREA_William.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01508503, version 1

Citation

William Correa Beltran. Discovering and exploiting analogical proportions in a relational database context. Databases [cs.DB]. Université Rennes 1, 2016. English. ⟨NNT : 2016REN1S110⟩. ⟨tel-01508503⟩

Metrics

Record views : 314
File downloads : 181