Skip to Main content Skip to Navigation

Information Retrieval (IR) Modeling by Logic and Lattice. Application to Conceptual IR

Abstract : This thesis is situated in the context of logic-based Information Retrieval (IR) models. The work presented in this thesis is mainly motivated by the inadequate term-independence assumption, which is well-accepted in IR although terms are normally related, and also by the inferential nature of the relevance judgment process. Since formal logics are well-adapted for knowledge representation, and then for representing relations between terms, and since formal logics are also powerful systems for inference, logic-based IR thus forms a candidate piste of work for building effective IR systems. However, a study of current logic-based IR models shows that these models generally have some shortcomings. First, logic-based IR models normally propose complex, and hard to obtain, representations for documents and queries. Second, the retrieval decision d->q, which represents the matching between a document d and a query q, could be difficult to verify or check. Finally, the uncertainty measure U(d->q) is either ad-hoc or hard to implement. In this thesis, we propose a new logic-based IR model to overcome most of the previous limits. We use Propositional Logic (PL) as an underlying logical framework. We represent documents and queries as logical sentences written in Disjunctive Normal Form. We also argue that the retrieval decision d->q could be replaced by the validity of material implication |= d->q. We then exploit the potential relation between PL and lattice theory to check if d->q is valid or not. We first propose an intermediate representation of logical sentences, where they become nodes in a lattice having a partial order relation that is equivalent to the validity of material implication. Accordingly, we transform the checking of |= d->q, which is a computationally intensive task, to a series of simple set-inclusion checking. In order to measure the uncertainty of the retrieval decision U(d->q), we use the degree of inclusion function Z that is capable of quantifying partial order relations defined on lattices. Finally, our model is capable of working efficiently on any logical sentence without any restrictions, and is applicable to large-scale data. Our model also has some theoretical conclusions, including, formalizing and showing the adequacy of van Rijsbergen assumption about estimating the logical uncertainty U(d->q) through the conditional probability P(q|d), redefining the two notions Exhaustivity and Specificity, and the possibility of reproducing most classical IR models as instances of our model. We build three operational instances of our model. An instance to study the importance of Exhaustivity and Specificity, and two others to show the inadequacy of the term-independence assumption. Our experimental results show worthy gain in performance when integrating Exhaustivity and Specificity into one concrete IR model. However, the results of using semantic relations between terms were not sufficient to draw clear conclusions. On the contrary, experiments on exploiting structural relations between terms were promising. The work presented in this thesis can be developed either by doing more experiments, especially about using relations, or by more in-depth theoretical study, especially about the properties of the Z function.
Document type :
Complete list of metadata

Cited literature [105 references]  Display  Hide  Download
Contributor : Marie-Christine Fauvet <>
Submitted on : Thursday, May 15, 2014 - 4:51:01 PM
Last modification on : Tuesday, December 8, 2020 - 10:42:46 AM
Long-term archiving on: : Friday, August 15, 2014 - 11:25:22 AM


  • HAL Id : tel-00991669, version 1



Karam Abdulahhad. Information Retrieval (IR) Modeling by Logic and Lattice. Application to Conceptual IR. Information Retrieval [cs.IR]. Université de Grenoble, 2014. English. ⟨tel-00991669⟩



Record views


Files downloads