
Modeling and prototyping of an information retrieval system based on the proximity of query term occurrences in documents

Abstract : The sheer volume of digital data sharpens a central challenge of information retrieval (IR): finding a compromise between recall and precision. We propose an IR model based on the fuzzy proximity (FP) of query terms, aimed at high precision. It combines the expressiveness of the Boolean query model with document ranking, which the use of proximity makes possible. At query evaluation time, each keyword defines a zone of influence. The fuzzy operations associated with the traditional Boolean operators propagate the proximity up to the root of the query tree. The FP model was extensively validated on traditional test collections and in the 2005 and 2006 editions of the international IR evaluation campaigns (TREC, CLEF and INEX 2006). With automatically built queries, the results are equivalent to the baselines (Okapi/Lucy and vector/MG). Moreover, with manual queries adapted to FP, the results are better than the baselines.
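The evaluation scheme outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so several choices here are assumptions made for illustration only: a triangular influence function of half-width `k`, `min` for AND and `max` for OR as the fuzzy operations, and a document score obtained by summing the root proximity over all positions. The names `Term`, `And`, `Or`, `term_proximity`, `proximity` and `score` are hypothetical and do not come from the thesis.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Term:
    word: str


@dataclass(frozen=True)
class And:
    left: "Node"
    right: "Node"


@dataclass(frozen=True)
class Or:
    left: "Node"
    right: "Node"


Node = Union[Term, And, Or]


def term_proximity(tokens: List[str], word: str, k: int) -> List[float]:
    """Fuzzy proximity to `word` at every position of the document.

    Assumption: each occurrence at position i has a triangular influence
    max(0, (k - |x - i|) / k); the value at x is the strongest influence.
    """
    positions = [i for i, tok in enumerate(tokens) if tok == word]
    return [
        max((max(0.0, (k - abs(x - i)) / k) for i in positions), default=0.0)
        for x in range(len(tokens))
    ]


def proximity(node: Node, tokens: List[str], k: int) -> List[float]:
    """Propagate proximity from the leaves up to the root of the query tree."""
    if isinstance(node, Term):
        return term_proximity(tokens, node.word, k)
    left = proximity(node.left, tokens, k)
    right = proximity(node.right, tokens, k)
    # Assumption: AND is the fuzzy minimum, OR is the fuzzy maximum.
    combine = min if isinstance(node, And) else max
    return [combine(a, b) for a, b in zip(left, right)]


def score(node: Node, tokens: List[str], k: int = 3) -> float:
    """Assumed document score: sum of the root proximity over all positions."""
    return sum(proximity(node, tokens, k))


query = And(Term("fuzzy"), Term("proximity"))
near = ["fuzzy", "proximity", "model"]
far = ["fuzzy", "a", "b", "c", "d", "proximity"]
print(score(query, near, k=2), score(query, far, k=2))
```

Under these assumptions, a document in which the two query terms are adjacent receives a positive score, while a document in which they are farther apart than the influence zones reach receives zero for the AND query. This is the sense in which proximity lets a Boolean query rank documents instead of merely accepting or rejecting them.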
Document type :
Theses

Cited literature : 128 references

https://tel.archives-ouvertes.fr/tel-00785143
Contributor : Florent Breuil
Submitted on : Tuesday, February 5, 2013 - 2:50:46 PM
Last modification on : Wednesday, June 24, 2020 - 4:18:09 PM
Long-term archiving on : Monday, June 17, 2013 - 7:25:36 PM

Identifiers

  • HAL Id : tel-00785143, version 1

Citation

Annabelle Mercier. Modélisation et prototypage d'un système de recherche d'informations basé sur la proximité des occurences des termes de la requête dans les documents. Modélisation et simulation. Ecole Nationale Supérieure des Mines de Saint-Etienne, 2006. Français. ⟨NNT : 2006EMSE0024⟩. ⟨tel-00785143⟩

Metrics

Record views

375

File downloads

1343