Un système de recherche d'information adapté aux données incertaines : adaptation du modèle de langue (An information retrieval system adapted to uncertain data: adaptation of the language model)

Abstract : An information retrieval system relies on a formal model to decide whether the terms of a document correspond to the terms of a query. Most such systems assume that the terms extracted from documents are perfectly recognized, which implies that their matching function can rely on strict equality between document terms and query terms.
Our work addresses a context in which data are not perfectly recognized and are therefore considered uncertain. In this case, the equality between document terms and query terms may be relaxed to 'almost equality'. We propose an information retrieval system adapted to uncertain data and based on the language model. We introduce the concept of pairing, which measures the 'almost equality' between two terms by means of concordance and intersection values. The pairing is then integrated into the matching function. Furthermore, the matching function is extended to take into account the certainty value of each extracted term, as computed by an interpretation system. Basic assumptions of information retrieval, such as Zipf's law and Luhn's conjecture, are first checked. Then our model is implemented.
Our model is experimentally validated and compared with systems that do not integrate the concept of uncertainty. Finally, we present a tool dedicated to phone meetings, an application built on an information retrieval system adapted to uncertain data.
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-00202702
Contributor : Caroline Tambellini
Submitted on : Monday, January 7, 2008 - 6:13:50 PM
Last modification on : Tuesday, December 8, 2020 - 10:26:07 AM
Long-term archiving on : Tuesday, April 13, 2010 - 3:32:27 PM

Identifiers

  • HAL Id : tel-00202702, version 1

Collections

CNRS | LIG | UJF | UGA

Citation

Caroline Tambellini. Un système de recherche d'information adapté aux données incertaines : adaptation du modèle de langue. Université Joseph-Fourier - Grenoble I, 2007. French. ⟨tel-00202702⟩

Metrics

Record views : 379
File downloads : 972