# Extraction et impact des connaissances sur les performances des systèmes de recherche d'information

Abstract : An information retrieval system aims to find the best possible results in an information-rich context. This study focuses on the knowledge that can be extracted from the content of textual documents by combining a linguistic approach with the capacity of a statistical approach to analyze large corpora. The statistical approach is based on text data mining, more precisely on the association rule technique. The linguistic approach is based on noun phrases, which are considered more adequate than single words for representing document content. It clarifies the linguistic constraints needed to extract noun phrases and makes explicit the syntagmatic relations between the words in a noun phrase. These phrasal relations are exploited to structure noun phrases. A measure, called "information quantity", is proposed to estimate the suggestive power of each noun phrase and to filter and compare noun phrases. The proposed model demonstrates that combining a statistical approach with a linguistic approach refines the extracted knowledge and improves the performance of an information retrieval system.
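As an illustration of the association rule technique named in the abstract, the following is a minimal sketch and not the thesis's actual method. It assumes each document has already been reduced to a list of noun phrases, and it scores rules between pairs of phrases with the standard support and confidence measures. The function name `mine_rules`, the thresholds, and the sample phrases are all hypothetical; the "information quantity" measure is not defined in this record and is therefore not modeled.

```python
from collections import Counter
from itertools import combinations


def mine_rules(documents, min_support=0.5, min_confidence=0.6):
    """Return sorted (antecedent, consequent, support, confidence) tuples.

    Assumption: `documents` is a list of lists of noun phrases (strings).
    Only rules between two single phrases are considered in this sketch.
    """
    n_docs = len(documents)
    term_sets = [set(doc) for doc in documents]

    # Number of documents containing each phrase and each unordered pair.
    single = Counter(t for s in term_sets for t in s)
    pairs = Counter(
        frozenset(p) for s in term_sets for p in combinations(sorted(s), 2)
    )

    rules = []
    for pair, count in pairs.items():
        support = count / n_docs  # fraction of documents containing both
        if support < min_support:
            continue
        a, b = sorted(pair)
        for x, y in ((a, b), (b, a)):
            confidence = count / single[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    # Hypothetical toy corpus, one list of noun phrases per document.
    docs = [
        ["information retrieval", "noun phrase"],
        ["information retrieval", "noun phrase", "association rule"],
        ["information retrieval"],
        ["association rule", "noun phrase"],
    ]
    for x, y, sup, conf in mine_rules(docs):
        print(f"{x} -> {y}  support={sup:.2f}  confidence={conf:.2f}")
```

A rule such as "association rule -> noun phrase" is kept when the two phrases co-occur in enough documents (support) and when the consequent appears in a large enough share of the documents that contain the antecedent (confidence).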
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-00004459
Contributor : Thèses Imag
Submitted on : Tuesday, February 3, 2004 - 3:29:57 PM
Last modification on : Friday, November 6, 2020 - 4:12:41 AM
Long-term archiving on : Friday, April 2, 2010 - 8:14:35 PM

### Identifiers

• HAL Id : tel-00004459, version 1

### Citation

Mohamed Hatem Haddad. Extraction et impact des connaissances sur les performances des systèmes de recherche d'information. domain_stic.gest. Université Joseph-Fourier - Grenoble I, 2002. French. ⟨tel-00004459⟩
