Extraction et impact des connaissances sur les performances des systèmes de recherche d'information (Knowledge extraction and its impact on the performance of information retrieval systems)

Abstract : An information retrieval system is designed to find the best possible results in an information-rich context. This study examines the knowledge that can be extracted from the content of textual documents by combining a linguistic approach with the ability of a statistical approach to analyze large corpora. The statistical approach is based on text data mining, more precisely on the association rule technique. The linguistic approach is based on noun phrases, which are considered better suited than single words to representing document content. It specifies the linguistic constraints needed to extract noun phrases and makes explicit the syntagmatic relations between the words within them. These phrasal relations are used to structure noun phrases. A measure called "information quantity" is proposed to estimate the suggestive power of each noun phrase and to filter and compare noun phrases. The proposed model shows that combining a statistical approach with a linguistic approach refines the extracted knowledge and improves the performance of an information retrieval system.
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-00004459
Contributor : Thèses Imag
Submitted on : Tuesday, February 3, 2004 - 3:29:57 PM
Last modification on : Friday, November 6, 2020 - 4:12:41 AM
Long-term archiving on : Friday, April 2, 2010 - 8:14:35 PM

Identifiers

  • HAL Id : tel-00004459, version 1

Collections

UJF | CNRS | IMAG | UGA

Citation

Mohamed Hatem Haddad. Extraction et impact des connaissances sur les performances des systèmes de recherche d'information. domain_stic.gest. Université Joseph-Fourier - Grenoble I, 2002. French. ⟨tel-00004459⟩

Metrics

Record views : 632
File downloads : 1281