
An Indexing Model for Structured Textual Documents (original title: Un modèle d'indexation pour les documents textuels structurés)

Abstract: Most indexing models in information retrieval are dedicated to a particular domain or application and do not exploit the richness of electronic documents. The goal of this work is to define an indexing model for textual documents that takes into account their structure and other information complementary to the discourse. The proposed model consists of two components: a representation language, which describes at a conceptual level the information in the document, including the indexes themselves, and a set of derivation rules, which are based on this language and make it possible to deduce a particular kind of index, the themes. Indexing in this model does not merely produce a static representation of documents; it is also dynamically linked to the correspondence (matching) process, so that the selection of themes, as determined by the rules, depends on both the document and the user. The approach was validated in two steps. First, in an a priori validation, a questionnaire was submitted to a group of users in order to understand how they derive themes; the results supported the derivation rules. Then, in an a posteriori validation, the model was implemented and tested on a collection of SGML documents. This experiment showed the applicability and flexibility of the model.

Cited literature: 56 references
Contributor: Thèses Imag
Submitted on: Monday, February 23, 2004 - 5:11:04 PM
Last modification on: Friday, November 6, 2020 - 4:04:26 AM
Long-term archiving on: Friday, September 14, 2012 - 10:40:44 AM


  • HAL Id : tel-00005009, version 1




Francois Paradis. Un modèle d'indexation pour les documents textuels structurés. Interface homme-machine [cs.HC]. Université Joseph-Fourier - Grenoble I, 1996. Français. ⟨tel-00005009⟩


