Vers une prise en compte de plusieurs aspects des besoins d'information dans les modèles de la recherche documentaire : Propagation de métadonnées sur le World Wide Web

Abstract : In this thesis, set within the broader context of web information retrieval, we consider the problem of thematic and non-thematic page indexing, with particular focus on page typology. We propose a two-step page characterization method. The first step, homogeneous corpus extraction, groups pages that share similar features. The second step, semi-automatic metadata assignment within each homogeneous corpus, relies on propagation: initially, only a small proportion of the resources is described manually, and that information is then propagated to the other resources. Methodologically, homogeneous corpus extraction is grounded in hypertext link analysis. More precisely, it uses the "co-citation" principle, a Web transposition of the well-known co-citation method from scientometrics.
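The two steps described in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual algorithm, which the abstract does not detail. It assumes that two pages are "co-cited" whenever a third page links to both, that a homogeneous corpus is a connected component of the co-citation graph after thresholding, and that propagation assigns the majority manual label of a corpus to its unlabeled pages. The function names, the `min_cocitations` threshold, and the sample link data are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cocitation_counts(outlinks):
    """Count, for each pair of pages, how many source pages link to both."""
    counts = Counter()
    for targets in outlinks.values():
        for a, b in combinations(sorted(set(targets)), 2):
            counts[(a, b)] += 1
    return counts


def homogeneous_corpora(outlinks, min_cocitations=2):
    """Group pages into connected components of the thresholded co-citation graph.

    Assumption: a corpus is a connected component; the threshold is illustrative.
    """
    neighbours = defaultdict(set)
    for (a, b), n in cocitation_counts(outlinks).items():
        if n >= min_cocitations:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, corpora = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            page = stack.pop()
            if page in component:
                continue
            component.add(page)
            stack.extend(neighbours[page] - component)
        seen |= component
        corpora.append(component)
    return corpora


def propagate_labels(corpora, manual_labels):
    """Give every unlabeled page the most common manual label in its corpus."""
    labels = dict(manual_labels)
    for corpus in corpora:
        known = Counter(manual_labels[p] for p in corpus if p in manual_labels)
        if not known:
            continue  # no manually described page in this corpus
        majority = known.most_common(1)[0][0]
        for page in corpus:
            labels.setdefault(page, majority)
    return labels


# Hypothetical link data: each key is a citing page, each value its outgoing links.
outlinks = {
    "hub1": ["a", "b", "c"],
    "hub2": ["a", "b", "c"],
    "hub3": ["x", "y"],
    "hub4": ["x", "y"],
    "hub5": ["c", "x"],
}
corpora = homogeneous_corpora(outlinks, min_cocitations=2)
labels = propagate_labels(corpora, {"a": "homepage", "x": "course notes"})
```

In this toy data, only one page per corpus is labeled by hand, and the single link from `hub5` to both `c` and `x` falls below the threshold, so the two groups stay separate.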
Document type :
Theses

Cited literature [80 references]

https://tel.archives-ouvertes.fr/tel-00839565
Contributor : Florent Breuil
Submitted on : Friday, June 28, 2013 - 2:44:39 PM
Last modification on : Thursday, October 17, 2019 - 12:33:31 PM
Long-term archiving on : Sunday, September 29, 2013 - 4:37:33 AM

Identifiers

  • HAL Id : tel-00839565, version 1

Citation

Camille Prime-Claverie. Vers une prise en compte de plusieurs aspects des besoins d'information dans les modèles de la recherche documentaire : Propagation de métadonnées sur le World Wide Web. Synthèse d'image et réalité virtuelle [cs.GR]. Ecole Nationale Supérieure des Mines de Saint-Etienne; Université Jean Monnet - Saint-Etienne, 2004. Français. ⟨NNT : 2004EMSE0020⟩. ⟨tel-00839565⟩

Metrics

Record views : 285
File downloads : 632