Extraction de données et apprentissage automatique pour les sites web adaptatifs (Data mining and machine learning for adaptive web sites)

Abstract : This work concerns knowledge discovery and data mining, with a focus on web data, in particular server log files. To determine automatically how a web site should be adapted, we learn grammatical models of user behavior. We show how difficult it is to acquire web data in a form suitable for a grammatical inference process, and we try to remove almost all of the noise present in these data. We also show that grammatical inference can learn good models by generalizing its input data sufficiently. We explain why evaluating the quality of learned models is difficult, and we introduce a Euclidean measure between language models represented by automata. We prove that this measure is a true distance in the mathematical sense. Finally, we present our experimental results: our method, from the preprocessing of the data to the evaluation of the learned models, gives better success rates on the new page prediction task, which is very common in web usage mining.
Document type :
Theses

Cited literature : 54 references

https://tel.archives-ouvertes.fr/tel-00366586
Contributor : Thierry Murgue
Submitted on : Monday, March 9, 2009 - 10:28:13 AM
Last modification on : Wednesday, June 24, 2020 - 4:18:31 PM
Long-term archiving on : Tuesday, June 8, 2010 - 11:14:58 PM

Identifiers

  • HAL Id : tel-00366586, version 1

Citation

Thierry Murgue. Extraction de données et apprentissage automatique pour les sites web adaptatifs. Autre [cs.OH]. Ecole Nationale Supérieure des Mines de Saint-Etienne; Université Jean Monnet - Saint-Etienne, 2006. Français. ⟨tel-00366586⟩

Metrics

  • Record views : 523
  • File downloads : 3920