
Data Extraction and Machine Learning for Adaptive Web Sites (Extraction de données et apprentissage automatique pour les sites web adaptatifs)

Abstract : This work addresses knowledge discovery and data mining applied to web data, in particular server log files. To determine automatically how a web site should be adapted, we learn grammatical models of user behavior. We show how difficult it is to acquire web data in a form usable by a grammatical inference process, and we try to remove almost all of the noise present in these data. We also show that grammatical inference can learn good models when it generalizes its input data sufficiently. We explain why evaluating the quality of learned models is difficult, and we introduce a Euclidean measure between language models represented by automata. We prove that this measure is a true distance in the mathematical sense. Finally, we present our experimental results: our method, from the preprocessing of the data to the evaluation of the learned models, gives better success rates on the new-page prediction task, which is very common in web usage mining.

Contributor : Thierry Murgue
Submitted on : Monday, March 9, 2009 - 10:28:13 AM
Last modification on : Wednesday, June 24, 2020 - 4:18:31 PM
Long-term archiving on : Tuesday, June 8, 2010 - 11:14:58 PM


  • HAL Id : tel-00366586, version 1


Thierry Murgue. Extraction de données et apprentissage automatique pour les sites web adaptatifs. Other [cs.OH]. Ecole Nationale Supérieure des Mines de Saint-Etienne; Université Jean Monnet - Saint-Etienne, 2006. French. ⟨tel-00366586⟩


