Extraction de séquences fréquentes : des données numériques aux valeurs manquantes (Frequent sequence extraction: from numerical data to missing values)

Abstract : The large amount of data stored in every field, together with the diversity of its formats and origins, makes manual analysis and knowledge discovery impossible. For this reason, various communities have for several years been interested in designing and implementing tools that can automatically extract knowledge from such large databases. Current work in this area aims to handle heterogeneity in data type, format and quality. Our own work is part of this line of research. More specifically, we consider frequent pattern discovery from data ordered as data sequences. Until now, such patterns, called sequential patterns, could be extracted only from sequence databases containing symbolic and perfect data, i.e. databases that consist of binary information, or of data that can be processed as binary, and that contain only complete data. We therefore propose several improvements to frequent sequence discovery techniques in order to take heterogeneous, incomplete or uncertain data into account while minimizing possible information loss. The work described in this thesis thus consists of three contributions: the implementation of a global framework for fuzzy sequential pattern discovery in numerical quantitative data; the definition of soft temporal constraints, which give the user flexibility and allow the discovered patterns to be sorted; and the implementation of two approaches for sequential pattern discovery from incomplete data.
Cited literature : 149 references
Contributor : Céline Fiot
Submitted on : Monday, October 15, 2007 - 4:44:09 PM
Last modification on : Friday, October 23, 2020 - 4:39:34 PM
Long-term archiving on : Sunday, April 11, 2010 - 11:04:19 PM


  • HAL Id : tel-00179506, version 1



Céline Fiot. Extraction de séquences fréquentes : des données numériques aux valeurs manquantes. Human-Computer Interaction [cs.HC]. Université Montpellier II - Sciences et Techniques du Languedoc, 2007. French. ⟨tel-00179506⟩


