Apprentissage incrémental en ligne sur flux de données (Online incremental learning on data streams)

Christophe Salperwyck 1
1 SHACRA - Simulation in Healthcare using Computer Research Advances
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe, Inria Nancy - Grand Est
Abstract : Statistical learning provides numerous algorithms for building predictive models from past observations. These techniques have proved able to handle large-scale, realistic problems. However, new domains generate ever more data that are visible only once and must be processed sequentially. These volatile data, known as data streams, come from telecommunication network management, social networks, and web mining. The challenge is to build new algorithms able to learn under these constraints. We propose building new summaries for supervised classification. Our summaries are based on two levels. The first level is an online incremental summary that requires little processing and addresses the precision/memory tradeoff. The second level uses the first-level summary to build the final summary with an efficient offline method. Building these summaries is a pre-processing stage for developing new classifiers for data streams. We propose new versions of the naive Bayes and decision tree classifiers using our summaries. As data streams might contain concept drifts, we also propose a new technique to detect these drifts and update the classifiers accordingly.
Document type :
Theses

Cited literature [174 references]

https://tel.archives-ouvertes.fr/tel-00845655
Contributor : Abes Star
Submitted on : Friday, September 20, 2013 - 8:47:13 AM
Last modification on : Thursday, February 21, 2019 - 10:52:54 AM
Long-term archiving on : Friday, April 7, 2017 - 12:20:16 AM

File

SALPERWYCK_Christophe.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-00845655, version 2

Citation

Christophe Salperwyck. Apprentissage incrémental en ligne sur flux de données. Other [cs.OH]. Université Charles de Gaulle - Lille III, 2012. French. ⟨NNT : 2012LIL30037⟩. ⟨tel-00845655v2⟩
