Incremental Bayesian network structure learning from data streams

Abstract : Over the last decade, data stream mining has become an active area of research, owing to the importance of its applications and the growing volume of streaming data. The major challenges in data stream analysis are the unbounded size of streams, their evolving nature, and restricted access to the data. Traditional data mining techniques therefore cannot be applied directly to data streams. The problem is aggravated when incoming data come from high-dimensional domains such as social networks, bioinformatics, and telecommunications, which involve several hundreds or thousands of variables. This poses a serious challenge for existing Bayesian network structure learning algorithms. To keep up with the latest trends, learning algorithms need to incorporate new data continuously. Existing state-of-the-art methods for incremental structure learning have been applied to only several tens of variables and do not scale well beyond a few tens to hundreds of variables. This work investigates the Bayesian network structure learning problem in high-dimensional domains and makes a number of contributions toward solving it. First, we propose an incremental local search approach, iMMPC, to learn a local skeleton for each variable. We then propose an incremental version of the Max-Min Hill-Climbing (MMHC) algorithm to learn the whole structure of the network. We also propose guidelines for adapting it to sliding and damped window environments. Finally, the feasibility of our approach is supported by theoretical justifications and demonstrated through extensive experiments on synthetic datasets.
Document type :
Theses
Cited literature [136 references]

https://tel.archives-ouvertes.fr/tel-01284332
Contributor : Lina Duke
Submitted on : Monday, March 7, 2016 - 2:46:35 PM
Last modification on : Thursday, April 5, 2018 - 10:36:49 AM
Long-term archiving on : Wednesday, June 8, 2016 - 2:41:56 PM

Identifiers

  • HAL Id : tel-01284332, version 1

Citation

Amanullah Yasin. Incremental Bayesian network structure learning from data streams. Machine Learning [cs.LG]. Université de Nantes, 2013. English. ⟨tel-01284332⟩
