Bayesian networks for static and temporal data fusion

Thibaud Rahier 1
1 MISTIS - Modelling and Inference of Complex and Structured Stochastic Systems
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract : Prediction and inference on temporal data is very frequently performed using timeseries data alone. We believe that these tasks could benefit from leveraging the contextual metadata associated to timeseries - such as location, type, etc. Conversely, tasks involving prediction and inference on metadata could benefit from information held within timeseries. However, there exists no standard way of jointly modeling both timeseries data and descriptive metadata. Moreover, metadata frequently contains highly correlated or redundant information, and may contain errors and missing values.We first consider the problem of learning the inherent probabilistic graphical structure of metadata as a Bayesian Network. This has two main benefits: (i) once structured as a graphical model, metadata is easier to use in order to improve tasks on temporal data and (ii) the learned model enables inference tasks on metadata alone, such as missing data imputation. However, Bayesian network structure learning is a tremendous mathematical challenge, that involves a NP-Hard optimization problem. We present a tailor-made structure learning algorithm, inspired from novel theoretical results, that exploits (quasi)-determinist dependencies that are typically present in descriptive metadata. This algorithm is tested on numerous benchmark datasets and some industrial metadatasets containing deterministic relationships. In both cases it proved to be significantly faster than state of the art, and even found more performant structures on industrial data. Moreover, learned Bayesian networks are consistently sparser and therefore more readable.We then focus on designing a model that includes both static (meta)data and dynamic data. Taking inspiration from state of the art probabilistic graphical models for temporal data (Dynamic Bayesian Networks) and from our previously described approach for metadata modeling, we present a general methodology to jointly model metadata and temporal data as a hybrid static-dynamic Bayesian network. We propose two main algorithms associated to this representation: (i) a learning algorithm, which while being optimized for industrial data, is still generalizable to any task of static and dynamic data fusion, and (ii) an inference algorithm, enabling both usual tasks on temporal or static data alone, and tasks using the two types of data.%We then provide results on diverse cross-field applications such as forecasting, metadata replenishment from timeseries and alarms dependency analysis using data from some of Schneider Electric’s challenging use-cases.Finally, we discuss some of the notions introduced during the thesis, including ways to measure the generalization performance of a Bayesian network by a score inspired from the cross-validation procedure from supervised machine learning. We also propose various extensions to the algorithms and theoretical results presented in the previous chapters, and formulate some research perspectives.
Complete list of metadatas

Cited literature [40 references]  Display  Hide  Download

https://tel.archives-ouvertes.fr/tel-01971371
Contributor : Abes Star <>
Submitted on : Tuesday, April 16, 2019 - 4:24:10 PM
Last modification on : Monday, April 29, 2019 - 12:01:16 PM

File

RAHIER_2018_diffusion.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01971371, version 3

Collections

Citation

Thibaud Rahier. Bayesian networks for static and temporal data fusion. Statistics [math.ST]. Université Grenoble Alpes, 2018. English. ⟨NNT : 2018GREAM083⟩. ⟨tel-01971371v3⟩

Share

Metrics

Record views

184

Files downloads

79