Bayesian networks for static and temporal data fusion

Thibaud Rahier 1, 2 
1 MISTIS - Modelling and Inference of Complex and Structured Stochastic Systems
Inria Grenoble - Rhône-Alpes, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology, LJK - Laboratoire Jean Kuntzmann
Abstract : Prediction and inference on temporal data are very frequently performed using time series data alone. We believe that these tasks could benefit from leveraging the contextual metadata associated with time series, such as location, type, etc. Conversely, tasks involving prediction and inference on metadata could benefit from information held within time series. However, there exists no standard way of jointly modeling both time series data and descriptive metadata. Moreover, metadata frequently contains highly correlated or redundant information, and may contain errors and missing values. We first consider the problem of learning the inherent probabilistic graphical structure of metadata as a Bayesian network. This has two main benefits: (i) once structured as a graphical model, metadata is easier to use to improve tasks on temporal data, and (ii) the learned model enables inference tasks on metadata alone, such as missing data imputation. However, Bayesian network structure learning is a tremendous mathematical challenge that involves an NP-hard optimization problem. We present a tailor-made structure learning algorithm, inspired by novel theoretical results, that exploits the (quasi-)deterministic dependencies that are typically present in descriptive metadata. This algorithm is tested on numerous benchmark datasets and on some industrial metadata sets containing deterministic relationships. In both cases it proved significantly faster than the state of the art, and it even found better-performing structures on industrial data. Moreover, the learned Bayesian networks are consistently sparser and therefore more readable. We then focus on designing a model that includes both static (meta)data and dynamic data.
Taking inspiration from state-of-the-art probabilistic graphical models for temporal data (Dynamic Bayesian Networks) and from our previously described approach to metadata modeling, we present a general methodology to jointly model metadata and temporal data as a hybrid static-dynamic Bayesian network. We propose two main algorithms associated with this representation: (i) a learning algorithm which, while optimized for industrial data, still generalizes to any task of static and dynamic data fusion, and (ii) an inference algorithm that enables both the usual tasks on temporal or static data alone and tasks using the two types of data together. Finally, we discuss some of the notions introduced during the thesis, including ways to measure the generalization performance of a Bayesian network with a score inspired by the cross-validation procedure used in supervised machine learning. We also propose various extensions to the algorithms and theoretical results presented in the previous chapters, and formulate some research perspectives.

Cited literature [79 references]

https://hal.archives-ouvertes.fr/tel-01971371
Contributor : Thibaud Rahier
Submitted on : Friday, March 1, 2019 - 4:45:08 PM
Last modification on : Tuesday, October 19, 2021 - 11:27:21 AM
Long-term archiving on : Thursday, May 30, 2019 - 4:33:16 PM

File

PhD_thesis_final.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-01971371, version 2

Citation

Thibaud Rahier. Bayesian networks for static and temporal data fusion. Statistics [math.ST]. Communauté Université Grenoble-Alpes, 2018. English. ⟨tel-01971371v2⟩

Metrics

Record views

852

Files downloads

832