On-the-fly anomaly detection in high-dimensional data streams

Abstract : This thesis studies anomaly detection in high-dimensional data streams, with a specific application to aircraft engine Health Monitoring. We treat anomaly detection as an unsupervised learning problem. Modern data, especially data produced by industrial systems, often arrive as streams of high-dimensional samples, since many measurements can be taken at high frequency and over a possibly infinite time horizon. Moreover, the data can contain anomalies (malfunctions, failures) of the system being monitored. Most existing unsupervised learning methods cannot handle data with these characteristics. We first introduce an offline subspace clustering algorithm for high-dimensional data based on the expectation-maximization (EM) algorithm, which is made robust to anomalies through trimming. We then address online clustering of high-dimensional data streams by developing an online inference algorithm for the popular mixture of probabilistic principal component analyzers (MPPCA) model. We show the efficiency of both methods on synthetic and real datasets, including aircraft engine data with anomalies. Finally, we develop a comprehensive application for aircraft engine Health Monitoring, which detects anomalies in aircraft engine data dynamically and introduces novel anomaly detection visualization techniques based on Self-Organizing Maps. Detection results are presented, and anomaly identification is also discussed.

https://tel.archives-ouvertes.fr/tel-00944263
Contributor : Anastasios Bellas
Submitted on : Monday, February 10, 2014 - 2:17:02 PM
Last modification on : Monday, November 27, 2017 - 2:14:02 PM
Document(s) archived on : Sunday, May 11, 2014 - 4:37:27 AM

Identifiers

  • HAL Id : tel-00944263, version 1

Citation

Anastasios Bellas. Détection d'anomalies à la volée dans des flux de données de grande dimension. Statistics [math.ST]. Université Panthéon-Sorbonne - Paris I, 2014. French. ⟨tel-00944263⟩

Metrics

Record views : 1396

File downloads : 862