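Détection d'anomalies dans les flux de données par structure d'indexation et approximation : Application à l'analyse en continu des flux de messages du système d'information de la SNCF [Anomaly detection in data streams using indexing structures and approximation: application to the continuous analysis of message streams in the SNCF information system]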

Lucas Foulon
Abstract : In this thesis, we propose methods for approximating an anomaly score in order to detect abnormal segments in data streams. Two main problems arise in this context: first, the high dimensionality of the objects that describe the time series extracted from the raw streams; second, the low computational cost required to perform the analysis on the fly. To tackle the curse of dimensionality, we selected the CFOF anomaly score, which was proposed recently and has been shown to be robust to increasing dimensionality. Our main contribution is two methods for quickly approximating the CFOF score of new objects in a stream. The first is based on safe pruning and approximation during the exploration of an object's neighbourhood. The second is an approximation obtained by aggregating scores computed in several subspaces. The two contributions complement each other and can be combined. We show on a reference benchmark that our proposals substantially reduce execution times while providing approximations that preserve the quality of anomaly detection. We then present the application of these approaches within the SNCF information system, where we extended the existing monitoring modules with a new tool that helps detect abnormal behaviour in the real stream of messages within the SNCF communication system.
Document type :
Theses
https://tel.archives-ouvertes.fr/tel-03125747
Contributor : Abes Star
Submitted on : Friday, January 29, 2021 - 4:45:48 PM
Last modification on : Saturday, January 30, 2021 - 3:29:44 AM

File

these.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-03125747, version 1

Citation

Lucas Foulon. Détection d'anomalies dans les flux de données par structure d'indexation et approximation : Application à l'analyse en continu des flux de messages du système d'information de la SNCF. Artificial Intelligence [cs.AI]. Université de Lyon, 2020. French. ⟨NNT : 2020LYSEI082⟩. ⟨tel-03125747⟩
