Time series retrieval using DTW-preserving shapelets (Recherche de séries temporelles à l’aide de DTW-preserving shapelets)

Abstract : Establishing the similarity of time series is at the core of many data mining tasks, such as time series classification, clustering, and retrieval. Measures of similarity between time series are specific in that they must account both for differences in the values making up the series and for distortions along the time axis. The most popular similarity measure is Dynamic Time Warping (DTW). However, it is costly to compute, and applying it to numerous and/or very long time series is difficult in practice. There have been numerous attempts to accelerate DTW, yet scaling it remains a major difficulty. An elegant research direction proposes changing the representation of time series so that similarities become much cheaper to establish. This typically relies on an embedding process in which vectorial representations of time series are constructed, so that their similarity can be estimated using, e.g., L2 distances, which are much faster to compute than DTW. Naturally, the quality of this representation largely depends on the embedding process, and the family of contributions relying on the concept of shapelets has proved to work particularly well. Shapelets, and the transform operation materializing the embedding process, were originally proposed for time series classification. Shapelets are independent subsequences extracted or learned from time series to form discriminatory features. They are used to transform time series into high-dimensional (Euclidean) vectors. Recently, it was proposed to embed time series into a Euclidean space such that the distance in this embedded space approximates the true DTW well. That contribution targets time series clustering. The work presented in this Ph.D. manuscript builds on the idea of transforming time series using shapelets. 
It shows how shapelets that preserve DTW measures can be used in the specific context of large-scale time series retrieval. This manuscript makes three major contributions: (1) it explains how DTW-preserving shapelets can be used in the specific context of time series retrieval; (2) it proposes shapelet selection strategies to cope with scale, that is, to deal with extremely large collections of time series; (3) it details how to handle both univariate and multivariate time series, hence covering the whole spectrum of time series retrieval problems. The core of the contribution presented in this manuscript makes it easy to trade off the complexity of the transformation against the accuracy of the retrieval. Experiments using the UCR and UEA datasets demonstrate vast performance improvements compared to state-of-the-art techniques.
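
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the method of the thesis: the shapelets here are random subsequences rather than shapelets learned or selected to preserve DTW, the data are synthetic, and all function names (`dtw`, `shapelet_distance`, `shapelet_transform`, `l2`, `make_series`) and parameters (20 series, 8 shapelets of length 16) are assumptions made for illustration. It only shows the mechanics: exact DTW by dynamic programming, a shapelet transform that maps each series to a vector of minimum subsequence distances, and retrieval by L2 distance in that embedded space.

```python
import math
import random


def dtw(a, b):
    """Textbook DTW with squared point-wise cost, O(len(a) * len(b)) time."""
    n, m = len(a), len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # extend the cheapest of the three admissible predecessor cells
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any window of the series.

    Assumes len(series) >= len(shapelet); otherwise the result is infinity.
    """
    k = len(shapelet)
    best = math.inf
    for start in range(len(series) - k + 1):
        d = sum((series[start + t] - shapelet[t]) ** 2 for t in range(k))
        best = min(best, d)
    return math.sqrt(best)


def shapelet_transform(series, shapelets):
    """Embed a series as one coordinate per shapelet."""
    return [shapelet_distance(series, s) for s in shapelets]


def l2(u, v):
    """Euclidean distance between two embedded vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def make_series(rng, length=64):
    """Synthetic noisy sine wave with random phase and frequency (toy data)."""
    phase = rng.uniform(0, 2 * math.pi)
    freq = rng.choice([1, 2, 3])
    return [math.sin(2 * math.pi * freq * t / length + phase) + rng.gauss(0, 0.05)
            for t in range(length)]


rng = random.Random(0)
collection = [make_series(rng) for _ in range(20)]
query = make_series(rng)

# Assumption: random subsequences stand in for learned DTW-preserving shapelets.
shapelets = []
for _ in range(8):
    src = rng.choice(collection)
    start = rng.randrange(0, len(src) - 16 + 1)
    shapelets.append(src[start:start + 16])

# The collection is embedded once, offline; queries are embedded on the fly.
embedded = [shapelet_transform(s, shapelets) for s in collection]
q_emb = shapelet_transform(query, shapelets)

rank_l2 = sorted(range(len(collection)), key=lambda i: l2(q_emb, embedded[i]))
rank_dtw = sorted(range(len(collection)), key=lambda i: dtw(query, collection[i]))
print("top-5 by embedded L2:", rank_l2[:5])
print("top-5 by exact DTW:  ", rank_dtw[:5])
```

In this sketch the number and length of the shapelets are the knobs that set the cost of the transformation. With random shapelets the two rankings need not agree; how closely the embedded L2 distance tracks DTW depends on how the shapelets are obtained, which is the subject of the thesis.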
Document type :
Theses

Cited literature : 290 references

https://tel.archives-ouvertes.fr/tel-02510229
Contributor : Abes Star
Submitted on : Tuesday, March 17, 2020 - 3:38:09 PM
Last modification on : Wednesday, August 5, 2020 - 3:47:52 AM
Long-term archiving on : Thursday, June 18, 2020 - 3:33:02 PM

File

CARLINI_SPERANDIO_Ricardo.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-02510229, version 1

Citation

Ricardo Carlini Sperandio. Recherche de séries temporelles à l’aide de DTW-preserving shapelets. Machine Learning [cs.LG]. Université Rennes 1, 2019. English. ⟨NNT : 2019REN1S061⟩. ⟨tel-02510229⟩
