
Deep-learning for high dimensional sequential observations : application to continuous gesture recognition

Abstract : This thesis aims to improve the intuitiveness of human-computer interfaces. In particular, machines should try to replicate the human ability to process streams of information continuously. However, the sub-domain of Machine Learning dedicated to recognition on time series remains hindered by numerous challenges. Our studies use gesture recognition as an exemplar application, since gestures intermix static body poses and movements in a complex manner using widely different modalities. The first part of our work compares two state-of-the-art temporal models for continuous sequence recognition, namely Hybrid Neural Network-Hidden Markov Models (NN-HMM) and Bidirectional Recurrent Neural Networks (BDRNN) with gated units. To do so, we reimplemented both within a shared test-bed that is more amenable to a fair comparison. We propose adjustments to the Neural Network training losses and to the Hybrid NN-HMM expressions to accommodate highly imbalanced data classes. Although recent publications tend to prefer BDRNNs, we demonstrate that Hybrid NN-HMMs remain competitive. However, the latter rely significantly on their input layers to model short-term patterns. Finally, we show that the input representations learned via both approaches are largely inter-compatible. The second part of our work studies one-shot learning, which has received relatively little attention so far, in particular for sequential inputs such as gestures. We propose a model built around a Bidirectional Recurrent Neural Network. Its effectiveness is demonstrated on the recognition of isolated gestures from a sign language lexicon. We propose several improvements over this baseline by drawing inspiration from related works and evaluate their performance, each of which exhibits different advantages and disadvantages.

Cited literature : 156 references
Contributor : Abes Star
Submitted on : Friday, February 8, 2019 - 2:36:07 PM
Last modification on : Monday, August 24, 2020 - 4:16:12 PM
Long-term archiving on : Thursday, May 9, 2019 - 2:32:30 PM


Version validated by the jury (STAR)


  • HAL Id : tel-02012106, version 1


Nicolas Granger. Deep-learning for high dimensional sequential observations : application to continuous gesture recognition. Human-Computer Interaction [cs.HC]. Université Paris-Saclay, 2019. English. ⟨NNT : 2019SACLL002⟩. ⟨tel-02012106⟩


