Automatic non linear metric learning : Application to gesture recognition

Abstract : As consumer devices become more and more ubiquitous, new interaction solutions are required. In this thesis, we explore inertial-based gesture recognition on Smartphones, where gestures holding a semantic value are drawn in the air with the device in hand. In our research, speed and delay constraints required by an application are critical, leading us to the choice of neural-based models. Thus, our work focuses on metric learning between gesture sample signatures using the "Siamese" architecture (Siamese Neural Network, SNN), which aims at modelling semantic relations between classes to extract discriminative features, applied to the MultiLayer Perceptron. Contrary to some popular versions of this algorithm, we opt for a strategy that does not require additional parameter fine tuning, namely a set threshold on dissimilar outputs, during training. Indeed, after a preprocessing step where the data is filtered and normalised spatially and temporally, the SNN is trained from sets of samples, composed of similar and dissimilar examples, to compute a higher-level representation of the gesture, where features are collinear for similar gestures, and orthogonal for dissimilar ones. While the original model already works for classification, multiple mathematical problems which can impair its learning capabilities are identified. Consequently, as opposed to the classical similar or dissimilar pair; or reference, similar and dissimilar sample triplet input set selection strategies, we propose to include samples from every available dissimilar classes, resulting in a better structuring of the output space. Moreover, we apply a regularisation on the outputs to better determine the objective function. Furthermore, the notion of polar sine enables a redefinition of the angular problem by maximising a normalised volume induced by the outputs of the reference and dissimilar samples, which effectively results in a Supervised Non-Linear Independent Component Analysis. Finally, we assess the unexplored potential of the Siamese network and its higher-level representation for novelty and error detection and rejection. With the help of two real-world inertial datasets, the Multimodal Human Activity Dataset as well as the Orange Dataset, specifically gathered for the Smartphone inertial symbolic gesture interaction paradigm, we characterise the performance of each contribution, and prove the higher novelty detection and rejection rate of our model, with protocols aiming at modelling unknown gestures and open world configurations. To summarise, the proposed SNN allows for supervised non-linear similarity metric learning, which extracts discriminative features, improving both inertial gesture classification and rejection.
Document type :
Complete list of metadatas

Cited literature [68 references]  Display  Hide  Download
Contributor : Abes Star <>
Submitted on : Thursday, March 23, 2017 - 6:49:26 PM
Last modification on : Friday, May 17, 2019 - 10:19:05 AM
Long-term archiving on : Saturday, June 24, 2017 - 4:24:17 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01379579, version 2


Samuel Berlemont. Automatic non linear metric learning : Application to gesture recognition. Artificial Intelligence [cs.AI]. Université de Lyon, 2016. English. ⟨NNT : 2016LYSEI014⟩. ⟨tel-01379579v2⟩



Record views


Files downloads