
Few-Shot Learning and a Pruning Method for Emotion Detection on Small Datasets

Abstract: Emotion detection plays a major part in human interactions: a good understanding of the speaker's emotional state leads to a better understanding of their speech. The same holds for human-machine interactions. In computer-based emotion detection, deep learning has emerged as the state of the art. However, classical deep learning techniques perform poorly when training sets are small. This thesis explores two possible ways of tackling this issue: pruning and few-shot learning.

Many pruning methods exist, but they focus on maximising the amount of pruning without losing too much accuracy. We propose a new pruning method that improves the choice of which weights to remove. The method is based on the rivalry between two networks: the original network and a second network that we call the rival. The idea is to share weights between both models in order to maximise accuracy. During training, weights that negatively affect accuracy are removed, which optimises the architecture while improving accuracy. This technique is tested on different networks and different databases and achieves state-of-the-art results, improving accuracy while pruning a significant percentage of weights.

The second part of this thesis explores matching networks (both siamese and triplet) as an answer to learning on small datasets. Sounds and images were merged to learn their main features in order to detect emotions. We show that, while restricting ourselves to 200 training instances per class, the triplet network reaches, on some databases, the state of the art obtained by models trained on hundreds of thousands of instances. We also show that, for emotion detection, triplet networks provide a better vector embedding of the emotions than siamese networks and thus deliver better results. A new loss function based on the triplet loss is also introduced, which facilitates the training of the triplet and siamese networks. To allow a better comparison of our model, different methods are used to provide elements of validation, especially for the vector embedding.

In the long term, both methods can be combined to produce lighter, optimised networks. As pruning lowers the number of parameters, the triplet network should learn more easily and could achieve better performance.
Contributor: Abes Star
Submitted on: Tuesday, February 16, 2021 - 3:43:10 PM
Last modification on: Wednesday, February 24, 2021 - 4:24:03 PM
Long-term archiving on: Monday, May 17, 2021 - 8:28:21 PM


Version validated by the jury (STAR)


  • HAL Id : tel-03143123, version 1



Kergann Le Cornec. Apprentissage Few Shot et méthode d'élagage pour la détection d'émotions sur bases de données restreintes. Réseau de neurones [cs.NE]. Université Clermont Auvergne, 2020. Français. ⟨NNT : 2020CLFAC034⟩. ⟨tel-03143123⟩


