Apprentissage neuronal profond pour l'analyse de contenus multimodaux et temporels (Deep neural learning for the analysis of multimodal and temporal content)

Valentin Vielzeuf 1
1 Equipe Image - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen
Abstract : Our perception is by nature multimodal, i.e. it draws on several of our senses. To solve certain tasks, it is therefore relevant to use different modalities, such as sound or image. This thesis focuses on this notion in the context of deep learning. To this end, it seeks to answer a particular question: how should the different modalities be merged within a deep neural network? We first study a concrete application problem: the automatic recognition of emotion in audio-visual content. This leads us to several considerations concerning the modeling of emotions, and more particularly of facial expressions. We thus propose an analysis of the facial expression representations learned by a deep neural network. In addition, we observe that each multimodal problem appears to require a different fusion strategy. This is why we propose and validate two methods to automatically obtain an efficient fusion neural architecture for a given multimodal problem. The first is based on a central fusion network and aims to preserve an easy interpretation of the adopted fusion strategy. The second adapts a neural architecture search method to the case of multimodal fusion, exploring a greater number of strategies and therefore achieving better performance. Finally, we consider a multimodal view of knowledge transfer. We detail a non-traditional method for transferring knowledge from several sources, i.e. from several pre-trained models. To do so, a more general neural representation is obtained from a single model, which brings together the knowledge contained in the pre-trained models and leads to state-of-the-art performance on a variety of facial analysis tasks.
Document type :
Theses

Cited literature : 357 references

https://tel.archives-ouvertes.fr/tel-02437035
Contributor : Abes Star
Submitted on : Monday, January 13, 2020 - 3:13:23 PM
Last modification on : Monday, February 10, 2020 - 3:48:58 PM
Long-term archiving on : Tuesday, April 14, 2020 - 4:39:06 PM

File

sygal_fusion_28417-vielzeuf-va...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-02437035, version 1

Citation

Valentin Vielzeuf. Apprentissage neuronal profond pour l'analyse de contenus multimodaux et temporels. Bio-informatique [q-bio.QM]. Normandie Université, 2019. Français. ⟨NNT : 2019NORMC229⟩. ⟨tel-02437035⟩

Metrics

Record views

275

File downloads

307