
Analyse de signaux sociaux multimodaux : application à la synthèse d’attitudes sociales chez un agent conversationnel animé

Thomas Janssoone
MultiMedia, LTCI - Laboratoire Traitement et Communication de l'Information
Abstract: During an interaction, non-verbal behavior reflects the affective state of the speaker, such as an attitude or a personality trait. Modulations of social signals, such as variations in head movements, facial expressions or prosody, convey the speaker's affective state. Nowadays, machines can use embodied conversational agents to express the same kind of social cues, and such agents can improve quality of life in modern societies if they provide natural interactions with users. Indeed, a virtual agent must express different attitudes according to its purpose, such as dominance for a tutor or kindness for a companion. The sociology and psychology literature underlines the importance of the dynamics of social signals in the expression of different affective states. This thesis therefore proposes models focused on temporality to express a desired affective phenomenon. They are designed to handle social signals automatically extracted from a corpus, with the goal of generating embodied conversational agents that express a specific stance. A survey of existing databases led to the design of a corpus composed of presidential addresses. The high-definition videos allow algorithms to evaluate the social signals automatically. After a corrective process applied to the extracted social signals, an agent clones the human's behavior during the addresses. This provides an evaluation of how attitudes are perceived with either a human or a virtual agent as protagonist. The SMART model uses sequence mining to find temporal association rules in interaction data: it extracts accurate temporal information about the use of social signals and links it to a social attitude. The structure of these rules allows this information to be transposed easily to synthesize the behavior of a virtual agent. Perceptual studies validate this approach. A second model, SSN, designed during an international collaboration, is based on deep learning and domain separation. It allows multi-task learning of several affective phenomena and proposes a method to analyse the dynamics of the signals used. These contributions underline the importance of temporality in the synthesis of virtual agents for improving the expression of certain affective phenomena. Perspectives give recommendations for integrating this information into multimodal solutions.
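The temporal association rules mentioned for the SMART model can be pictured with a small sketch. This is only an illustration under assumptions, not the thesis's actual algorithm: it counts, over a set of hypothetical annotated clips, ordered pairs of social-signal events that occur within a short time window, and keeps the pairs frequent enough to form a candidate rule. All names and parameters here (`mine_temporal_rules`, `max_gap`, `min_support`, the toy clips) are invented for the example.

```python
from collections import Counter

def mine_temporal_rules(sequences, max_gap=2.0, min_support=0.5):
    """Find ordered signal pairs (a -> b) that occur within max_gap seconds
    in at least min_support of the sequences.

    Each sequence is a time-sorted list of (timestamp, signal) tuples,
    e.g. one sequence per annotated video clip.
    """
    pair_counts = Counter()
    for seq in sequences:
        seen = set()  # count each pair at most once per sequence
        for i, (t_a, a) in enumerate(seq):
            for t_b, b in seq[i + 1:]:
                if t_b - t_a > max_gap:
                    break  # later events are even further away
                if a != b:
                    seen.add((a, b))
        pair_counts.update(seen)
    n = len(sequences)
    # Keep pairs whose support (fraction of sequences) reaches the threshold.
    return {pair: c / n for pair, c in pair_counts.items() if c / n >= min_support}

# Hypothetical data: a smile often precedes a head nod in "friendly" clips.
clips = [
    [(0.0, "smile"), (0.8, "nod"), (3.0, "gaze_away")],
    [(0.0, "smile"), (1.5, "nod")],
    [(0.0, "frown"), (0.5, "gaze_away")],
]
rules = mine_temporal_rules(clips, max_gap=2.0, min_support=0.6)
# ("smile", "nod") is kept (support 2/3); ("frown", "gaze_away") is not (1/3).
```

A rule of this shape ("signal A, then signal B within a short delay, associated with attitude X") is straightforward to transpose into an agent's behavior planner, which is the property the abstract attributes to SMART's rule structure.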

Cited literature: 148 references
Contributor: Thomas Janssoone
Submitted on: Monday, June 17, 2019 - 5:15:13 PM
Last modification on: Monday, December 14, 2020 - 9:55:35 AM




  • HAL Id : tel-02158084, version 1


Thomas Janssoone. Analyse de signaux sociaux multimodaux : application à la synthèse d’attitudes sociales chez un agent conversationnel animé. Intelligence artificielle [cs.AI]. Sorbonne Université, 2018. Français. ⟨tel-02158084⟩


