Generation of Audio-Visual Prosody for Expressive Virtual Actors

Adela Barbulescu 1, 2
2 IMAGINE - Intuitive Modeling and Animation for Interactive Graphics & Narrative Environments
Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology, LJK - Laboratoire Jean Kuntzmann, Inria Grenoble - Rhône-Alpes
Abstract : The work presented in this thesis addresses the problem of generating audiovisual prosody for expressive virtual actors. A virtual actor is represented by a 3D talking head, and an audiovisual performance comprises facial expressions, head movements, gaze direction and the speech signal.
While a substantial amount of work has been dedicated to emotions, we explore here expressive verbal behaviors that signal mental states, i.e. "how speakers feel about what they say". We explore the characteristics of these so-called dramatic attitudes and the way they are encoded with speaker-specific prosodic signatures, i.e. patterns of trajectories of audio-visual prosodic parameters. We analyze and model a set of 16 attitudes which encode interactive dimensions of face-to-face communication in dramatic dialogues.
We asked two semi-professional actors to perform these 16 attitudes, first in isolation (exercises in style) in a series of 35 carrier sentences and then in a short interactive dialog extracted from the theater play "Hands around" by Arthur Schnitzler, under the guidance of a professional theater director. The audiovisual trajectories are analyzed at both the frame level and the utterance level. To synthesize expressive performances, we used both a frame-based conversion system for generating segmental features and a prosodic model for suprasegmental features. The prosodic model considers both the spatial and temporal dimensions in the analysis and generation of prosody by introducing dynamic audiovisual units.
Along with the implementation of the presented system, the following topics are discussed in detail: state of the art (virtual actors, visual prosody, speech-driven animation, text-to-visual speech, expressive audiovisual conversion), the recording of an expressive corpus of dramatic attitudes, data analysis and characterization, the generation of audiovisual prosody, and the evaluation of the synthesized audiovisual performances.
Document type :
Theses
Cited literature [153 references]


https://tel.archives-ouvertes.fr/tel-01241413
Contributor : Adela Barbulescu
Submitted on : Tuesday, December 22, 2015 - 9:16:16 PM
Last modification on : Tuesday, October 19, 2021 - 11:22:43 PM
Long-term archiving on : Sunday, April 30, 2017 - 12:00:28 AM

Identifiers

  • HAL Id : tel-01241413, version 1

Citation

Adela Barbulescu. Generation of Audio-Visual Prosody for Expressive Virtual Actors. Graphics [cs.GR]. Université de Grenoble, 2015. English. ⟨tel-01241413v1⟩
