MeLos: Analysis and Modelling of Speech Prosody and Speaking Style

Abstract : This thesis addresses the modelling of speech prosody for speech synthesis and presents MeLos, a complete system for the analysis and modelling of speech prosody, "the music of speech". The objective is to model the strategies, alternatives, and speaking style of a speaker for natural, expressive, and varied speech synthesis. The study presents original contributions, with special attention paid to combining theoretical linguistics and statistical modelling to provide a complete speech prosody system. A unified discrete/continuous context-dependent HMM is presented to model the symbolic and acoustic characteristics of speech prosody: 1) A rich description of text characteristics, based on a linguistic processing chain that includes surface and deep syntactic parsing, is proposed to refine the modelling of speech prosody in context. 2) Segmental HMMs and Dempster-Shafer fusion are used to balance linguistic and metric constraints in the production of a pause. 3) A trajectory model is proposed, based on the stylization and simultaneous modelling of short- and long-term F0 variations over various temporal domains. The proposed system is used to model the strategies, alternatives, and speaking style of a speaker, and is extended to model the speaking style of an arbitrary number of speakers using shared-context-dependent modelling and speaker normalization techniques.
Contributor : Nicolas Obin
Submitted on : Saturday, May 5, 2012 - 10:39:50 PM
Last modification on : Friday, August 31, 2018 - 9:14:29 AM
Long-term archiving on : Friday, November 30, 2012 - 11:15:09 AM


  • HAL Id : tel-00694687, version 1


Nicolas Obin. MeLos: Analysis and Modelling of Speech Prosody and Speaking Style. Signal and Image processing. Université Pierre et Marie Curie - Paris VI, 2011. English. ⟨tel-00694687v1⟩


