
Prise en compte de critères acoustiques pour la synthèse de la parole (Taking acoustic criteria into account in speech synthesis)

Abstract : This thesis concerns text-to-speech synthesis, and in particular the corpus-based approach. In the last few years, this approach, which concatenates acoustic segments drawn from large databases, has become increasingly popular. Selecting the units that best fit the text to be synthesized yields a synthesized signal that preserves naturalness rather well. The quality of corpus-based synthesized speech depends closely on two things: the corpus used for synthesis and the unit selection algorithm. Despite the notable gain in quality achieved with this technology, corpus-based speech synthesis cannot guarantee constant quality over an entire utterance. This is mainly due to the lack of acoustic control in existing corpus-based speech synthesis systems. The main objective of this thesis is therefore to introduce a mechanism that allows better acoustic control during synthesis.
The proposed method uses statistical approaches to generate a smooth acoustic target from which the sequence of synthesis units is selected. This target is derived from acoustic models, namely context-dependent senone models, estimated during a training phase. We first propose a selection algorithm based only on this acoustic target. The selection method is then modified to better control fundamental frequency information. The unit selection module is also combined with a pre-selection module to drastically reduce the computational load. Formal listening tests show that the proposed method significantly reduces acoustic discontinuities during concatenation.

The proposed method is also applied to acoustic database reduction and enables a compression of about 60% of the acoustic database without a perceptible decrease in speech quality.
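
The abstract does not give the cost functions or the search procedure used in the thesis. As a rough illustration of selecting a unit sequence against an acoustic target, the following is a minimal sketch of a generic dynamic-programming unit selection search, not the algorithm of the thesis. All names (`select_units`, `euclid`, `join_weight`), the Euclidean costs, and the representation of a unit as a pair of boundary feature vectors are assumptions made for the example.

```python
from math import sqrt


def euclid(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_units(targets, candidates, join_weight=1.0):
    """Choose one candidate per position, minimizing target plus join cost.

    targets:    one target feature vector per position (assumed given)
    candidates: per position, a list of (start_vec, end_vec) pairs holding
                hypothetical boundary features of each stored unit
    Returns (total_cost, list of chosen candidate indices).
    """
    # Target cost: distance from the target to the unit's boundary midpoint.
    def target_cost(target, unit):
        start, end = unit
        mid = [(s + e) / 2 for s, e in zip(start, end)]
        return euclid(target, mid)

    # Join cost: acoustic jump from the end of one unit to the start of the next.
    def join_cost(prev_unit, unit):
        return join_weight * euclid(prev_unit[1], unit[0])

    # best[j] = cheapest cost of any path ending in candidate j at this position.
    best = [target_cost(targets[0], u) for u in candidates[0]]
    back = []  # back[i - 1][k] = best predecessor of candidate k at position i
    for i in range(1, len(targets)):
        new_best, pointers = [], []
        for unit in candidates[i]:
            costs = [best[j] + join_cost(prev, unit)
                     for j, prev in enumerate(candidates[i - 1])]
            j_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[j_min] + target_cost(targets[i], unit))
            pointers.append(j_min)
        best = new_best
        back.append(pointers)

    # Backtrack from the cheapest final candidate.
    k = min(range(len(best)), key=best.__getitem__)
    total = best[k]
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return total, path


# Toy one-dimensional example with two positions and two candidates each.
targets = [(0.0,), (1.0,)]
candidates = [
    [((0.0,), (0.0,)), ((0.0,), (1.0,))],
    [((1.0,), (1.0,)), ((5.0,), (5.0,))],
]
total, path = select_units(targets, candidates)
print(total, path)  # 0.5 [1, 0]
```

In this toy case the first candidate at position 0 matches its target exactly, yet the search picks the second one because it joins the following unit without a jump. That trade-off between fidelity to the target and continuity at the joins is the kind of behavior a join cost is meant to produce.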

Cited literature : 87 references
Contributor : Soufiane Rouibia
Submitted on : Monday, November 6, 2006 - 4:19:26 PM
Last modification on : Friday, October 23, 2020 - 4:37:19 PM
Long-term archiving on : Thursday, September 20, 2012 - 2:25:22 PM


  • HAL Id : tel-00111952, version 1



Soufiane Rouibia. Prise en compte de critères acoustiques pour la synthèse de la parole. Autre [cs.OH]. Université Rennes 1, 2006. Français. ⟨tel-00111952⟩


