
Prise en compte de critères acoustiques pour la synthèse de la parole (Taking acoustic criteria into account for speech synthesis)

Abstract: This thesis concerns text-to-speech synthesis, and more particularly the corpus-based approach. In recent years, this approach, which concatenates acoustic segments drawn from large databases, has become increasingly popular. Selecting the units that best fit the text to be synthesized yields a synthesized signal whose naturalness is rather well preserved. The quality of speech synthesized by corpus-based methods depends closely on two factors: the corpus used for synthesis and the unit selection algorithm. Despite the notable gain in quality achieved with this technology, corpus-based speech synthesis cannot guarantee speech of constant quality over an entire utterance. This is mainly due to the lack of acoustic control in existing corpus-based speech synthesis systems. The main objective of this thesis is therefore to introduce a mechanism that allows better acoustic control during synthesis.
The proposed method uses statistical approaches to generate a smooth acoustic target from which the sequence of synthesis units is selected. This target is derived from acoustic models, namely context-dependent senone models, estimated during a training phase. We first propose a selection algorithm based only on this acoustic target. The selection method is then modified to better control fundamental frequency information. This unit selection module is also combined with a pre-selection module to drastically reduce the computational load. Formal listening tests show that the proposed method significantly reduces acoustic discontinuities during concatenation.

The proposed method is also applied to acoustic database reduction, where it enables a compression of about 60% of the acoustic database with no perceptible decrease in speech quality.
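
The selection step described in the abstract can be illustrated with a generic dynamic-programming (Viterbi) search. This is only a minimal sketch and not the algorithm of the thesis: it assumes the acoustic target is already available as one feature vector per unit position (in the thesis it is generated from senone models), it uses squared Euclidean distance for both the target cost and the concatenation cost, and the names `select_units`, `sq_dist` and `join_weight` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_units(
    targets: List[Vector],
    candidates: List[List[Vector]],
    join_weight: float = 1.0,
) -> Tuple[List[int], float]:
    """Pick one candidate per position, minimising target + join costs.

    Assumption (illustrative only): each candidate unit is summarised by a
    single feature vector, and the join cost is the distance between the
    vectors of consecutive units.
    """
    if not targets or len(targets) != len(candidates):
        raise ValueError("need one non-empty candidate list per target")
    if any(not c for c in candidates):
        raise ValueError("every position needs at least one candidate")

    # best[k]: cheapest cost of any path ending in candidate k at position i
    best = [sq_dist(c, targets[0]) for c in candidates[0]]
    back: List[List[int]] = []  # back-pointers for positions 1..n-1

    for i in range(1, len(targets)):
        new_best, pointers = [], []
        for cand in candidates[i]:
            # Cheapest predecessor, including the concatenation cost.
            prev_k, prev_cost = min(
                (
                    (k, best[k] + join_weight * sq_dist(prev, cand))
                    for k, prev in enumerate(candidates[i - 1])
                ),
                key=lambda kc: kc[1],
            )
            new_best.append(prev_cost + sq_dist(cand, targets[i]))
            pointers.append(prev_k)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path back from the last position.
    k = min(range(len(best)), key=best.__getitem__)
    total = best[k]
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path, total
```

For example, with targets `[[0.0], [1.0], [2.0]]` and candidates `[[[0.0], [5.0]], [[4.0], [1.0]], [[2.0], [9.0]]]`, the function returns the candidate indices `[0, 1, 0]` with a total cost of `2.0`. Raising `join_weight` favours smoother joins over closeness to the target.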
Document type: Theses

Cited literature: 87 references

https://tel.archives-ouvertes.fr/tel-00111952
Contributor: Soufiane Rouibia
Submitted on: Monday, November 6, 2006 - 4:19:26 PM
Last modification on: Friday, October 23, 2020 - 4:37:19 PM
Long-term archiving on: Thursday, September 20, 2012 - 2:25:22 PM

Identifiers

  • HAL Id : tel-00111952, version 1

Citation

Soufiane Rouibia. Prise en compte de critères acoustiques pour la synthèse de la parole. Autre [cs.OH]. Université Rennes 1, 2006. Français. ⟨tel-00111952⟩
