Study of unit selection text-to-speech synthesis algorithms

Abstract : This PhD thesis focuses on automatic speech synthesis, and more specifically on unit selection. It provides an in-depth analysis and diagnosis of the unit selection algorithm (a lattice search algorithm). The importance of solution optimality is discussed, and a new unit selection implementation based on an A* algorithm is presented. Three cost function enhancements are also presented. The first is a new way, within the target cost, to minimize large spectral differences by selecting sequences of candidate units that minimize a mean cost instead of an absolute one. This cost is tested on a phonemic duration distance but can be applied to other distances. The second proposal is a target sub-cost addressing intonation, based on coefficients extracted through a generalized version of Fujisaki's command-response model. This model uses gamma functions, called atoms, to model F0. Finally, the third contribution is a penalty system that aims to enhance the concatenation cost. It penalizes units according to classes that define the risk of a concatenation artifact occurring when concatenating on a phone of that class. This system differs from others in the literature in that it is tempered by a fuzzy function that softens penalties for units with low concatenation costs.
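To make the lattice search mentioned in the abstract concrete, here is a minimal sketch of unit selection as a shortest-path problem over a candidate lattice. It uses a Viterbi-style dynamic program, not the A* implementation proposed in the thesis. The function name `select_units` and the signatures of `target_cost` and `concat_cost` are assumptions made for illustration and are not taken from the thesis.

```python
from typing import Callable, List, Sequence, Tuple


def select_units(
    candidates: Sequence[Sequence[str]],
    target_cost: Callable[[int, str], float],
    concat_cost: Callable[[str, str], float],
) -> Tuple[List[str], float]:
    """Return the cheapest unit sequence through the lattice and its cost.

    candidates[i] lists the candidate units for target position i.
    target_cost(i, u) and concat_cost(prev, u) are hypothetical cost
    functions supplied by the caller.
    """
    if not candidates or any(len(col) == 0 for col in candidates):
        raise ValueError("every target position needs at least one candidate")

    # best[j]: cheapest cumulative cost of a path ending at candidate j
    # of the current position.
    best = [target_cost(0, u) for u in candidates[0]]
    back: List[List[int]] = []  # back[i-1][j]: best predecessor index

    for i in range(1, len(candidates)):
        prev_units = candidates[i - 1]
        new_best: List[float] = []
        pointers: List[int] = []
        for u in candidates[i]:
            costs = [best[k] + concat_cost(p, u) for k, p in enumerate(prev_units)]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[k_min] + target_cost(i, u))
            pointers.append(k_min)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path back from the last position.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j] for i, j in enumerate(path)], total
```

The sketch returns the sequence that minimizes the sum of target and concatenation costs. The enhancements described in the abstract (mean-based target cost, intonation sub-cost, fuzzy concatenation penalties) would change the two cost functions, not this search loop.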

Cited literature : 36 references
Contributor : Abes Star
Submitted on : Wednesday, January 18, 2017 - 3:57:10 PM
Last modification on : Friday, January 11, 2019 - 2:27:03 PM
Long-term archiving on : Wednesday, April 19, 2017 - 3:13:24 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01439413, version 1


David Guennec. Study of unit selection text-to-speech synthesis algorithms. Data Structures and Algorithms [cs.DS]. Université Rennes 1, 2016. English. ⟨NNT : 2016REN1S055⟩. ⟨tel-01439413⟩


