
Semi-automatic extraction of vocal tract movements from cineradiographic data

Abstract: The work described in this dissertation is grounded in two observations. First, long-existing cineradiographic sequences of the vocal tract recorded during continuous speech remain under-exploited, because manually marking a complete sequence is too laborious a task. Second, cineradiographic images are generally well framed and therefore suited to the retro-marking algorithm, which builds a function mapping implicit parameters, extracted from the video signal, to explicit, controlled geometrical parameters. The one-pass semi-automatic technique for extracting vocal tract contours presented here adapts this algorithm and minimizes user interaction. For a given sequence and articulator, a first step consists of manually processing a small number of key images to define the geometrical features. An automatic step then indexes the full sequence against these key images, based on low-frequency DCT components, and associates a geometrical marking with each frame. This processing is applied independently to each articulator (tongue, velum, lips, etc.); the resulting contours are then combined to obtain the shape of the whole vocal tract. Mid-sagittal sections and area functions are furthermore computed for the entire sequence.
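The abstract does not give an implementation, but the indexing step it describes (matching each frame of the sequence to its nearest key image in a low-frequency DCT feature space) can be sketched as follows. This is an illustrative reconstruction, not the thesis's actual code: the function names, the 8x8 coefficient cut-off, and the Euclidean nearest-neighbour rule are assumptions; frames are assumed to be grayscale images stored as 2-D NumPy arrays.

```python
import numpy as np
from scipy.fft import dctn


def dct_signature(frame, k=8):
    """Low-frequency signature of a frame: the k x k top-left
    block of its orthonormal 2-D DCT, flattened to a vector."""
    return dctn(frame, norm="ortho")[:k, :k].ravel()


def index_sequence(frames, key_frames, k=8):
    """Assign each frame of the sequence the index of its nearest
    key image, measured by Euclidean distance between DCT signatures.
    The geometrical marking of that key image can then be propagated
    to the frame."""
    key_sigs = np.stack([dct_signature(f, k) for f in key_frames])
    indices = []
    for f in frames:
        sig = dct_signature(f, k)
        indices.append(int(np.argmin(np.linalg.norm(key_sigs - sig, axis=1))))
    return indices
```

Because the DCT is orthonormal, truncating to low frequencies compares frames on their coarse spatial structure while discarding fine detail and noise, which is what makes the key-image association robust across a long sequence.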

In the first part, we describe the proposed method and evaluate the reconstruction error. The second part aims to validate the estimated geometrical configurations phonetically and to determine whether our measurements are precise enough to be associated with the temporal and spectral features of speech. Through a study of vowels, we show that we recover standard phonetic results. Formants are then considered using two competing approaches: a linear model and an acoustic one. With the latter, by introducing a two-subband amplitude modulation extracted from the original audio signal, intelligible speech is synthesized. Finally, two further studies are carried out, focusing on consonants and on the velum. All these results show that the proposed method can be used to phonetically exploit these long cineradiographic speech sequences.
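The two-subband amplitude modulation mentioned above is not specified further in the abstract; one common way to obtain such a modulation signal is to split the audio into a low and a high band and take each band's amplitude envelope via the Hilbert transform. The sketch below makes that assumption; the 1 kHz band boundary and the filter order are illustrative choices, not values taken from the thesis.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert


def two_band_envelopes(x, fs, cutoff=1000.0):
    """Split signal x (sample rate fs, in Hz) into a low and a high
    band around `cutoff`, and return the amplitude envelope of each
    band, computed as the magnitude of its analytic signal."""
    sos_lo = butter(4, cutoff, btype="low", fs=fs, output="sos")
    sos_hi = butter(4, cutoff, btype="high", fs=fs, output="sos")
    lo = sosfiltfilt(sos_lo, x)   # zero-phase filtering keeps the
    hi = sosfiltfilt(sos_hi, x)   # envelopes time-aligned with x
    return np.abs(hilbert(lo)), np.abs(hilbert(hi))
```

In a synthesis setting of the kind described, these two envelopes could then modulate carriers generated from the estimated vocal tract geometry, re-injecting the source's coarse amplitude dynamics into the acoustic model.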
Contributor: Pierre Badin
Submitted on: Tuesday, January 8, 2008 - 5:46:39 PM
Last modification on: Friday, November 6, 2020 - 4:38:51 AM
Long-term archiving on: Tuesday, April 13, 2010 - 4:44:22 PM


  • HAL Id : tel-00203082, version 1



Julie Fontecave. Extraction semi-automatique des mouvements du tractus vocal à partir de données cinéradiographiques. Traitement du signal et de l'image [eess.SP]. Institut National Polytechnique de Grenoble - INPG, 2006. Français. ⟨tel-00203082⟩


