Modélisation Sinusoïdale à Long Terme du Signal de Parole (Long-Term Sinusoidal Modeling of the Speech Signal)

Abstract : The sinusoidal model of speech signals is usually defined on a “short-term” basis, i.e., on successive frames of about 10–30 ms. In this thesis, we add to this usual spectral modeling a new level of modeling along the temporal axis: the goal is to model the temporal trajectories of the sinusoidal parameters (amplitudes and phases) over durations that are significantly longer than the short-term frames (typically several hundred ms; only continuously voiced sections of speech are considered in this study). To this end, we propose several long-term models based on discrete cosine and polynomial functions. These models are fitted to the parameter trajectories by weighted least-squares minimisation, with weights derived from perceptual criteria adapted to long-term processing. For this task, a series of iterative algorithms is proposed and tested. The proposed long-term approach is shown to provide an efficient and sparse representation of the dynamics of voiced speech signals.
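As an illustration of the kind of fit the abstract describes, the following is a minimal sketch of a weighted least-squares fit of a cosine-series model to a single amplitude trajectory. It is not the thesis's algorithm: the function names `dct_basis` and `fit_weighted_ls`, the DCT-II style basis, the model order, the synthetic trajectory, and the linearly increasing weights are all assumptions chosen for illustration. In particular, the weights here are arbitrary placeholders, not the perceptual weights used in the thesis.

```python
import numpy as np

def dct_basis(n_frames, order):
    """Return an (n_frames, order + 1) matrix of cosine basis functions.

    Assumed DCT-II style basis: cos(pi * k * (n + 0.5) / n_frames).
    """
    n = np.arange(n_frames)
    k = np.arange(order + 1)
    return np.cos(np.pi * np.outer(n + 0.5, k) / n_frames)

def fit_weighted_ls(trajectory, weights, order):
    """Fit cosine-series coefficients to a trajectory by weighted least squares.

    Minimises sum_n weights[n] * (trajectory[n] - model[n])**2.
    """
    basis = dct_basis(len(trajectory), order)
    sqrt_w = np.sqrt(weights)
    # Scaling rows by sqrt(w) turns the weighted problem into an ordinary one.
    coeffs, *_ = np.linalg.lstsq(basis * sqrt_w[:, None],
                                 trajectory * sqrt_w, rcond=None)
    return coeffs, basis @ coeffs

# Synthetic amplitude trajectory over 50 short-term frames (hypothetical data).
n_frames = 50
t = np.arange(n_frames)
trajectory = 1.0 + 0.3 * np.cos(np.pi * 2 * (t + 0.5) / n_frames)

# Placeholder weights; a perceptual criterion would supply these in practice.
weights = np.linspace(0.5, 1.5, n_frames)

coeffs, model = fit_weighted_ls(trajectory, weights, order=4)
max_error = float(np.max(np.abs(trajectory - model)))
```

Here five coefficients stand in for fifty frame-level values, which is the sense in which such a long-term model can be sparse. The polynomial variant mentioned in the abstract could be sketched the same way by swapping the basis matrix.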
Document type :
Theses
Signal and Image processing. Institut National Polytechnique de Grenoble - INPG, 2007. French


https://tel.archives-ouvertes.fr/tel-00211294
Contributor : Patricia Reynier
Submitted on : Monday, January 21, 2008 - 2:01:01 PM
Last modification on : Wednesday, July 22, 2015 - 4:31:44 PM

Identifiers

  • HAL Id : tel-00211294, version 1

Citation

Mohammad Firouzmand. Modélisation Sinusoïdale à Long Terme du Signal de Parole. Signal and Image processing. Institut National Polytechnique de Grenoble - INPG, 2007. French. <tel-00211294>
