
L’analyse factorielle pour la modélisation acoustique des systèmes de reconnaissance de la parole

Abstract : In this thesis, we propose using factor analysis-based techniques to build acoustic models for automatic speech processing, in particular Automatic Speech Recognition (ASR). First, we addressed reducing the memory footprint of acoustic models. Our factor analysis-based method showed that the parameters of acoustic models can be pooled while maintaining performance similar to that of the baseline models. The proposed modeling decomposes the full set of acoustic model parameters into independent parameter subsets, which allows great flexibility for specific adaptations (speakers, genre, new tasks, etc.). With current modeling techniques, the state of a Hidden Markov Model (HMM) is represented by a mixture of Gaussians (GMM: Gaussian Mixture Model). As an alternative, we propose a vector representation of states: the state factors. These state factors make it possible to accurately measure the similarity between HMM states, for example by means of a Euclidean distance. Using this vector representation, we propose a simple and effective method for building acoustic models with shared states. This procedure is even more effective when applied to under-resourced languages. Finally, we focused on the robustness of speech recognition systems to acoustic variability, particularly variability caused by the environment. In our experiments, we examined speaker variability, channel variability, and additive noise. Through our factor analysis-based approach, we demonstrated that these different types of acoustic variability can be modeled as an additive component in the cepstral domain. By removing this component from the cepstral vectors, we can cancel out its harmful effect on speech recognition.
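The two vector operations mentioned in the abstract can be illustrated with a minimal sketch. It only assumes that each HMM state already has a factor vector and that the additive nuisance component in the cepstral domain has already been estimated; the factor analysis estimation itself is not shown. The function names, state names, threshold, and toy values are hypothetical and are not taken from the thesis.

```python
import math


def closest_state_pairs(state_factors, threshold):
    """List pairs of states whose factor vectors lie within `threshold`.

    `state_factors` maps a (hypothetical) state name to its factor vector.
    Pairs are returned sorted by Euclidean distance, as candidates for
    state sharing.
    """
    names = sorted(state_factors)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(state_factors[a], state_factors[b])
            if d <= threshold:
                pairs.append((a, b, d))
    return sorted(pairs, key=lambda p: p[2])


def compensate(cepstral_frames, nuisance):
    """Subtract an additive nuisance vector from every cepstral frame."""
    return [[c - n for c, n in zip(frame, nuisance)]
            for frame in cepstral_frames]


# Toy data: three states with 2-dimensional factor vectors (illustrative only).
factors = {"a_1": [0.0, 0.0], "a_2": [3.0, 4.0], "b_1": [0.0, 1.0]}
candidates = closest_state_pairs(factors, threshold=2.0)

# Toy cepstral frames and an assumed, already estimated nuisance component.
clean = compensate([[1.0, 2.0], [3.0, 4.0]], nuisance=[0.5, 1.0])
```

In this toy setup, only the pair whose factor vectors are close would be proposed for sharing, and the compensation step is a plain per-frame subtraction.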
Contributor : ABES STAR
Submitted on : Friday, August 29, 2014 - 10:17:29 AM
Last modification on : Tuesday, January 14, 2020 - 10:38:05 AM
Long-term archiving on : Sunday, November 30, 2014 - 10:16:44 AM


Version validated by the jury (STAR)


  • HAL Id : tel-01059020, version 1



Mohamed Bouallegue. L’analyse factorielle pour la modélisation acoustique des systèmes de reconnaissance de la parole. Other [cs.OH]. Université d'Avignon, 2013. French. ⟨NNT : 2013AVIG0197⟩. ⟨tel-01059020⟩


