L’analyse factorielle pour la modélisation acoustique des systèmes de reconnaissance de la parole

Abstract : In this thesis, we propose to use techniques based on factor analysis to build acoustic models for automatic speech processing, especially Automatic Speech Recognition (ASR). First, we were interested in reducing the memory footprint of acoustic models. Our factor analysis-based method demonstrated that it is possible to pool the parameters of acoustic models and still maintain performance similar to that obtained with the baseline models. The proposed modeling leads us to decompose the set of acoustic model parameters into independent parameter subsets, which allows great flexibility for particular adaptations (speakers, genre, new tasks, etc.). With current modeling techniques, the state of a Hidden Markov Model (HMM) is represented by a mixture of Gaussians (GMM: Gaussian Mixture Model). We propose as an alternative a vector representation of states: the factors of states. These factors of states enable us to accurately measure the similarity between the states of the HMM by means of a Euclidean distance, for example. Using this vector representation, we propose a simple and effective method for building acoustic models with shared states. This procedure is even more effective when applied to under-resourced languages. Finally, we concentrated our efforts on the robustness of speech recognition systems to acoustic variability, particularly that generated by the environment. In our various experiments, we examined speaker variability, channel variability and additive noise. Through our factor analysis-based approach, we demonstrated the possibility of modeling these different types of acoustic variability as an additive component in the cepstral domain. By compensating for this component in the cepstral vectors, we are able to cancel out the harmful effect it has on speech recognition.
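The two operations named in the abstract, comparing states through a Euclidean distance between their factor vectors and subtracting an additive component from cepstral vectors, can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the state labels and all numeric values are hypothetical, and the estimation of the factors and of the additive component by factor analysis is not shown.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length state-factor vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def closest_state(query, factors):
    """Name of the state whose factor vector is nearest to `query`."""
    return min(factors, key=lambda name: euclidean_distance(query, factors[name]))


def compensate(frames, offset):
    """Subtract an additive component (assumed already estimated) from each cepstral frame."""
    return [[c - o for c, o in zip(frame, offset)] for frame in frames]


# Hypothetical 2-dimensional state factors; labels and values are invented.
state_factors = {"a_1": [0.0, 0.0], "a_2": [3.0, 4.0], "e_1": [0.5, 0.5]}

print(euclidean_distance(state_factors["a_1"], state_factors["a_2"]))  # 5.0
print(closest_state([0.4, 0.6], state_factors))  # e_1
```

In a state-sharing procedure, states whose factor vectors lie close together would be candidates for tying; the clustering rule itself is left open here because the abstract does not specify it.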
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-01059020
Contributor : Abes Star
Submitted on : Friday, August 29, 2014 - 10:17:29 AM
Last modification on : Tuesday, January 14, 2020 - 10:38:05 AM
Long-term archiving on: Sunday, November 30, 2014 - 10:16:44 AM

File

pdf2star-1397650195-Th--se_Bou...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01059020, version 1

Citation

Mohamed Bouallegue. L’analyse factorielle pour la modélisation acoustique des systèmes de reconnaissance de la parole. Autre [cs.OH]. Université d'Avignon, 2013. Français. ⟨NNT : 2013AVIG0197⟩. ⟨tel-01059020⟩

Metrics

Record views : 685
File downloads : 1659