Modèles de langage ad hoc pour la reconnaissance automatique de la parole [Ad hoc language models for automatic speech recognition]

Abstract : The three pillars of an automatic speech recognition system are the lexicon, the language model and the acoustic model. The lexicon lists all the words that can be transcribed, each associated with its pronunciation. The acoustic model indicates how the phone units are pronounced, and the language model captures how words are linked together. In modern automatic speech recognition systems, the acoustic and language models are statistical. Estimating them requires large volumes of selected, standardized and annotated data.

At present, the Web is by far the largest textual corpus available for English and French. The data it holds can potentially be used to build the vocabulary and to estimate and adapt the language model. The work presented here proposes new approaches that take advantage of this resource for language modeling.

The document is organized into two parts. The first deals with the use of Web data to dynamically update the lexicon of the automatic speech recognition system. The proposed approach consists in extending the lexicon dynamically and locally, only when unknown words appear in the speech. New words are extracted from the Web through queries submitted to Web search engines. Their phonetization is obtained with an automatic grapheme-to-phoneme transcriber.

The second part presents a new way of handling the information contained on the Web, relying on concepts from possibility theory. A Web-based possibilistic language model is proposed. It estimates the possibility of a word sequence from knowledge of the existence of its sub-sequences on the Web. A probabilistic Web-based language model is also proposed. It relies on Web document counts to estimate n-gram probabilities. Several approaches for combining these models with classical models are proposed. The results show that combining probabilistic and possibilistic models gives better results than classical probabilistic models alone. In addition, the models estimated from Web data perform better than those estimated on a corpus.
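The two Web-based ideas mentioned in the abstract, estimating n-gram probabilities from document counts and scoring a word sequence by whether its sub-sequences exist on the Web, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `TOY_COUNTS` table stands in for search-engine document counts, the function names are hypothetical, and the existence score is a simple stand-in rather than the possibility measure defined in the thesis.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical document counts, standing in for the number of Web documents
# that a search engine would report for each word sequence.
TOY_COUNTS: Dict[Tuple[str, ...], int] = {
    ("speech",): 1000,
    ("recognition",): 800,
    ("speech", "recognition"): 400,
    ("recognition", "system"): 200,
    ("speech", "recognition", "system"): 100,
}


def doc_count(ngram: Sequence[str]) -> int:
    """Return the (toy) number of documents containing the n-gram."""
    return TOY_COUNTS.get(tuple(ngram), 0)


def ngram_probability(history: Sequence[str], word: str) -> float:
    """Relative-frequency estimate of P(word | history) from document counts.

    Assumption: a plain count ratio, with no smoothing or back-off.
    """
    denominator = doc_count(history)
    if denominator == 0:
        return 0.0
    ratio = doc_count(tuple(history) + (word,)) / denominator
    # Real hit counts can be inconsistent, so the ratio is capped at 1.
    return min(1.0, ratio)


def existence_score(words: Sequence[str], order: int) -> float:
    """Fraction of the sequence's n-grams of the given order that were seen.

    Illustrative only: it captures "do the sub-sequences exist?" without
    claiming to reproduce the possibilistic model of the thesis.
    """
    ngrams = [tuple(words[i:i + order]) for i in range(len(words) - order + 1)]
    if not ngrams:
        return 0.0
    seen = sum(1 for ngram in ngrams if doc_count(ngram) > 0)
    return seen / len(ngrams)


if __name__ == "__main__":
    print(ngram_probability(("speech",), "recognition"))
    print(existence_score(["speech", "recognition", "system"], 2))
```

The count ratio depends on how often a sequence is observed, whereas the existence score only depends on whether it is observed at all. That contrast is the intuition behind combining the two kinds of model.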
Document type :
Theses
Cited literature [38 references]

https://tel.archives-ouvertes.fr/tel-00954220
Contributor : ABES STAR
Submitted on : Friday, February 28, 2014 - 4:57:08 PM
Last modification on : Tuesday, January 14, 2020 - 10:38:05 AM
Long-term archiving on : Wednesday, May 28, 2014 - 8:50:11 PM

File

These_Stanislas_Oger.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-00954220, version 1

Citation

Stanislas Oger. Modèles de langage ad hoc pour la reconnaissance automatique de la parole. Autre [cs.OH]. Université d'Avignon, 2011. Français. ⟨NNT : 2011AVIG0193⟩. ⟨tel-00954220⟩

Metrics

Record views

407

Files downloads

1162