Construction et stratégie d'exploitation des réseaux de confusion en lien avec le contexte applicatif de la compréhension de la parole

Abstract : The work presented in this PhD deals with the confusion networks as a compact and structured representation of multiple aligned recognition hypotheses produced by a speech recognition system and used by different applications. The confusion networks (CN) are constructed from word graphs and structure information as a sequence of classes containing several competing word hypothesis. In this work we focus on the problem of robust understanding from spontaneous speech input in a dialogue application, using CNs as structured representation of recognition hypotheses for the spoken language understanding module. We use France Telecom spoken dialogue system for customer care. Two issues inherent to this context are tackled. A dialogue system does not only have to recognize what a user says but also to understand the meaning of his request and to act upon it. From the user's point of view, system performance is more accurately represented by the performance of the understanding process than by speech recognition performance only. Our work aims at improving the performance of the understanding process. Using a real application implies being able to process real heterogeneous data. An utterance can be more or less noisy, in the domain or out of the domain of the application, covered or not by the semantic model of the application, etc. A question raised by the variability of the data is whether applying the same processes to the entire data set, as done in classical approaches, is a suitable solution. This work follows a double perspective : to improve the CN construction algorithm with the intention of optimizing the understanding process and to propose an adequate strategy for the use of CN in a real application. Following a detailed analysis of two CN construction algorithms on a test set collected using the France Telecom customer care service, we decided to use the "pivot" algorithm for our work. We present a modified and adapted version of this algorithm. The new algorithm introduces different processing techniques for the words which are important for the understanding process. As for the variability of the real data the application has to process, we present a new multiple level decision strategy aiming at applying different processing techniques for different utterance categories. We show that it is preferable to process multiple recognition hypotheses only on utterances having a valid interpretation. This strategy optimises computation time and yields better global performance
Document type :
Informatique [cs]. Université d'Avignon, 2008. Français. <NNT : 2008AVIG0176>
Contributor : Abes Star <>
Submitted on : Wednesday, October 5, 2011 - 12:37:36 PM
Last modification on : Wednesday, October 5, 2011 - 2:02:35 PM
Document(s) archivé(s) le : Friday, January 6, 2012 - 2:26:12 AM




  • HAL Id : tel-00629195, version 1



Bogdan Minescu. Construction et stratégie d'exploitation des réseaux de confusion en lien avec le contexte applicatif de la compréhension de la parole. Informatique [cs]. Université d'Avignon, 2008. Français. <NNT : 2008AVIG0176>. <tel-00629195>




Notice views


Document downloads