Turn-taking enhancement in spoken dialogue systems with reinforcement learning

Abstract : Incremental dialogue systems are able to process the user’s speech as it is spoken (without waiting for the end of a sentence before starting to process it). This makes them able to take the floor whenever they decide to (the user can also speak whenever she wants, even if the system is still holding the floor). As a consequence, they are able to perform a richer set of turn-taking behaviours compared to traditional systems. Several contributions are described in this thesis with the aim of showing that dialogue systems’ turn-taking capabilities can be automatically improved from data. First, human-human dialogue is analysed and a new taxonomy of turn-taking phenomena in human conversation is established. Based on this work, the different phenomena are analysed and some of them are selected for replication in a human-machine context (the ones that are more likely to improve a dialogue system’s efficiency). Then, a new architecture for incremental dialogue systems is introduced with the aim of transforming a traditional dialogue system into an incremental one at a low cost (also separating the turn-taking manager from the dialogue manager). To be able to perform the first tests, a simulated environment has been designed and implemented. It is able to replicate user and ASR behaviour that are specific to incremental processing, unlike existing simulators. Combined together, these contributions led to the establishement of a rule-based incremental dialogue strategy that is shown to improve the dialogue efficiency in a task-oriented situation and in simulation. A new reinforcement learning strategy has also been proposed. It is able to autonomously learn optimal turn-taking behavious throughout the interactions. The simulated environment has been used for training and for a first evaluation, where the new data-driven strategy is shown to outperform both the non-incremental and rule-based incremental strategies. In order to validate these results in real dialogue conditions, a prototype through which the users can interact in order to control their smart home has been developed. At the beginning of each interaction, the turn-taking strategy is randomly chosen among the non-incremental, the rule-based incremental and the reinforcement learning strategy (learned in simulation). A corpus of 206 dialogues has been collected. The results show that the reinforcement learning strategy significantly improves the dialogue efficiency without hurting the user experience (slightly improving it, in fact).
Document type :
Complete list of metadatas

Cited literature [117 references]  Display  Hide  Download

Contributor : Abes Star <>
Submitted on : Thursday, May 25, 2017 - 3:10:18 AM
Last modification on : Friday, March 22, 2019 - 11:34:07 AM


Version validated by the jury (STAR)


  • HAL Id : tel-01498847, version 2



Hatim Khouzaimi. Turn-taking enhancement in spoken dialogue systems with reinforcement learning. Artificial Intelligence [cs.AI]. Université d'Avignon, 2016. English. ⟨NNT : 2016AVIG0213⟩. ⟨tel-01498847v2⟩



Record views


Files downloads