Deep Regression Models and Computer Vision Applications for Multiperson Human-Robot Interaction

Stéphane Lathuilière 1
1 PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract : In order to interact with humans, robots need to perform basic perception tasks such as face detection, human pose estimation or speech recognition. However, in order to have a natural interaction with humans, the robot needs to model high-level concepts such as speech turns, focus of attention or interactions between participants in a conversation. In this manuscript, we follow a top-down approach. First, we present two high-level methods that model collective human behaviors. On the one hand, we propose a model able to recognize activities that are performed jointly by different groups of people, such as queueing or talking. Our approach handles the general case where several group activities can occur simultaneously and in sequence. On the other hand, we introduce a novel neural network-based reinforcement learning approach for robot gaze control. Our approach enables a robot to learn and adapt its gaze control strategy in the context of human-robot interaction. The robot is able to learn to focus its attention on groups of people from its own audio-visual experiences. Second, we study in detail deep learning approaches for regression problems. Regression problems are crucial in the context of human-robot interaction in order to obtain reliable information about head and body poses or the age of the persons facing the robot. These contributions are general and can be applied in many different contexts. We first propose to couple a Gaussian mixture of linear inverse regressions with a convolutional neural network. We then introduce a Gaussian-uniform mixture model in order to make the training algorithm more robust to noisy annotations. Finally, we perform a large-scale study to measure the impact of several architecture choices and extract practical recommendations when using deep learning approaches in regression tasks.
For each of these contributions, a strong experimental validation has been performed with real-time experiments on the NAO robot or on large and diverse datasets.
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-01801807
Contributor : Abes Star
Submitted on : Monday, October 29, 2018 - 1:39:06 PM
Last modification on : Thursday, January 24, 2019 - 3:28:02 PM
Long-term archiving on : Wednesday, January 30, 2019 - 3:21:13 PM

File

LATHUILIERE_2018_archivage.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01801807, version 2

Citation

Stéphane Lathuilière. Deep Regression Models and Computer Vision Applications for Multiperson Human-Robot Interaction. Human-Computer Interaction [cs.HC]. Université Grenoble Alpes, 2018. English. ⟨NNT : 2018GREAM026⟩. ⟨tel-01801807v2⟩
