
Conjugate Mixture Models for the Modeling of Visual and Auditory Perception

Vasil Khalidov 1 
1 MISTIS - Modelling and Inference of Complex and Structured Stochastic Systems
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology
Abstract : This thesis considers the modelling of audio-visual perception with a head-like device. It addresses the related problems of audio-visual calibration and of audio-visual object detection, localization, and tracking. A spatio-temporal approach to calibrating the head-like device is proposed, based on probabilistic multimodal trajectory matching. The formalism of conjugate mixture models is introduced, along with a family of efficient optimization algorithms that perform multimodal clustering. One instance of this algorithm family, the conjugate expectation maximization (ConjEM) algorithm, is further improved to gain attractive theoretical properties. Methods for multimodal object detection and for estimating the number of objects are developed, and their theoretical properties are discussed. Finally, the proposed multimodal clustering method is combined with the object detection and object number estimation strategies and with known tracking techniques to perform multimodal multi-object tracking. Performance is demonstrated on simulated data and on a database of realistic audio-visual scenarios (the CAVA database).

Cited literature [125 references]
Contributor : Perception team
Submitted on : Wednesday, December 12, 2012 - 4:34:10 PM
Last modification on : Saturday, March 26, 2022 - 3:18:10 AM
Long-term archiving on : Wednesday, March 13, 2013 - 3:54:17 AM


  • HAL Id : tel-00584080, version 2



Vasil Khalidov. Conjugate Mixture Models for the Modeling of Visual and Auditory Perception. Human-Computer Interaction [cs.HC]. Université Joseph-Fourier - Grenoble I, 2010. English. ⟨tel-00584080v2⟩


