
Estimation algorithms for ambiguous visual models : Three Dimensional Human Modeling and Motion Reconstruction in Monocular Video Sequences

Cristian Sminchisescu 1
1 MOVI - Modeling, localization, recognition and interpretation in computer vision
GRAVIR - IMAG - Graphisme, Vision et Robotique, Inria Grenoble - Rhône-Alpes, CNRS - Centre National de la Recherche Scientifique : FR71
Abstract : This thesis studies the problem of tracking and reconstructing three-dimensional articulated human motion in monocular video sequences. This is an important problem with applications in areas such as markerless motion capture for animation and virtual reality, video indexing, human-computer interaction and intelligent surveillance. A system that aims to reconstruct 3D human motion from single-camera sequences faces difficulties caused by the lossy nature of monocular projection and the high dimensionality required for 3D human modeling. The complexity of human articular structure, shape and physical constraints, together with the large variability in image observations of humans, makes the solution non-trivial. We focus on the general problem of 3D human motion estimation from monocular video streams. Hence, we cannot exploit the simplifications brought by multiple cameras or by strong dynamical models such as walking, and we minimize assumptions about clothing and background structure. In this unrestricted setting, the posterior likelihoods over human pose space are inevitably highly multi-modal, and efficiently locating and tracking the most prominent peaks is a major computational challenge. To address these problems, we propose a model that incorporates realistic kinematics and several important human body constraints, and a principled, robust and probabilistically motivated integration of different visual cues such as contours, intensity and silhouettes. We then derive three novel continuous multiple-hypothesis search techniques that allow either deterministic or stochastic localization of nearby peaks in the high-dimensional human pose likelihood surface: Covariance Scaled Sampling, Eigenvector Tracking and Hypersurface Sweeping, and Hyperdynamic Importance Sampling. The search methods give general, principled approaches to the deterministic exploration of the non-convex error surfaces so often encountered in computational vision problems.
The combined system allows monocular tracking of unconstrained human motions in clutter.

Cited literature [167 references]
Contributor : Team Perception
Submitted on : Thursday, April 7, 2011 - 3:16:42 PM
Last modification on : Friday, June 26, 2020 - 4:04:03 PM
Long-term archiving on : Thursday, November 8, 2012 - 3:40:18 PM


HAL Id : tel-00584112, version 1




Cristian Sminchisescu. Estimation algorithms for ambiguous visual models : Three Dimensional Human Modeling and Motion Reconstruction in Monocular Video Sequences. Human-Computer Interaction [cs.HC]. Institut National Polytechnique de Grenoble - INPG, 2002. English. ⟨tel-00584112⟩


