
Model-based 3D hand pose estimation from monocular video

Abstract : In this thesis we propose two methods that automatically recover a full description of the 3D motion of a hand from a monocular video sequence of that hand. Using the information provided by the video, our aim is to determine the full set of kinematic parameters required to describe the pose of the hand skeleton. This set of parameters comprises the angles associated with each joint/articulation and the global position and orientation of the wrist. This problem is extremely challenging: the hand has many degrees of freedom, and self-occlusions are ubiquitous, which makes it difficult to estimate occluded or partially occluded hand parts.

In this thesis, we introduce two novel methods of increasing complexity that improve, to a certain extent, the state of the art for the monocular hand tracking problem. Both are model-based methods that fit a hand model to the image. This process is guided by an objective function that defines an image-based measure of the hand projection given the model parameters. The fitting is achieved through an iterative refinement technique that is based on gradient descent and aims at minimizing the objective function. The two methods differ mainly in the choice of the hand model and of the cost function.

The first method relies on a hand model made of ellipsoids and a simple discrepancy measure based on global color distributions of the hand and the background. The second method uses a triangulated surface model with texture and shading and exploits a robust distance between the synthetic and observed images as the discrepancy measure.

When computing the gradient of the discrepancy measure, particular attention is given to terms related to changes of visibility of the surface near self-occlusion boundaries, which are neglected in existing formulations. Our hand tracking method is not real-time, which makes interactive applications not yet possible. Increases in computing power and improvements to our method might make real-time performance attainable.
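
The model-fitting loop described in the abstract (a parametric model, a discrepancy measure, and gradient descent on the pose parameters) can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the "hand" is a hypothetical planar two-link finger, the discrepancy is a plain sum of squared point distances rather than an image-based measure, the gradient is approximated by finite differences, and occlusion is ignored entirely.

```python
import math

# Hypothetical toy model: a planar two-link "finger" with fixed link lengths.
LINK_LENGTHS = (1.0, 0.8)


def joint_positions(angles):
    """Forward kinematics: 2D positions of the middle joint and the fingertip."""
    a1, a2 = angles
    l1, l2 = LINK_LENGTHS
    x1, y1 = l1 * math.cos(a1), l1 * math.sin(a1)
    x2 = x1 + l2 * math.cos(a1 + a2)
    y2 = y1 + l2 * math.sin(a1 + a2)
    return [(x1, y1), (x2, y2)]


def discrepancy(angles, observed):
    """Stand-in cost: sum of squared distances between model and observed points."""
    return sum(
        (px - ox) ** 2 + (py - oy) ** 2
        for (px, py), (ox, oy) in zip(joint_positions(angles), observed)
    )


def numerical_gradient(angles, observed, eps=1e-6):
    """Central finite differences, used here only to keep the sketch short."""
    grad = []
    for i in range(len(angles)):
        plus = list(angles)
        minus = list(angles)
        plus[i] += eps
        minus[i] -= eps
        diff = discrepancy(plus, observed) - discrepancy(minus, observed)
        grad.append(diff / (2 * eps))
    return grad


def fit(observed, init, step=0.05, iterations=2000):
    """Iteratively refine the joint angles by fixed-step gradient descent."""
    angles = list(init)
    for _ in range(iterations):
        grad = numerical_gradient(angles, observed)
        angles = [a - step * g for a, g in zip(angles, grad)]
    return angles


# Synthetic "observation" generated from known angles, then recovered
# starting from a nearby initial pose.
true_angles = (0.6, 0.4)
observed = joint_positions(true_angles)
estimate = fit(observed, init=(0.3, 0.2))
```

Like any local descent method, this sketch depends on a reasonable initial pose; in a tracking setting the estimate from the previous frame would be a natural starting point.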

Cited literature [218 references]
Contributor : Abes Star
Submitted on : Tuesday, September 6, 2011 - 4:18:43 PM
Last modification on : Wednesday, October 14, 2020 - 4:20:12 AM
Long-term archiving on : Wednesday, December 7, 2011 - 2:41:02 AM


Version validated by the jury (STAR)


  • HAL Id : tel-00619637, version 1



Martin De La Gorce. Model-based 3D hand pose estimation from monocular video. Other. Ecole Centrale Paris, 2009. English. ⟨NNT : 2009ECAP0045⟩. ⟨tel-00619637⟩


