E. H. Adelson, C. H. Anderson, J. R. Bergen, P. J. Burt, and J. M. Ogden, Pyramid methods in image processing, RCA engineer, vol.29, issue.6, p.15, 1984.

I. Akhter and M. J. Black, Pose-conditioned joint angle limits for 3D human pose reconstruction, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, p.71, 2015.
DOI : 10.1109/cvpr.2015.7298751

A. Aldoma, F. Tombari, J. Prankl, A. Richtsfeld, L. D. Stefano et al., Multimodal cue integration through hypotheses verification for RGB-D object recognition and 6DOF pose estimation, Robotics and Automation (ICRA), 2013 IEEE International Conference on, p.111, 2013.
DOI : 10.1109/icra.2013.6630859

M. Andriluka, S. Roth, and B. Schiele, Monocular 3d pose estimation and tracking by detection, Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, p.105, 2010.
DOI : 10.1109/cvpr.2010.5540156

URL : http://lmb.informatik.uni-freiburg.de/lectures/seminar_brox/seminar_ws1011/cvpr10_andriluka.pdf

M. Andriluka, L. Pishchulin, P. Gehler, and B. Schiele, 2d human pose estimation: New benchmark and state of the art analysis, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3686-3693, 2014.

C. Astua, R. Barber, J. Crespo, and A. Jardon, Object detection techniques applied on mobile robot semantic navigation, Sensors, vol.14, issue.4, p.111, 2014.
DOI : 10.3390/s140406734

URL : http://www.mdpi.com/1424-8220/14/4/6734/pdf

M. Aubry, D. Maturana, A. A. Efros, B. C. Russell, and J. Sivic, Seeing 3d chairs: exemplar part-based 2d-3d alignment using a large dataset of cad models, Proceedings of the IEEE conference on computer vision and pattern recognition, vol.44, pp.3762-3769, 2014.
DOI : 10.1109/cvpr.2014.487

URL : https://hal.archives-ouvertes.fr/hal-01057240

J. T. Barron and J. Malik, Color constancy, intrinsic images, and shape estimation, European Conference on Computer Vision, p.44, 2012.

J. T. Barron and J. Malik, Shape, illumination, and reflectance from shading, IEEE transactions on pattern analysis and machine intelligence, vol.37, pp.1670-1687, 2015.

D. Batra, A. C. Gallagher, D. Parikh, and T. Chen, Beyond trees: MRF inference via outer-planar decomposition, Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, p.78, 2010.
DOI : 10.1109/cvpr.2010.5539951

URL : http://www.ece.cmu.edu/%7Edbatra/publications/assets/opd_cvpr10.pdf

H. Bay, T. Tuytelaars, and L. Van-gool, Surf: Speeded up robust features, Computer vision-ECCV 2006, p.111, 2006.
DOI : 10.1007/11744023_32

V. Belagiannis, S. Amin, M. Andriluka, B. Schiele, N. Navab, and S. Ilic, 3D pictorial structures for multiple human pose estimation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, p.105, 2014.

R. Bellman, On the theory of dynamic programming, Proceedings of the National Academy of Sciences, vol.38, issue.8, p.27, 1952.

R. E. Bellman and S. E. Dreyfus, Applied dynamic programming, p.27, 2015.

V. Blanz and T. Vetter, A morphable model for the synthesis of 3D faces, Proceedings of the 26th annual conference on Computer graphics and interactive techniques, vol.45, pp.187-194, 1999.

M. B. Blaschko, Slack and Margin Rescaling as Convex Extensions of Supermodular Functions, p.36, 2016.

F. Bogo, A. Kanazawa, C. Lassner, P. Gehler, J. Romero et al., Keep it SMPL: Automatic estimation of 3D human pose and shape from a single image, European Conference on Computer Vision, pp.561-578, 2016.
DOI : 10.1007/978-3-319-46454-1_34

URL : http://arxiv.org/pdf/1607.08128

J. Booth, E. Antonakos, S. Ploumpis, G. Trigeorgis, Y. Panagakis, S. Zafeiriou et al., 3D Face Morphable Models "In-the-Wild", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, p.44, 2017.

A. Bosch, A. Zisserman, and X. Munoz, Representing shape with a spatial pyramid kernel, Proceedings of the 6th ACM international conference on Image and video retrieval, p.76, 2007.
DOI : 10.1145/1282280.1282340

L. Bourdev and J. Malik, Poselets: Body part detectors trained using 3d human pose annotations, Computer Vision, 2009 IEEE 12th International Conference on, p.71, 2009.
DOI : 10.1109/iccv.2009.5459303

L. Bourdev, S. Maji, T. Brox, and J. Malik, Detecting people using mutually consistent poselet activations, European conference on computer vision, p.18, 2010.
DOI : 10.1007/978-3-642-15567-3_13

URL : http://www.cs.berkeley.edu/%7Esmaji/papers/bmbm-poselets-eccv10.pdf

H. Boussaid and I. Kokkinos, Fast and exact: ADMM-based discriminative shape segmentation with loopy part models, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, p.78, 2014.
DOI : 10.1109/cvpr.2014.517

URL : https://hal.archives-ouvertes.fr/hal-01109287

S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends® in Machine Learning, vol.3, issue.1, pp.1-122, 2011.
DOI : 10.1561/2200000016

URL : http://www.stanford.edu/~boyd/papers/pdf/admm_distr_stats.pdf

C. Bregler, A. Hertzmann, and H. Biermann, Recovering non-rigid 3D shape from image streams, Computer Vision and Pattern Recognition, vol.2, p.43, 2000.
DOI : 10.1109/cvpr.2000.854941

URL : http://www.mrl.nyu.edu/publications/recovering-nonrigid/bhb-cvpr00.pdf

M. C. Burl, M. Weber, and P. Perona, A probabilistic approach to object recognition using local photometry and global geometry, European conference on computer vision, p.19, 1998.

Z. Cao, T. Simon, S. Wei, and Y. Sheikh, Realtime multi-person 2d pose estimation using part affinity fields, 2016.
DOI : 10.1109/cvpr.2017.143

URL : http://arxiv.org/pdf/1611.08050

A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. Huang et al., ShapeNet: An information-rich 3d model repository, p.43, 2015.

X. Chen and A. L. Yuille, Articulated pose estimation by a graphical model with image dependent pairwise relations, Advances in Neural Information Processing Systems, pp.1736-1744, 2014.

C. Chen and D. Ramanan, 3D Human Pose Estimation = 2D Pose Estimation + Matching, p.71, 2016.
DOI : 10.1109/cvpr.2017.610

URL : http://arxiv.org/pdf/1612.06524

L. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs, vol.55, 2016.
DOI : 10.1109/tpami.2017.2699184

URL : http://arxiv.org/pdf/1606.00915

A. Cherian, J. Mairal, K. Alahari, and C. Schmid, Mixing body-part sequences for human pose estimation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, p.104, 2014.
DOI : 10.1109/cvpr.2014.302

URL : https://hal.archives-ouvertes.fr/hal-00978643

, Accessed: 21/10/2017 under GNU Free Documentation License, Version 1.2 or later, vol.63, 2007.

C. Cortes and V. Vapnik, Support-vector networks, Machine learning, vol.20, p.36, 1995.

D. Crandall, P. Felzenszwalb, and D. Huttenlocher, Spatial priors for part-based recognition using statistical models, Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol.1, p.19, 2005.

B. Curless and M. Levoy, A volumetric method for building complex models from range images, Proceedings of the 23rd annual conference on Computer graphics and interactive techniques, p.104, 1996.

N. Dalal and B. Triggs, Histograms of oriented gradients for human detection, Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol.1, pp.886-893, 2005.

M. Dantone, J. Gall, C. Leistner, and L. Van-gool, Human pose estimation using body parts dependent joint regressors, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, p.71, 2013.

S. K. Divvala, A. A. Efros, and M. Hebert, How important are "deformable parts" in the deformable parts model?, European Conference on Computer Vision, p.13, 2012.

P. Dollár, Z. Tu, P. Perona, and S. Belongie, Integral channel features, p.15, 2009.

C. Dubout and F. Fleuret, Exact acceleration of linear object detectors, Computer Vision-ECCV 2012, vol.112, p.107, 2012.

M. Eichner, M. Marin-jimenez, A. Zisserman, and V. Ferrari, 2d articulated human pose estimation and retrieval in (almost) unconstrained still images, International journal of computer vision, vol.99, issue.2, p.77, 2012.

M. Everingham, L. Van-gool, C. K. Williams, J. Winn, and A. Zisserman, The PASCAL Visual Object Classes Challenge, p.26, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00548597

M. Everingham, L. Van-gool, C. K. Williams, J. Winn, and A. Zisserman, The PASCAL Visual Object Classes Challenge, p.113, 2012.
URL : https://hal.archives-ouvertes.fr/inria-00548597

M. Everingham, L. Van-gool, C. K. I. Williams, J. Winn, and A. Zisserman, The pascal visual object classes (voc) challenge, International journal of computer vision, vol.88, issue.2, pp.303-338, 2010.

Z.-G. Fan and B.-L. Lu, Fast recognition of multi-view faces with feature selection, Computer vision, 2005. ICCV 2005. Tenth IEEE international conference on, vol.1, pp.76-81, 2005.

P. Felzenszwalb and D. Huttenlocher, Distance transforms of sampled functions, 2004.

P. F. Felzenszwalb and D. P. Huttenlocher, Pictorial structures for object recognition, International journal of computer vision, vol.61, issue.1, pp.55-79, 2005.

P. Felzenszwalb, D. Mcallester, and D. Ramanan, A discriminatively trained, multiscale, deformable part model, Computer Vision and Pattern Recognition, vol.46, pp.1-8, 2008.

P. F. Felzenszwalb, R. B. Girshick, and D. Mcallester, Cascade object detection with deformable part models, Computer vision and pattern recognition (CVPR), 2010 IEEE conference on, pp.2241-2248, 2010.

P. F. Felzenszwalb, R. B. Girshick, D. Mcallester, and D. Ramanan, Object detection with discriminatively trained part-based models, IEEE transactions on pattern analysis and machine intelligence, vol.32, pp.1627-1645, 2010.

R. Fergus, P. Perona, and A. Zisserman, Object class recognition by unsupervised scale-invariant learning, Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on, vol.2, p.19, 2003.

R. Fergus, P. Perona, and A. Zisserman, A sparse object category model for efficient learning and exhaustive recognition, Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol.1, p.19, 2005.

V. Ferrari, M. Marin-jimenez, and A. Zisserman, Progressive search space reduction for human pose estimation, Computer Vision and Pattern Recognition, vol.77, pp.1-8, 2008.

S. Fidler, S. Dickinson, and R. Urtasun, 3d object detection and viewpoint estimation with a deformable 3d cuboid model, Advances in neural information processing systems, pp.611-619, 2012.

M. A. Fischler and R. A. Elschlager, The representation and matching of pictorial structures, IEEE Transactions on computers, vol.100, issue.1, p.19, 1973.

T. Friess, N. Cristianini, and C. Campbell, The kernel-adatron algorithm: a fast and simple learning procedure for support vector machines, Machine Learning: Proceedings of the Fifteenth International Conference (ICML'98), p.36, 1998.

R. B. Girshick, P. F. Felzenszwalb, and D. Mcallester, Discriminatively Trained Deformable Part Models

R. Girshick, J. Donahue, T. Darrell, and J. Malik, Rich feature hierarchies for accurate object detection and semantic segmentation, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.580-587, 2014.
DOI : 10.1109/cvpr.2014.81

URL : http://arxiv.org/pdf/1311.2524

R. Girshick, Fast r-cnn, Proceedings of the IEEE international conference on computer vision, vol.72, pp.1440-1448, 2015.
DOI : 10.1109/iccv.2015.169

R. Girshick, F. Iandola, T. Darrell, and J. Malik, Deformable part models are convolutional neural networks, Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp.437-446, 2015.
DOI : 10.1109/cvpr.2015.7298641

URL : http://arxiv.org/pdf/1409.5403

I. Goodfellow, H. Lee, Q. V. Le, A. Saxe, and A. Y. Ng, Measuring invariances in deep networks, Advances in neural information processing systems, vol.54, pp.646-654, 2009.

R. A. Güler, G. Trigeorgis, E. Antonakos, P. Snape, S. Zafeiriou, and I. Kokkinos, Densereg: Fully convolutional dense shape regression in-the-wild, vol.72, 2016.

J. M. Hammersley and P. Clifford, Markov fields on finite graphs and lattices, vol.20, 1971.

C. Harris and M. Stephens, A combined corner and edge detector, Alvey vision conference, vol.15, p.14, 1988.
DOI : 10.5244/c.2.23

URL : http://www.bmva.org/bmvc/1988/avc-88-023.pdf

R. Hartley and A. Zisserman, Multiple view geometry in computer vision, vol.41, 2003.

B. S. He, H. Yang, and S. L. Wang, Alternating direction method with self-adaptive penalty parameters for monotone variational inequalities, Journal of Optimization Theory and applications, vol.106, issue.2, pp.337-356, 2000.

K. He, X. Zhang, S. Ren, and J. Sun, Spatial pyramid pooling in deep convolutional networks for visual recognition, European Conference on Computer Vision, p.15, 2014.
DOI : 10.1109/tpami.2015.2389824

URL : http://arxiv.org/pdf/1406.4729

K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, Proceedings of the IEEE conference on computer vision and pattern recognition, p.75, 2016.
DOI : 10.1109/cvpr.2016.90

URL : http://arxiv.org/pdf/1512.03385

M. Hejrati and D. Ramanan, Analysis by synthesis: 3d object recognition by object reconstruction, Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, p.44, 2014.
DOI : 10.1109/cvpr.2014.314

S. Hinterstoisser, S. Holzer, C. Cagniart, S. Ilic, K. Konolige et al., Multimodal Templates for Real-Time Detection of Texture-less Objects in Heavily Cluttered Scenes, p.111, 2011.
DOI : 10.1109/iccv.2011.6126326

URL : http://cvlab.epfl.ch/%7Evlepetit/papers/hinterstoisser_iccv11.pdf

A. Hornung, K. M. Wurm, M. Bennewitz, C. Stachniss, and W. Burgard, OctoMap: An efficient probabilistic 3D mapping framework based on octrees, Autonomous Robots, vol.34, issue.3, p.41, 2013.
DOI : 10.1007/s10514-012-9321-0

URL : http://www.informatik.uni-freiburg.de/~stachnis/pdf/hornung13auro.pdf

D. P. Huttenlocher, G. A. Klanderman, and W. J. Rucklidge, Comparing images using the Hausdorff distance, IEEE Transactions on pattern analysis and machine intelligence, vol.15, issue.9, p.62, 1993.

E. Insafutdinov, L. Pishchulin, B. Andres, M. Andriluka, and B. Schiele, Deepercut: A deeper, stronger, and faster multi-person pose estimation model, European Conference on Computer Vision, vol.105, pp.34-50, 2016.
DOI : 10.1007/978-3-319-46466-4_3

URL : http://arxiv.org/pdf/1605.03170

S. Ioffe and D. A. Forsyth, Probabilistic methods for finding people, International Journal of Computer Vision, vol.43, issue.1, p.19, 2001.

C. Ionescu, F. Li, and C. Sminchisescu, Latent structured models for human pose estimation, Computer Vision (ICCV), 2011 IEEE International Conference on, p.71, 2011.

C. Ionescu, D. Papava, V. Olaru, and C. Sminchisescu, Human3.6m: Large scale datasets and predictive methods for 3d human sensing in natural environments, IEEE transactions on pattern analysis and machine intelligence, vol.36, pp.1325-1339, 2014.

H. Jiang, 3D human pose reconstruction using millions of exemplars, Pattern Recognition (ICPR), 2010 20th International Conference on, p.71, 2010.

T. Joachims, T. Finley, and C. Yu, Cutting-plane training of structural SVMs, Machine Learning, vol.77, p.37, 2009.

S. Johnson and M. Everingham, Clustered Pose and Nonlinear Appearance Models for Human Pose Estimation, Proceedings of the British Machine Vision Conference, 2010.

A. Kar, S. Tulsiani, J. Carreira, and J. Malik, Category-specific object reconstruction from a single image, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol.45, p.43, 2015.

L. Ke, M. Chang, H. Qi, and S. Lyu, MultiScale Structure-Aware Network for Human Pose Estimation, 2018.

S. Kinauer, M. Berman, and I. Kokkinos, Monocular Surface Reconstruction using 3D Deformable Part Models, Computer Vision-ECCV 2016 Workshops, pp.296-308, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01416479

S. Kinauer, R. A. Güler, S. Chandra, and I. Kokkinos, Structured output prediction and learning for deep monocular 3d human pose estimation, International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, pp.34-48, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01672592

S. Knoop, S. Vacek, and R. Dillmann, Sensor fusion for 3D human body tracking with an articulated 3D body model, Robotics and Automation, 2006. ICRA 2006. Proceedings 2006 IEEE International Conference on, p.104, 2006.

I. Kokkinos, Rapid deformable object detection using dual-tree branch-and-bound, Advances in Neural Information Processing Systems, pp.2681-2689, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00857520

I. Kokkinos, Ubernet: Training a universal convolutional neural network for low-, mid-, and high-level vision using diverse datasets and limited memory, vol.2, p.22, 2016.

D. Koller and N. Friedman, Probabilistic graphical models: principles and techniques, p.19, 2009.

P. Kontaxakis, K. Gulzar, S. Kinauer, I. Kokkinos, and V. Kyrki, Robot-Robot Gesturing for Anchoring Representations, IEEE Transactions on Robotics, 2018.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, Imagenet classification with deep convolutional neural networks, Advances in neural information processing systems, p.24, 2012.

A. Kushal, C. Schmid, and J. Ponce, Flexible object models for category-level 3d object recognition, Computer Vision and Pattern Recognition, 2007. CVPR'07. IEEE Conference on, vol.46, pp.1-8, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00548682

L. Ladicky, P. H. S. Torr, and A. Zisserman, Human pose estimation using a joint pixel-wise and part-wise formulation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3578-3585, 2013.

C. Lassner, J. Romero, M. Kiefel, F. Bogo, M. J. Black et al., Unite the people: Closing the loop between 3d and 2d human representations, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2017.

Q. V. Le, Building high-level features using large scale unsupervised learning, Acoustics, Speech and Signal Processing, vol.54, pp.8595-8598, 2013.

Y. Lecun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard et al., Backpropagation applied to handwritten zip code recognition, Neural computation, vol.1, issue.4, p.23, 1989.

Y. Lecun and Y. Bengio, Convolutional networks for images, speech, and time series, The handbook of brain theory and neural networks, vol.3361, p.23, 1995.

Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, vol.86, issue.11, pp.2278-2324, 1998.
DOI : 10.1109/5.726791

URL : http://www.cs.berkeley.edu/~daf/appsem/Handwriting/papers/00726791.pdf

V. Lepetit, J. Pilet, and P. Fua, Point matching as a classification problem for fast and robust object pose estimation, Proceedings of the 2004 IEEE Computer Society Conference on, vol.2, p.104, 2004.
DOI : 10.1109/cvpr.2004.1315170

URL : http://cvlab.epfl.ch/~lepetit/papers/lepetit-cvpr04.pdf

S. Li and A. Chan, 3d human pose estimation from monocular images with deep convolutional neural network, Asian Conference on Computer Vision, vol.72, pp.332-347, 2014.
DOI : 10.1007/978-3-319-16808-1_23

S. Li, W. Zhang, and A. Chan, Maximum-margin structured learning with deep networks for 3d human pose estimation, Proceedings of the IEEE International Conference on Computer Vision, pp.2848-2856, 2015.
DOI : 10.1109/iccv.2015.326

URL : http://arxiv.org/pdf/1508.06708

J. Liebelt and C. Schmid, Multi-view object class detection with a 3d geometric model, Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, p.43, 2010.
DOI : 10.1109/cvpr.2010.5539836

URL : https://hal.archives-ouvertes.fr/inria-00548634

J. J. Lim, A. Khosla, and A. Torralba, Fpm: Fine pose parts-based model with 3d cad models, European Conference on Computer Vision, vol.44, pp.478-493, 2014.

T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona et al., Microsoft coco: Common objects in context, European conference on computer vision, p.43, 2014.
DOI : 10.1007/978-3-319-10602-1_48

URL : http://arxiv.org/pdf/1405.0312.pdf

W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed et al., Ssd: Single shot multibox detector, European conference on computer vision, p.18, 2016.
DOI : 10.1007/978-3-319-46448-0_2

URL : http://arxiv.org/pdf/1512.02325

J. Long, E. Shelhamer, and T. Darrell, Fully convolutional networks for semantic segmentation, Proceedings of the IEEE conference on computer vision and pattern recognition, p.22, 2015.
DOI : 10.1109/cvpr.2015.7298965

URL : http://arxiv.org/pdf/1411.4038

D. G. Lowe, Distinctive image features from scale-invariant keypoints, International journal of computer vision, vol.60, issue.2, p.14, 2004.

I. Lysenkov, V. Eruhimov, and G. Bradski, Recognition and pose estimation of rigid transparent objects with a kinect sensor, Robotics, p.111, 2013.
DOI : 10.15607/rss.2012.viii.035

URL : https://doi.org/10.15607/rss.2012.viii.035

I. Lysenkov and V. Rabaud, Pose estimation of rigid transparent objects in transparent clutter, Robotics and Automation (ICRA), 2013 IEEE International Conference on, p.111, 2013.
DOI : 10.1109/icra.2013.6630571

T. Malisiewicz, A. Gupta, and A. A. Efros, Ensemble of exemplar-svms for object detection and beyond, Computer Vision (ICCV), 2011 IEEE International Conference on, p.44, 2011.
DOI : 10.1109/iccv.2011.6126229

URL : http://www.cs.cmu.edu/%7Etmalisie/projects/iccv11/exemplarsvm-iccv11.pdf

O. L. Mangasarian and D. R. Musicant, Successive overrelaxation for support vector machines, IEEE Transactions on Neural Networks, vol.10, issue.5, p.36, 1999.

F. Massa, B. C. Russell, and M. Aubry, Deep exemplar 2d-3d detection by adapting from real to rendered views, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol.44, pp.6024-6033, 2016.
DOI : 10.1109/cvpr.2016.648

URL : https://hal.archives-ouvertes.fr/hal-01800639

S. J. Mckenna, Y. Raja, and S. Gong, Tracking colour objects using adaptive mixture models, Image and vision computing, vol.17, issue.3-4, p.16, 1999.

D. Mehta, H. Rhodin, D. Casas, P. Fua, O. Sotnychenko et al., Monocular 3d human pose estimation in the wild using improved cnn supervision, 3D Vision (3DV), 2017 Fifth International Conference on, p.71, 2017.
DOI : 10.1109/3dv.2017.00064

URL : http://arxiv.org/pdf/1611.09813

J. Michels, A. Saxena, and A. Ng, High speed obstacle avoidance using monocular vision and reinforcement learning, Proceedings of the 22nd international conference on Machine learning, p.43, 2005.
DOI : 10.1145/1102351.1102426

URL : http://ai.stanford.edu/~asaxena/rccar/ICML_ObstacleAvoidance.ps

B. Moghaddam and A. Pentland, Probabilistic visual learning for object detection, Computer Vision, 1995. Proceedings., Fifth International Conference on, p.16, 1995.

M. Muja and D. Lowe, Flann-fast library for approximate nearest neighbors user manual, p.111, 2009.

K. P. Murphy, Y. Weiss, and M. I. Jordan, Loopy belief propagation for approximate inference: An empirical study, Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence, p.78, 1999.

R. A. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim et al., KinectFusion: Real-time dense surface mapping and tracking, Mixed and augmented reality (ISMAR), 2011 10th IEEE international symposium on, vol.104, pp.127-136, 2011.

A. Newell, K. Yang, and J. Deng, Stacked hourglass networks for human pose estimation, European Conference on Computer Vision, vol.70, pp.483-499, 2016.

, Open Source Computer Vision (OpenCV), p.114, 2014.

W. Ouyang, X. Wang, X. Zeng, S. Qiu, P. Luo et al., Deepid-net: Deformable deep convolutional neural networks for object detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, p.14, 2015.

G. Pavlakos, X. Zhou, K. G. Derpanis, and K. Daniilidis, Coarse-to-fine volumetric prediction for single-image 3D human pose, 2016.

J. Pearl, Probabilistic reasoning in intelligent systems: networks of plausible inference, vol.77, 2014.

B. Pepik, P. Gehler, M. Stark, and B. Schiele, 3d2pm-3d deformable part models, European Conference on Computer Vision, vol.46, pp.356-370, 2012.

B. Pepik, M. Stark, P. Gehler, and B. Schiele, Teaching 3d geometry to deformable part models, Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, p.45, 2012.

B. Pepik, R. Benenson, T. Ritschel, and B. Schiele, What is holding back convnets for detection, German Conference on Pattern Recognition, p.43, 2015.

B. Pepik, M. Stark, P. Gehler, T. Ritschel, and B. Schiele, 3d object class detection in the wild, Computer Vision and Pattern Recognition Workshops (CVPRW), 2015 IEEE Conference on, p.45, 2015.

B. Pepik, M. Stark, P. Gehler, and B. Schiele, Multiview and 3d deformable part models, IEEE transactions on pattern analysis and machine intelligence, vol.37, pp.2232-2245, 2015.

G. Peyré and L. D. Cohen, Geodesic remeshing using front propagation, International Journal of Computer Vision, vol.69, issue.1, p.48, 2006.

J. Pilet, V. Lepetit, and P. Fua, Real-time nonrigid surface detection, Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol.1, p.104, 2005.

L. Pishchulin, A. Jain, M. Andriluka, T. Thormählen, and B. Schiele, Articulated people detection and pose estimation: Reshaping the future, Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pp.3178-3185, 2012.

L. Pishchulin, E. Insafutdinov, S. Tang, B. Andres, M. Andriluka et al., Deepcut: Joint subset partition and labeling for multi person pose estimation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol.105, pp.4929-4937, 2016.

G. Pons-moll, A. Baak, J. Gall, L. Leal-Taixé, M. Mueller et al., Outdoor human motion capture using inverse kinematics and von mises-fisher sampling, Computer Vision (ICCV), 2011 IEEE International Conference on, p.71, 2011.

J. Prankl, A. Aldoma, A. Svejda, and M. Vincze, RGB-D object modelling for object recognition and tracking, Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on, p.111, 2015.

M. Prasad, A. Fitzgibbon, A. Zisserman, and L. Van-gool, Finding nemo: Deformable object class modelling using curve matching, Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, vol.45, pp.1720-1727, 2010.
DOI : 10.1109/cvpr.2010.5539840

V. Ramakrishna, T. Kanade, and Y. Sheikh, Reconstructing 3d human pose from 2d image landmarks, Computer Vision-ECCV 2012, p.71, 2012.
DOI : 10.1007/978-3-642-33765-9_41

URL : http://www.cs.cmu.edu/%7Evramakri/cameraAndPoseCameraReady.pdf

D. Ramanan, Learning to parse images of articulated bodies, Advances in neural information processing systems, vol.77, pp.1129-1136, 2007.

R. Ranjan, V. M. Patel, and R. Chellappa, A deep pyramid deformable part model for face detection, Biometrics Theory, Applications and Systems (BTAS), p.14, 2015.
DOI : 10.1109/btas.2015.7358755

URL : http://arxiv.org/pdf/1508.04389

S. Ren, K. He, R. Girshick, and J. Sun, Faster R-CNN: Towards real-time object detection with region proposal networks, Advances in neural information processing systems, vol.24, pp.91-99, 2015.

S. Ren, K. He, R. Girshick, X. Zhang, and J. Sun, Object detection networks on convolutional feature maps, IEEE transactions on pattern analysis and machine intelligence, vol.39, p.15, 2017.

R. T. Rockafellar, Monotone operators and the proximal point algorithm, SIAM journal on control and optimization, vol.14, issue.5, pp.877-898, 1976.

G. Rogez and C. Schmid, MoCap-guided data augmentation for 3D pose estimation in the wild, Advances in Neural Information Processing Systems, p.91, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01389486

R. Ronfard, C. Schmid, and B. Triggs, Learning to parse pictures of people, European Conference on Computer Vision, p.77, 2002.
DOI : 10.1007/3-540-47979-1_47

URL : https://hal.archives-ouvertes.fr/inria-00545109

, Accessed: 25/10/2017 under BSD license, vol.110, 2012.

R. , Accessed: 24/10/2017 under BSD license, vol.111, 2012.

O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh et al., ImageNet Large Scale Visual Recognition Challenge, International Journal of Computer Vision (IJCV), vol.115, issue.3, pp.211-252, 2015.

P. Savalle, S. Tsogkas, G. Papandreou, and I. Kokkinos, Deformable part models with cnn features, European Conference on Computer Vision, Parts and Attributes Workshop, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01109290

H. Schneiderman and T. Kanade, A statistical method for 3D object detection applied to faces and cars, Computer Vision and Pattern Recognition, vol.1, p.16, 2000.

S. M. Seitz, B. Curless, J. Diebel, D. Scharstein, and R. Szeliski, A comparison and evaluation of multi-view stereo reconstruction algorithms, Computer vision and pattern recognition, vol.1, p.41, 2006.

P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus et al., Overfeat: Integrated recognition, localization and detection using convolutional networks, p.22, 2013.

S. Shalev-shwartz, Y. Singer, N. Srebro, and A. Cotter, Pegasos: Primal estimated sub-gradient solver for svm, Mathematical programming, vol.127, p.36, 2011.

Y. Sheikh and M. Shah, Bayesian modeling of dynamic scenes for object detection, IEEE transactions on pattern analysis and machine intelligence, vol.27, p.16, 2005.

S. E. Shimony, Finding MAPs for belief networks is NP-hard, Artificial Intelligence, vol.68, issue.2, p.78, 1994.

M. Shneier, Extracting linear features from images using pyramids, Technical report, Maryland Univ College Park Computer Vision Lab, p.15, 1980.

L. Sigal and M. J. Black, Measure locally, reason globally: Occlusion-sensitive articulated pose estimation, Computer Vision and Pattern Recognition, vol.2, p.77, 2006.
DOI : 10.1109/cvpr.2006.180

URL : http://www.cs.brown.edu/people/black/Papers/1087_Sigal_L.pdf

L. Sigal, A. O. Balan, and M. J. Black, Humaneva: Synchronized video and motion capture dataset and baseline algorithm for evaluation of articulated human motion, International journal of computer vision, vol.87, issue.1, p.71, 2010.
DOI : 10.1007/s11263-009-0273-6

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, vol.24, 2014.

J. Starck and A. Hilton, Model-based multiple view reconstruction of people, p.104, 2003.
DOI : 10.1109/iccv.2003.1238446

URL : http://www.ee.surrey.ac.uk/CVSSP/VMRG/Publications/hilton043dpvt.pdf

I. Steinwart and A. Christmann, Support vector machines, p.16, 2008.

G. Stockman, Object recognition and localization via pose clustering, Computer vision, graphics, and image processing, vol.40, pp.361-387, 1987.
DOI : 10.1016/s0734-189x(87)80147-0

J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers, A benchmark for the evaluation of RGB-D SLAM systems, Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on, p.41, 2012.

X. Sun, J. Shang, S. Liang, and Y. Wei, Compositional human pose regression, 2017.

M. J. Swain and D. H. Ballard, Color indexing, International journal of computer vision, vol.7, issue.1, p.14, 1991.

R. Szeliski, Image alignment and stitching: A tutorial, Foundations and Trends® in Computer Graphics and Vision, vol.2, p.14, 2006.

T.-P. Tian and S. Sclaroff, Fast globally optimal 2d human detection with loopy graph models, Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, p.77, 2010.

E. Tola, V. Lepetit, and P. Fua, Daisy: An efficient dense descriptor applied to wide-baseline stereo, IEEE transactions on pattern analysis and machine intelligence, vol.32, p.14, 2010.

D. Tome, C. Russell, and L. Agapito, Lifting from the Deep: Convolutional 3D Pose Estimation from a Single Image, vol.91, 2017.

J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler, Joint training of a convolutional network and a graphical model for human pose estimation, Advances in neural information processing systems, vol.72, pp.1799-1807, 2014.

J. Tompson, R. Goroshin, A. Jain, Y. Lecun, and C. Bregler, Efficient object localization using convolutional networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.648-656, 2015.

L. Torresani, A. Hertzmann, and C. Bregler, Nonrigid structure-from-motion: Estimating shape and motion with hierarchical priors, IEEE transactions on pattern analysis and machine intelligence, vol.30, p.43, 2008.

A. Toshev and C. Szegedy, Deeppose: Human pose estimation via deep neural networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol.70, pp.1653-1660, 2014.

G. Trigeorgis, P. Snape, M. A. Nicolaou, E. Antonakos, and S. Zafeiriou, Mnemonic descent method: A recurrent process applied for end-to-end face alignment, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, p.44, 2016.

I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun, Support vector machine learning for interdependent and structured output spaces, Proceedings of the twenty-first international conference on Machine learning, vol.37, p.104, 2004.

V. N. Vapnik, Statistical learning theory, vol.1, p.36, 1998.

P. Viola and M. Jones, Rapid object detection using a boosted cascade of simple features, Computer Vision and Pattern Recognition, vol.1, p.14, 2001.
DOI : 10.1109/cvpr.2001.990517

URL : http://www.cc.gatech.edu/ccg/./paper_of_week/viola01rapid.pdf

D. Vlasic, R. Adelsberger, G. Vannucci, J. Barnwell, M. Gross et al., Practical motion capture in everyday surroundings, ACM transactions on graphics (TOG), vol.26, p.71, 2007.
DOI : 10.1145/1276377.1276421

C. Vondrick, A. Khosla, T. Malisiewicz, and A. Torralba, Hoggles: Visualizing object detection features, Proceedings of the IEEE International Conference on Computer Vision, p.114, 2013.
DOI : 10.1109/iccv.2013.8

URL : http://people.csail.mit.edu/tomasz/papers/vondrick_iccv2013.pdf

Y. Wang and G. Mori, Multiple tree models for occlusion and spatial constraints in human pose estimation, European Conference on Computer Vision, p.77, 2008.
DOI : 10.1007/978-3-540-88690-7_53

URL : http://www.cs.sfu.ca/~mori/research/papers/wang_eccv08.pdf

M. Weber, M. Welling, and P. Perona, Towards automatic discovery of object categories, Computer Vision and Pattern Recognition, vol.2, p.16, 2000.
DOI : 10.1109/cvpr.2000.854754

URL : http://vision.ics.uci.edu/papers/WeberWP_CVPR_2000/WeberWP_CVPR_2000.pdf

S.-E. Wei, V. Ramakrishna, T. Kanade, and Y. Sheikh, Convolutional pose machines, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol.70, pp.4724-4732, 2016.

Y. Weiss and W. T. Freeman, On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs, IEEE Transactions on Information Theory, vol.47, issue.2, p.28, 2001.

T. Whelan, M. Kaess, M. Fallon, H. Johannsson, J. Leonard et al., Kintinuous: Spatially extended kinectfusion, p.41, 2012.

Y. Xiang, R. Mottaghi, and S. Savarese, Beyond pascal: A benchmark for 3d object detection in the wild, Applications of Computer Vision (WACV), vol.62, pp.75-82, 2014.
DOI : 10.1109/wacv.2014.6836101

P. Yan, S. M. Khan, and M. Shah, 3d model based object class detection in an arbitrary view, Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on, vol.46, pp.1-6, 2007.
DOI : 10.1109/iccv.2007.4409042

URL : http://www.cs.ucf.edu/~vision/papers/PingkunICCV07.pdf

Y. Yang and D. Ramanan, Articulated pose estimation with flexible mixtures-of-parts, Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, vol.77, pp.1385-1392, 2011.
DOI : 10.1109/cvpr.2011.5995741

Y. Yang and D. Ramanan, Articulated human detection with flexible mixtures of parts, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.12, pp.2878-2890, 2013.
DOI : 10.1109/tpami.2012.261

URL : http://www.ics.uci.edu/~dramanan/papers/pose_pami.pdf

W. Yang, W. Ouyang, H. Li, and X. Wang, End-to-end learning of deformable mixture of parts and deep convolutional neural networks for human pose estimation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol.77, pp.3073-3082, 2016.

H. Yasin, U. Iqbal, B. Kruger, A. Weber, and J. Gall, A dual-source approach for 3D pose estimation from a single image, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol.91, pp.4948-4956, 2016.
DOI : 10.1109/cvpr.2016.535

URL : http://arxiv.org/pdf/1509.06720

J. Yu, D. Farin, C. Krüger, and B. Schiele, Improving person detection using synthetic training data, Image Processing (ICIP), 2010 17th IEEE International Conference on, p.43, 2010.
DOI : 10.1109/icip.2010.5650143

A. L. Yuille, P. W. Hallinan, and D. S. Cohen, Feature extraction from faces using deformable templates, International journal of computer vision, vol.8, issue.2, p.19, 1992.

A. L. Yuille and A. Rangarajan, The concave-convex procedure, Neural computation, vol.15, issue.4, p.39, 2003.

M. D. Zeiler and R. Fergus, Visualizing and understanding convolutional networks, European conference on computer vision, pp.818-833, 2014.

T. Zhang, Solving large scale linear prediction problems using stochastic gradient descent algorithms, Proceedings of the twenty-first international conference on Machine learning, p.36, 2004.
DOI : 10.1145/1015330.1015332

URL : http://www.aicml.cs.ualberta.ca/banff04/icml/pages/papers/12.ps

X. Zhou and S. Leonardos, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, p.71, 2015.

X. Zhou, M. Zhu, S. Leonardos, K. G. Derpanis, and K. Daniilidis, Sparseness meets deepness: 3D human pose estimation from monocular video, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.4966-4975, 2016.
DOI : 10.1109/cvpr.2016.537

URL : http://arxiv.org/pdf/1511.09439

Q. Zhu, M. Yeh, K. Cheng, and S. Avidan, Fast human detection using a cascade of histograms of oriented gradients, Computer Vision and Pattern Recognition, vol.2, p.76, 2006.

M. Zhu, X. Zhou, and K. Daniilidis, Single image popup from discriminatively learned parts, Proceedings of the IEEE International Conference on Computer Vision, vol.46, pp.927-935, 2015.
DOI : 10.1109/iccv.2015.112

L. Zitnick and P. Dollár, Edge boxes: Locating object proposals from edges, European Conference on Computer Vision, p.18, 2014.
DOI : 10.1007/978-3-319-10602-1_26