The baseline for the Biwi data set is inspired by [44, 93], where the authors
Figure A.3: Example of images from the Fashion Landmark Dataset: landmarks detected as outliers by DeepGUM are shown in red, while inliers are shown in green. In all these images, the detected outliers correspond to occluded landmarks.
APPENDIX A. ARTICLES INCLUDED IN THIS MANUSCRIPT:
[92] Stéphane Lathuilière, Georgios Evangelidis, and Radu Horaud. Recognition of group activities in videos based on single- and two-person descriptors, IEEE Winter Conference on Applications of Computer Vision (WACV), 2017. ,
[94] Stéphane Lathuilière. Deep mixture of linear inverse regressions applied to head-pose estimation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01504847
[96] Stéphane Lathuilière. DeepGUM: Deep Robust Regression with Gaussian-Uniform Mixtures, submitted to IEEE European Conference on Computer Vision (ECCV), 2018. ,
[91] A Comprehensive Analysis of Deep Regression, 2018. ,
Depression severity estimation from multiple modalities, 2018. ,
Real-time head orientation from a monocular camera using deep neural network, 2014. ,
A chains model for localizing participants of group activities in videos, ICCV, 2011. ,
HIRF: Hierarchical random field for collective activity recognition in videos, ECCV, 2014. ,
Pictorial structures revisited: People detection and articulated pose estimation, 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp.1014-1021, 2009. ,
DOI : 10.1109/CVPR.2009.5206754
URL : http://www.gris.informatik.tu-darmstadt.de/~sroth/pubs/cvpr09andriluka.pdf
Georgios Evangelidis, and Radu Horaud. A distributed architecture for interacting with nao, ACM ICMI, 2015. ,
Tracking a varying number of people with a visually-controlled robotic head, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017. ,
DOI : 10.1109/IROS.2017.8206274
URL : https://hal.archives-ouvertes.fr/hal-01542987
Model-based gaussian and non-gaussian clustering, Biometrics, 1993. ,
Training deep neural-networks based on unreliable labels, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016. ,
DOI : 10.1109/ICASSP.2016.7472164
Robust Optimization for Deep Regression, 2015 IEEE International Conference on Computer Vision (ICCV), 2015. ,
DOI : 10.1109/ICCV.2015.324
URL : http://arxiv.org/pdf/1505.06606
Robust artificial neural networks and outlier detection, 2011. ,
Practical Recommendations for Gradient-Based Training of Deep Architectures, Neural networks: Tricks of the trade, pp.437-478, 2012. ,
DOI : 10.1162/089976602317318938
URL : http://arxiv.org/pdf/1206.5533.pdf
Learning long-term dependencies with gradient descent is difficult, IEEE Transactions on Neural Networks, vol.5, issue.2, 1994. ,
DOI : 10.1109/72.279181
URL : http://www.research.microsoft.com/~patrice/PDF/long_term.pdf
Towards a humanoid museum guide robot that interacts with multiple persons, 5th IEEE-RAS International Conference on Humanoid Robots, 2005., pp.418-423, 2005. ,
DOI : 10.1109/ICHR.2005.1573603
URL : http://www.informatik.uni-freiburg.de/~maren/papers/bennewitz_humanoids05.pdf
On the unification of line processes, outlier rejection, and robust statistics with applications in early vision, IJCV, 1996. ,
How Far are We from Solving the 2D & 3D Face Alignment Problem? (and a Dataset of 230,000 3D Facial Landmarks), 2017 IEEE International Conference on Computer Vision (ICCV), 2017. ,
DOI : 10.1109/ICCV.2017.116
URL : http://arxiv.org/pdf/1703.07332
Robust face landmark estimation under occlusion, ICCV, pp.1513-1520, 2013. ,
Towards information-based feedback control for binaural active localization, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016. ,
DOI : 10.1109/ICASSP.2016.7472894
Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. ,
DOI : 10.1109/CVPR.2017.143
URL : http://arxiv.org/pdf/1611.08050
A practical guide to CNNs and Fisher Vectors for image instance retrieval, Signal Processing, vol.128, pp.426-439, 2016. ,
DOI : 10.1016/j.sigpro.2016.05.021
URL : http://arxiv.org/pdf/1508.02496
SMOTE: Synthetic minority over-sampling technique, JAIR, 2002. ,
Cross-age reference coding for age-invariant face recognition and retrieval, ECCV, 2014. ,
A Unified Framework for Multi-target Tracking and Collective Activity Recognition, ECCV, 2012. ,
DOI : 10.1007/978-3-642-33765-9_16
URL : http://www.eecs.umich.edu/vision/papers/choi_eccv_12.pdf
Discovering Groups of People in Images, ECCV, 2014. ,
DOI : 10.1007/978-3-319-10593-2_28
URL : http://cvgl.stanford.edu/projects/groupdiscovery/eccv2014choi.pdf
Understanding collective activities of people from videos, IEEE TPAMI, 2013. ,
DOI : 10.1109/tpami.2013.220
What are they doing?: Collective activity classification using spatio-temporal relationship among people, ICCV Workshops, 2009. ,
Practical Nonparametric Statistics, 1998. ,
Robust Improper Maximum Likelihood: Tuning, Computation, and a Comparison With Other Methods for Robust Gaussian Clustering, Journal of the American Statistical Association, vol.8, issue.516, 2016. ,
DOI : 10.1007/3-540-28084-7_79
URL : http://www.tandfonline.com/doi/pdf/10.1080/01621459.2015.1100996?needAccess=true
Application of motion-based visual servoing to target tracking. IJRR, 2001. ,
Multimodal integration of dynamic audiovisual patterns for an interactive reinforcement learning scenario, IEEE/RSJ IROS, 2016. ,
DOI : 10.1109/iros.2016.7759137
Group behavior recognition with multiple cameras, Sixth IEEE Workshop on Applications of Computer Vision, 2002. (WACV 2002). Proceedings., 2002. ,
DOI : 10.1109/ACV.2002.1182178
URL : http://www-sop.inria.fr/orion/personnel/Francois.Bremond/Postscript/acv02.ps
Histograms of Oriented Gradients for Human Detection, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005. ,
DOI : 10.1109/CVPR.2005.177
URL : https://hal.archives-ouvertes.fr/inria-00548512
Real-time facial feature detection using conditional regression forests, 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp.2578-2585, 2012. ,
DOI : 10.1109/CVPR.2012.6247976
Co-Localization of Audio Sources in Images Using Binaural Features and Locally-Linear Regression, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.23, issue.4, 2015. ,
DOI : 10.1109/TASLP.2015.2405475
URL : https://hal.archives-ouvertes.fr/hal-01112834
Silèye Ba, and Radu Horaud. Hyper-Spectral Image Analysis with Partially-Latent Regression and Spatial Markov Dependencies, IEEE STSP, 2015. ,
High-dimensional regression with gaussian mixtures and partially-latent response variables, Statistics and Computing, vol.19, issue.11, 2015. ,
DOI : 10.1109/TNN.2008.2003467
URL : https://hal.archives-ouvertes.fr/hal-01107604
Hierarchical temporal graphical model for head pose estimation and subsequent attribute classification in real-world videos, Computer Vision and Image Understanding, vol.136, 2015. ,
DOI : 10.1016/j.cviu.2015.03.005
Deep Structured Models For Group Activity Recognition, Proceedings of the British Machine Vision Conference 2015, 2015. ,
DOI : 10.5244/C.29.179
URL : http://arxiv.org/pdf/1506.04191
A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms, Swarm and Evolutionary Computation, vol.1, issue.1, pp.3-18, 2011. ,
DOI : 10.1016/j.swevo.2011.02.002
Head pose estimation via probabilistic high-dimensional regression, 2015 IEEE International Conference on Image Processing (ICIP), 2015. ,
DOI : 10.1109/ICIP.2015.7351683
URL : https://hal.archives-ouvertes.fr/hal-01163663
Silèye Ba, and Georgios Evangelidis. Robust head-pose estimation based on partially-latent mixture of linear regressions, IEEE TIP, 2016. ,
Silèye Ba, and Georgios Evangelidis. Robust head-pose estimation based on partially-latent mixture of linear regressions, IEEE TIP, vol.26, issue.3, pp.1428-1440, 2017. ,
DOI : 10.1109/tip.2017.2654165
URL : http://arxiv.org/pdf/1603.09732
Antoine Deleforge, Silèye Ba, and Georgios Evangelidis. Robust head-pose estimation based on partially-latent mixture of linear regressions, 2017. ,
Adaptive subgradient methods for online learning and stochastic optimization, JMLR, vol.12, issue.7, pp.2121-2159, 2011. ,
Multiple Comparisons among Means, Journal of the American Statistical Association, vol.25, issue.293, pp.52-64, 1961. ,
DOI : 10.1214/aoms/1177728724
Why Does Unsupervised Pre-training Help Deep Learning?, pp.625-660, 2010. ,
Random Forests for Real Time 3D Face Analysis, International Journal of Computer Vision, vol.41, issue.5, 2013. ,
DOI : 10.1109/TSMCB.2011.2148711
URL : http://files.is.tue.mpg.de/jgall/download/jgall_RFdepthFace_ijcv12.pdf
Real time head pose estimation with random regression forests, CVPR 2011, pp.617-624, 2011. ,
DOI : 10.1109/CVPR.2011.5995458
Statistical methods for research workers, 1925. ,
A new family of multivariate heavy-tailed distributions with variable marginal amounts of tailweight: application to robust clustering, Statistics and Computing, vol.94, issue.1, 2014. ,
DOI : 10.1016/S0378-3758(00)00208-1
Bostjan Likar, and Ziga Spiclin. Robust estimation of unbalanced mixture models on samples with outliers. TPAMI, 2015. ,
Deep self-taught learning for facial beauty prediction, Neurocomputing, vol.144, p.129, 2014. ,
DOI : 10.1016/j.neucom.2014.05.028
Reinforcement learning for visual servoing of a mobile robot, Australian Conference on Robotics and Automation, 2000. ,
Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.40, issue.5, 2017. ,
DOI : 10.1109/TPAMI.2017.2648793
URL : https://hal.archives-ouvertes.fr/hal-01413403
Bayesian Data Analysis. Chapman & Hall/CRC Texts in Statistical Science, 2003. ,
A sensorimotor reinforcement learning framework for physical Human-Robot Interaction, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016. ,
DOI : 10.1109/IROS.2016.7759417
URL : http://arxiv.org/pdf/1607.07939
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014. ,
DOI : 10.1109/CVPR.2014.81
URL : http://arxiv.org/pdf/1311.2524
Deep learning, 2016. ,
Deep learning. Book in preparation for, 2016. ,
LSTM: A Search Space Odyssey, IEEE Transactions on Neural Networks and Learning Systems, vol.28, issue.10, pp.2222-2232, 2017. ,
DOI : 10.1109/TNNLS.2016.2582924
URL : http://arxiv.org/pdf/1503.04069
Image-based human age estimation by manifold learning and locally adjusted robust regression, IEEE TIP, vol.17, issue.7, pp.1178-1188, 2008. ,
Observing human-object interactions: Using spatial and functional compatibility for recognition, IEEE TPAMI, 2009. ,
DOI : 10.1109/tpami.2009.83
Visual recognition by counting instances: A multi-instance cardinality potential kernel, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. ,
DOI : 10.1109/CVPR.2015.7298875
URL : http://arxiv.org/pdf/1502.02063
Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.770-778, 2016. ,
DOI : 10.1109/CVPR.2016.90
URL : http://arxiv.org/pdf/1512.03385
Reducing the Dimensionality of Data with Neural Networks, Science, vol.313, issue.5786, pp.504-507, 2006. ,
DOI : 10.1126/science.1127647
A sharper Bonferroni procedure for multiple tests of significance, Biometrika, vol.75, issue.4, pp.800-802, 1988. ,
DOI : 10.1093/biomet/75.4.800
Long Short-Term Memory, Neural Computation, vol.4, issue.8, 1997. ,
DOI : 10.1016/0893-6080(88)90007-X
Putting objects in perspective, 2008. ,
DOI : 10.1109/cvpr.2006.232
URL : http://www.cs.cmu.edu/~dhoiem/publications/ijcv2008ObjectsInPerspective.pdf
An Improved Sequentially Rejective Bonferroni Test Procedure, Biometrics, vol.43, issue.2, pp.417-423, 1987. ,
DOI : 10.2307/2531823
URL : http://sci2s.ugr.es/keel/pdf/algorithm/articulo/1987-Holland-BIO.pdf
A simple sequentially rejective multiple test procedure, Scandinavian Journal of Statistics, vol.6, issue.2, pp.65-70, 1979. ,
Robust Statistics, 2004. ,
DOI : 10.1002/0471725250
A Hierarchical Deep Temporal Model for Group Activity Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. ,
DOI : 10.1109/CVPR.2016.217
URL : http://arxiv.org/pdf/1511.06040
Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint, 2015. ,
On architectural choices in deep learning: From network structure to gradient convergence and parameter estimation. arXiv preprint, 2017. ,
Reading Text in the Wild with Convolutional Neural Networks, International Journal of Computer Vision, vol.20, issue.9, 2016. ,
DOI : 10.1109/TIP.2011.2126586
URL : http://arxiv.org/pdf/1412.1842
What do 15,000 object categories tell us about classifying and localizing actions?, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. ,
DOI : 10.1109/CVPR.2015.7298599
URL : https://pure.uva.nl/ws/files/2493740/167605_JainCVPR2015.pdf
Clustered Pose and Nonlinear Appearance Models for Human Pose Estimation, Proceedings of the British Machine Vision Conference 2010, 2010. ,
DOI : 10.5244/C.24.12
URL : http://www.bmva.org/bmvc/2010/conference/paper12/paper12.pdf
Combining Per-frame and Per-track Cues for Multi-person Action Recognition, ECCV, 2012. ,
DOI : 10.1007/978-3-642-33718-5_9
URL : http://www.umiacs.umd.edu/%7Esameh/khamis-eccv2012.pdf
Abundant Inverse Regression Using Sufficient Reduction and Its Applications, ECCV, 2016. ,
Adam: A method for stochastic optimization, ICLR, 2014. ,
Simultaneous Visual Recognition of Manipulation Actions and Manipulated Objects, ECCV, 2008. ,
DOI : 10.1109/CVPR.2007.383299
Reinforcement learning in robotics: A survey, p.131, 2013. ,
DOI : 10.1007/978-3-319-03194-1_2
URL : http://www.ri.cmu.edu/pub_files/2013/7/Kober_IJRR_2013.pdf
ImageNet classification with deep convolutional neural networks, NIPS, 2012. ,
DOI : 10.1162/neco.2009.10-08-881
URL : http://dl.acm.org/ft_gateway.cfm?id=3065386&type=pdf
Retrieving Actions in Group Contexts, Trends and Topics in Computer Vision, 2010. ,
DOI : 10.1007/978-3-642-35749-7_14
URL : http://www.cs.sfu.ca/%7Emori/research/papers/lan_sga10.pdf
Beyond actions: Discriminative models for contextual group activities, NIPS, 2010. ,
Discriminative latent models for recognizing contextual group activities, IEEE TPAMI, 2012. ,
Beyond Gaussian pyramid: Multi-skip feature stacking for action recognition, CVPR, 2015. ,
A comprehensive analysis of deep regression, 2018. ,
Recognition of group activities in videos based on single-and two-person descriptors, IEEE WACV, 2017. ,
Deep Mixture of Linear Inverse Regressions Applied to Head-Pose Estimation, CVPR, 2017. ,
Deep mixture of linear inverse regressions applied to head-pose estimation, IEEE CVPR, 2017. ,
Deep reinforcement learning for audio-visual servoing in human-robot interaction, 2017. ,
DeepGUM: Deep robust regression with Gaussian-uniform mixtures, 2018. ,
Efficient backprop, Neural Networks: Tricks of the Trade, pp.9-50, 1998. ,
Sliced Inverse Regression for Dimension Reduction, Journal of the American Statistical Association, vol.13, issue.414, 1991. ,
DOI : 10.1214/aos/1176345514
URL : http://www.unc.edu/~chongz/Spring2012/SIR.pdf
Learning multi-modal densities on discriminative temporal interaction manifold for group activity recognition, CVPR, 2009. ,
DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection, IEEE Transactions on Image Processing, vol.25, issue.8, 2016. ,
DOI : 10.1109/TIP.2016.2579306
URL : http://arxiv.org/pdf/1510.05484
Reverberant sound localization with a robot head based on direct-path relative transfer function, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016. ,
DOI : 10.1109/IROS.2016.7759437
URL : https://hal.archives-ouvertes.fr/hal-01349771
Multiple-Speaker Localization Based on Direct-Path Features and Likelihood Maximization With Spatial Sparsity Regularization, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.25, issue.10, 2017. ,
DOI : 10.1109/TASLP.2017.2740001
URL : https://hal.archives-ouvertes.fr/hal-01413417
Learning from Noisy Labels with Distillation. arXiv preprint, 2017. ,
DOI : 10.1109/iccv.2017.211
URL : http://arxiv.org/pdf/1703.02391
3D head pose estimation with convolutional neural network trained on synthetic images, 2016 IEEE International Conference on Image Processing (ICIP), 2016. ,
DOI : 10.1109/ICIP.2016.7532566
DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. ,
DOI : 10.1109/CVPR.2016.124
Fashion Landmark Detection in the Wild, ECCV, 2016. ,
DOI : 10.5244/C.24.12
URL : http://arxiv.org/pdf/1608.03049
Aural Servo: Sensor-Based Control From Robot Audition, IEEE Transactions on Robotics, 2018. ,
DOI : 10.1109/TRO.2018.2805310
URL : https://hal.archives-ouvertes.fr/hal-01694366
Robust statistics, 2006. ,
Actions in context, 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009. ,
DOI : 10.1109/CVPR.2009.5206557
URL : https://hal.archives-ouvertes.fr/inria-00548645
Tracking Gaze and Visual Focus of Attention of People Involved in Social Interaction, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017. ,
DOI : 10.1109/TPAMI.2017.2782819
Robust regression methods for computer vision: A review, International Journal of Computer Vision, vol.53, issue.1, 1991. ,
DOI : 10.1002/0471725250
Artificial Neuron-Glia Networks Learning Approach Based on Cooperative Coevolution, International Journal of Neural Systems, vol.21, issue.04, p.25, 2015. ,
DOI : 10.1142/S0129065714400061
URL : https://hal.archives-ouvertes.fr/hal-01221226
A CNN Regression Approach for Real-Time 2D/3D Registration, IEEE Transactions on Medical Imaging, vol.35, issue.5, 2016. ,
DOI : 10.1109/TMI.2016.2521800
Systematic evaluation of convolution neural network advances on the Imagenet, Computer Vision and Image Understanding, vol.161, pp.11-19, 2017. ,
DOI : 10.1016/j.cviu.2017.05.007
URL : http://arxiv.org/pdf/1606.02228
Robot behavior adaptation for human-robot interaction based on policy gradient reinforcement learning, JRSJ, 2006. ,
DOI : 10.1109/iros.2005.1545206
URL : http://kth.diva-portal.org/smash/get/diva2:436245/FULLTEXT01
Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. Playing Atari With Deep Reinforcement Learning, NIPS Deep Learning Workshop, 2013. ,
Human-level control through deep reinforcement learning, Nature, vol.101, issue.7540, 2015. ,
DOI : 10.1016/S0004-3702(98)00023-X
Deep Head Pose: Gaze-Direction Estimation in Multimodal Video, IEEE Transactions on Multimedia, vol.17, issue.11, 2015. ,
DOI : 10.1109/TMM.2015.2482819
URL : http://ieeexplore.ieee.org:80/stamp/stamp.jsp?tp=&arnumber=7279167
Using the forest to see the trees: a graphical model relating features, objects and scenes, NIPS, 2003. ,
Machine learning: a probabilistic perspective, 2012. ,
Head Pose Estimation in Computer Vision: A Survey, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.31, issue.4, 2009. ,
DOI : 10.1109/TPAMI.2008.106
URL : http://cvrr.ucsd.edu/publications/2008/MurphyChutorian_Trivedi_PAMI08.pdf
Temporal Poselets for Collective Activity Detection and Recognition, 2013 IEEE International Conference on Computer Vision Workshops, 2013. ,
DOI : 10.1109/ICCVW.2013.71
URL : http://haci2013.umiacs.umd.edu/papers/NabiHACI2013.pdf
Evaluation of convolutional neural networks for visual recognition, IEEE Transactions on Neural Networks, vol.9, issue.4, pp.685-696, 1998. ,
DOI : 10.1109/72.701181
Distribution-free multiple comparisons, 1963. ,
How to train neural networks, Neural Networks: Tricks of the Trade, 1998. ,
DOI : 10.1007/978-3-642-35289-8_23
Robust fitting of mixtures using the trimmed likelihood estimator. CSDA, 2007. ,
DOI : 10.1016/j.csda.2006.12.024
Scientific method: Statistical errors, Nature, vol.506, issue.7487, pp.150-152, 2014. ,
DOI : 10.1038/506150a
URL : http://www.nature.com:80/polopoly_fs/1.14700!/menu/main/topColumns/topLeftColumn/pdf/506150a.pdf
Collective Activity Localization with Contextual Spatial Pyramid, ECCV, 2012. ,
DOI : 10.1007/978-3-642-33885-4_25
Synergistic face detection and pose estimation with energy-based models. JMLR, 2007. ,
DOI : 10.1007/11957959_10
Multi-source Deep Learning for Human Pose Estimation, 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp.2329-2336, 2014. ,
DOI : 10.1109/CVPR.2014.299
Inverse regression approach to robust non-linear high-to-low dimensional mapping ,
DOI : 10.1016/j.jmva.2017.09.009
URL : https://hal.archives-ouvertes.fr/hal-01347455
Articulated people detection and pose estimation: Reshaping the future, 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp.3178-3185, 2012. ,
DOI : 10.1109/CVPR.2012.6248052
URL : http://www.informatik.uni-marburg.de/~thormae/paper/CVPR12.pdf
A survey on vision-based human action recognition, Image and Vision Computing, vol.28, issue.6, 2010. ,
DOI : 10.1016/j.imavis.2009.11.014
Robot gains social intelligence through multimodal deep reinforcement learning, 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), 2016. ,
DOI : 10.1109/HUMANOIDS.2016.7803357
URL : http://arxiv.org/pdf/1702.07492
Show, attend and interact: Perceivable human-robot social interaction through neural attention Q-network, IEEE ICRA, 2017. ,
Objects in Context, 2007 IEEE 11th International Conference on Computer Vision, 2007. ,
DOI : 10.1109/ICCV.2007.4408986
Learning to parse images of articulated bodies, NIPS, 2007. ,
HyperFace: A Deep Multi-task Learning Framework for Face Detection, Landmark Localization, Pose Estimation, and Gender Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016. ,
DOI : 10.1109/TPAMI.2017.2781233
URL : http://arxiv.org/pdf/1603.01249
Faster R-CNN: Towards real-time object detection with region proposal networks, IEEE TPAMI, vol.39, pp.1137-1149, 2015. ,
Hough Networks for Head Pose Estimation and Facial Feature Localization, Proceedings of the British Machine Vision Conference 2014, 2014. ,
DOI : 10.5244/C.28.66
URL : http://www.bmva.org/bmvc/2014/files/abstract039.pdf
A sequentially rejective test procedure based on a modified Bonferroni inequality, Biometrika, vol.77, issue.3, pp.663-665, 1990. ,
DOI : 10.1093/biomet/77.3.663
Robotic gaze control using reinforcement learning, 2012 IEEE International Workshop on Haptic Audio Visual Environments and Games (HAVE 2012) Proceedings, 2012. ,
DOI : 10.1109/HAVE.2012.6374444
Deep Expectation of Real and Apparent Age from a Single Image Without Facial Landmarks, International Journal of Computer Vision, vol.30, issue.6, 2016. ,
DOI : 10.1109/ICCVW.2015.43
Robust regression and outlier detection, 2005. ,
Learning internal representations by error propagation, 1985. ,
Recognition of composite human activities through context-free grammar based representation, CVPR, 2006. ,
Spatio-temporal relationship match: Video structure comparison for recognition of complex human activities, ICCV, 2009. ,
Stochastic representation and recognition of high-level group activities. IJCV, 2011. ,
DOI : 10.1007/s11263-010-0355-5
Image Classification with the Fisher Vector: Theory and Practice, International Journal of Computer Vision, vol.73, issue.2, 2013. ,
DOI : 10.1007/s11263-006-9794-4
Recognizing human actions: a local SVM approach, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004., 2004. ,
DOI : 10.1109/ICPR.2004.1334462
URL : http://www.nada.kth.se/%7Ecaputo/publik/icpr04actions.pdf
Overfeat: Integrated recognition, localization and detection using convolutional networks, ICLR, 2014. ,
Deformable GANs for pose-based human image generation, IEEE CVPR, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01761539
Very deep convolutional networks for large-scale image recognition, 2014. ,
Very deep convolutional networks for large-scale image recognition. arXiv preprint, 2014. ,
Deep Convolutional Neural Network Design Patterns. CoRR, abs, 1611. ,
A tutorial on support vector regression, Stat Comput, 2004. ,
Dropout: A simple way to prevent neural networks from overfitting, JMLR, vol.15, pp.1929-1958, 2014. ,
Sifting the evidence: what's wrong with significance tests? Another comment on the role of statistical methods, BMJ, vol.322, issue.7280, pp.226-231, 2001. ,
Robust parameter estimation in computer vision, SIAM Review, 1999. ,
Activity Group Localization by Modeling the Relations among Participants, ECCV, 2014. ,
DOI : 10.1007/978-3-319-10590-1_48
URL : http://media.cs.tsinghua.edu.cn/%7Eimagevision/papers/eccv14-sunlei-86890741.pdf
Deep Convolutional Network Cascade for Facial Point Detection, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013. ,
DOI : 10.1109/CVPR.2013.446
On the importance of initialization and momentum in deep learning, ICML, pp.1139-1147, 2013. ,
Introduction to Reinforcement Learning, 1998. ,
Going deeper with convolutions, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. ,
DOI : 10.1109/CVPR.2015.7298594
URL : http://arxiv.org/pdf/1409.4842
Rethinking the Inception Architecture for Computer Vision, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.2818-2826, 2016. ,
DOI : 10.1109/CVPR.2016.308
URL : http://arxiv.org/pdf/1512.00567
Convolutional Neural Networks for Medical Image Analysis: Full Training or Fine Tuning?, IEEE Transactions on Medical Imaging, vol.35, issue.5, pp.1299-1312, 2016. ,
DOI : 10.1109/TMI.2016.2535302
URL : http://arxiv.org/pdf/1706.00712
Reinforcement learning with human teachers: Understanding how people want to teach robots, IEEE ROMAN, 2006. ,
Rmsprop: Divide the gradient by a running average of its recent magnitude, COURSERA: Neural networks for machine learning, pp.26-31, 2012. ,
DeepPose: Human Pose Estimation via Deep Neural Networks, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014. ,
DOI : 10.1109/CVPR.2014.214
URL : http://arxiv.org/pdf/1312.4659
Machine Recognition of Human Activities: A Survey, IEEE Transactions on Circuits and Systems for Video Technology, vol.18, issue.11, 2008. ,
DOI : 10.1109/TCSVT.2008.2005594
URL : http://www.cfar.umd.edu/%7Erama/Publications/Turaga_CSVT_2008.pdf
Maintaining awareness of the focus of attention of a conversation: A robot-centric reinforcement learning approach, 2016 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), p.137, 2016. ,
DOI : 10.1109/ROMAN.2016.7745088
P-Values: Misunderstood and Misused, Frontiers in Physics, vol.13, issue.6, 2016. ,
DOI : 10.1038/nature.2014.15787
URL : http://journal.frontiersin.org/article/10.3389/fphy.2016.00006/pdf
Head Pose Estimation with Combined 2D SIFT and 3D HOG Features, 2013 Seventh International Conference on Image and Graphics, 2013. ,
DOI : 10.1109/ICIG.2013.133
Action Recognition with Improved Trajectories, 2013 IEEE International Conference on Computer Vision, 2013. ,
DOI : 10.1109/ICCV.2013.441
URL : https://hal.archives-ouvertes.fr/hal-00873267
Individual comparisons by ranking methods, Biometrics Bulletin, pp.80-83, 1945. ,
DOI : 10.2307/3001968
Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine Learning, 1992. ,
DOI : 10.1007/978-1-4615-3618-5_2
URL : http://www.cs.ualberta.ca/~sutton/williams-92.pdf
Learning from massive noisy labeled data for image classification, CVPR, 2015. ,
Supervised Descent Method and Its Applications to Face Alignment, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013. ,
DOI : 10.1109/CVPR.2013.75
URL : http://www.ri.cmu.edu/pub_files/2013/5/main.pdf
Learning Auto-Structured Regressor from Uncertain Nonnegative Labels, 2007 IEEE 11th International Conference on Computer Vision, pp.1-8, 2007. ,
DOI : 10.1109/ICCV.2007.4409050
URL : http://www.lv-nus.org/papers/2007/2007_c_2.pdf
Ioannis Patras, Hatice Gunes, and Peter Robinson. Face alignment assisted by head pose estimation, BMVC, 2015. ,
Articulated Human Detection with Flexible Mixtures of Parts, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.12, pp.2878-2890, 2013. ,
DOI : 10.1109/TPAMI.2012.261
URL : http://www.ics.uci.edu/~dramanan/papers/pose_pami.pdf
Recognizing human-object interactions in still images by modeling the mutual context of objects and human poses, IEEE TPAMI, 2012. ,
Evolving artificial neural networks, Proceedings of the IEEE, vol.87, issue.9, pp.1423-1447, 1999. ,
How Transferable Are Features in Deep Neural Networks?, NIPS, pp.3320-3328, 2014. ,
Multiclass spectral clustering, ICCV, 2003. ,
SUMMARY, Robotica, vol.5, issue.11, pp.2122-2138, 2017. ,
DOI : 10.1016/j.patrec.2010.09.011
Adadelta: an adaptive learning rate method. arXiv preprint, 2012. ,
Multi-scale deep networks and regression forests for direct bi-ventricular volume estimation, Medical Image Analysis, vol.30, 2016. ,
DOI : 10.1016/j.media.2015.07.003
Face detection, pose estimation, and landmark localization in the wild, CVPR, pp.2879-2886, 2012. ,