8.3.1 Human detector
8.4.2 Multi-fold multiple instance learning; Temporal supervision and detection
Incorporating segmentation. Another useful cue consists in segmenting the humans. In particular, this would make it possible to focus on features from trajectories that belong to the human, whereas bounding boxes also contain background. Jhuang et al. [2013] have shown that human segmentation helps the action recognition task. Human segmentation can be obtained without additional annotation: recent work [Kolesnikov and Lampert, 2016] shows that reasonable segmentation performance can be obtained in a weakly-supervised setting. CNNs are learned using an estimate of the ground-truth segmentation based on the current estimate and priors such as the image or video labels.
DeepFlow: Large displacement optical flow with DeepMatching, Proceedings of the IEEE International Conference on Computer Vision (ICCV) 2013 ,
EpicFlow: Edge-preserving interpolation of correspondences for optical flow, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) ,
DOI : 10.1109/CVPR.2015.7298720
URL : https://hal.archives-ouvertes.fr/hal-01142656
Learning to detect Motion Boundaries, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) ,
DOI : 10.1109/CVPR.2015.7298873
URL : https://hal.archives-ouvertes.fr/hal-01142653
Learning to Track for Spatio-Temporal Action Localization, 2015 IEEE International Conference on Computer Vision (ICCV) ,
DOI : 10.1109/ICCV.2015.362
URL : https://hal.archives-ouvertes.fr/hal-01159941
Determining three-dimensional motion and structure from optical flow generated by several moving objects, IEEE Trans. PAMI, p.25, 1985. ,
DOI : 10.1109/tpami.1985.4767678
Human activity analysis, ACM Computing Surveys, vol.43, issue.3, p.118, 2011. ,
DOI : 10.1145/1922649.1922653
Multi-face tracking by extended bag-of-tracklets in egocentric videos, Computer Vision and Image Understanding, p.167, 2015. ,
A computational framework and an algorithm for the measurement of visual motion, International Journal of Computer Vision, vol.27, issue.4, p.25, 1989. ,
DOI : 10.1007/BF00158167
Introducing a smoothness constraint in a matching approach for the computation of displacement fields, Image Understanding Workshop, p.25, 1985. ,
2D Human Pose Estimation: New Benchmark and State of the Art Analysis, 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp.147-152, 2014. ,
DOI : 10.1109/CVPR.2014.471
Contour Detection and Hierarchical Image Segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.5, pp.89-99, 2011. ,
DOI : 10.1109/TPAMI.2010.161
Flow Fields: Dense Correspondence Fields for Highly Accurate Large Displacement Optical Flow Estimation, 2015 IEEE International Conference on Computer Vision (ICCV) ,
DOI : 10.1109/ICCV.2015.457
URL : http://arxiv.org/abs/1508.05151
Lucas-Kanade 20 Years On: A Unifying Framework, International Journal of Computer Vision, vol.56, issue.3, p.27, 2004. ,
DOI : 10.1023/B:VISI.0000011205.11775.fd
A database and evaluation methodology for optical flow, IJCV, vol.32, issue.8, p.77, 2011. ,
The Generalized PatchMatch Correspondence Algorithm, ECCV, pp.37-63, 2010. ,
DOI : 10.1007/978-3-642-15558-1_3
Surf: Speeded up robust features, ECCV, p.120, 2006. ,
DOI : 10.1007/11744023_32
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.679.3046
Rotationally invariant image operators, International Joint Conference on Pattern Recognition, p.119, 1978. ,
Learning deep architectures for AI. Foundations and Trends in Machine Learning, p.45, 2009. ,
Depth and motion discontinuities, p.97, 1999. ,
Robust dynamic motion estimation over time, Proceedings. 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, p.25, 1991. ,
DOI : 10.1109/CVPR.1991.139705
The robust estimation of multiple motions: parametric and piecewise-smooth flow fields, Computer Vision and Image Understanding, vol.25, p.27, 1996. ,
Probabilistic detection and tracking of motion boundaries, IJCV, vol.97, p.98, 2000. ,
The recognition of human movement using temporal templates, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.23, issue.3, 2001. ,
DOI : 10.1109/34.910878
Weakly Supervised Action Labeling in Videos under Ordering Constraints, ECCV, p.172, 2014. ,
DOI : 10.1007/978-3-319-10602-1_41
URL : https://hal.archives-ouvertes.fr/hal-01053967
Action Recognition by Weakly-Supervised Discriminative Region Localization, Proceedings of the British Machine Vision Conference 2014, p.124, 2014. ,
DOI : 10.5244/C.28.111
Shadow puppetry, Proceedings of the Seventh IEEE International Conference on Computer Vision, 1999. ,
DOI : 10.1109/ICCV.1999.790422
A General Dense Image Matching Framework Combining Direct and Feature-Based Costs, 2013 IEEE International Conference on Computer Vision, pp.74-94 ,
DOI : 10.1109/ICCV.2013.30
Object Segmentation by Long Term Analysis of Point Trajectories, ECCV, p.119, 2010. ,
DOI : 10.1007/978-3-642-15555-0_21
Large displacement optical flow: descriptor matching in variational motion estimation, IEEE Trans. PAMI, vol.30, issue.110, pp.76-78, 2011. ,
High accuracy optical flow estimation based on a theory for warping, ECCV, pp.56-85, 2004. ,
Variational Motion Segmentation with Level Sets, ECCV, p.99, 2006. ,
DOI : 10.1007/11744023_37
Variational optical flow computation in real time, IEEE Trans. on Image Processing, vol.8, p.55, 2005. ,
Lucas/Kanade meets Horn/Schunck: Combining local and global optic flow methods. IJCV, p.26, 2005. ,
DOI : 10.1023/b:visi.0000045324.43199.43
A Multigrid Platform for Real-Time Motion Computation with Discontinuity-Preserving Variational Methods, International Journal of Computer Vision, vol.44, issue.2, p.25, 2006. ,
DOI : 10.1007/s11263-006-6616-7
Multiresolution flow-through motion analysis, CVPR, p.22, 1983. ,
A naturalistic open source movie for optical flow evaluation, ECCV, 2012. 3, pp.32-77 ,
Invariant features for 3-D gesture recognition, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition, 1996. ,
DOI : 10.1109/AFGR.1996.557258
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.9145
A computational approach to edge detection, IEEE Trans. PAMI, vol.89, p.91, 1986. ,
Cross-dataset action detection, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, p.126, 2010. ,
DOI : 10.1109/CVPR.2010.5539875
Semantic image segmentation with deep convolutional nets and fully connected crfs, ICLR, p.121, 2015. ,
Learning brightness transfer functions for the joint recovery of illumination changes and optical flow, ECCV, pp.74-94, 2014. ,
Structured Forests for Fast Edge Detection, 2013 IEEE International Conference on Computer Vision, pp.91-98, 2013. ,
DOI : 10.1109/ICCV.2013.231
Behavior Recognition via Sparse Spatio-Temporal Features, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, 2005. ,
DOI : 10.1109/VSPETS.2005.1570899
Long-term recurrent convolutional networks for visual recognition and description, CVPR, p.121, 2015. ,
FlowNet: Learning optical flow with convolutional networks, ICCV, p.171, 2015. ,
DOI : 10.1109/iccv.2015.316
Combinatorial regularization of descriptor matching for optical flow estimation, BMVC, p.171, 2015. ,
Automatic annotation of human actions in video, 2009 IEEE 12th International Conference on Computer Vision, p.173, 2009. ,
DOI : 10.1109/ICCV.2009.5459279
A hierarchical non-parametric method for capturing non-rigid deformations, Image and Vision Computing, p.37, 2009. ,
DOI : 10.1109/crv.2005.6
Two-Frame Motion Estimation Based on Polynomial Expansion, Proceedings of the 13th Scandinavian conference on Image analysis, p.120, 2003. ,
DOI : 10.1007/3-540-45103-X_50
Object Detection with Discriminatively Trained Part-Based Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.9, p.122, 2010. ,
DOI : 10.1109/TPAMI.2009.167
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.153.2745
Velocity determination in scenes containing several moving objects, Computer Graphics and Image Processing, vol.9, issue.4, 1979. ,
DOI : 10.1016/0146-664X(79)90097-2
Modeling video evolution for action recognition, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. ,
DOI : 10.1109/CVPR.2015.7299176
Design and use of linear models for image motion analysis, p.98, 2000. ,
Driver assistance systems based on vision in and out of vehicles, IEEE IV2003 Intelligent Vehicles Symposium. Proceedings (Cat. No.03TH8683), 2003. ,
DOI : 10.1109/IVS.2003.1212930
Aggregation of local parametric candidates with exemplar-based occlusion handling for optical flow, Computer Vision and Image Understanding, p.170, 2015. ,
Towards Internet-scale multi-view stereo, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, p.29, 2010. ,
DOI : 10.1109/CVPR.2010.5539802
Temporal Localization of Actions with Actoms, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.11, p.173, 2013. ,
DOI : 10.1109/TPAMI.2013.65
URL : https://hal.archives-ouvertes.fr/hal-00687312
On-line adaption of class-specific codebooks for instance tracking, BMVC, p.135, 2010. ,
Vision meets robotics: The KITTI dataset, The International Journal of Robotics Research, vol.32, issue.11, p.77, 2013. ,
DOI : 10.1177/0278364913491297
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.650.8155
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, 2014 IEEE Conference on Computer Vision and Pattern Recognition, p.132 ,
DOI : 10.1109/CVPR.2014.81
Fast image scanning with deep max-pooling convolutional neural networks, 2013 IEEE International Conference on Image Processing, p.134, 2013. ,
DOI : 10.1109/ICIP.2013.6738831
Finding action tubes, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.147-149, 2015. ,
DOI : 10.1109/CVPR.2015.7298676
Motion from Color, Computer Vision and Image Understanding, vol.68, issue.3, p.22, 1997. ,
DOI : 10.1006/cviu.1997.0553
Actions as Space-Time Shapes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.29, issue.12, 2007. ,
DOI : 10.1109/TPAMI.2007.70711
Non-rigid dense correspondence with applications for image enhancement, pp.30-53, 2011. ,
Scale Space and Variational Methods in Computer Vision, chapter Why Is the Census Transform Good for Robust Optic Flow Computation?, p.22, 2013. ,
Struck: Structured output tracking with kernels, ICCV, pp.129-149, 2011. ,
DOI : 10.1109/iccv.2011.6126251
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.294.5858
A Combined Corner and Edge Detector, Proceedings of the Alvey Vision Conference 1988, p.119, 1988. ,
DOI : 10.5244/C.2.23
Multiple View Geometry in Computer Vision, pp.521540518-81, 2003. ,
DOI : 10.1017/CBO9780511811685
On SIFTs and their scales, CVPR, p.65, 2012. ,
Computing nearest-neighbor fields via propagation-assisted kd-trees, CVPR, 2012. 79, p.86 ,
Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.121, 2016. ,
DOI : 10.1109/CVPR.2016.90
URL : http://arxiv.org/abs/1512.03385
Going deeper into action recognition: A survey, Image and Vision Computing, vol.60, p.118, 2016. ,
DOI : 10.1016/j.imavis.2017.01.010
Learning discriminative localization from weakly labeled data, Pattern Recognition, vol.47, issue.3, p.122, 2014. ,
DOI : 10.1016/j.patcog.2013.09.028
Recovering Occlusion Boundaries from an Image, International Journal of Computer Vision, vol.14, issue.2, p.99, 2011. ,
DOI : 10.1007/s11263-010-0400-4
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.186.668
What makes for effective detection proposals?, IEEE Trans. PAMI, p.130, 2015. ,
Occlusion and Motion Reasoning for Long-Term Tracking, ECCV, p.134, 2014. ,
DOI : 10.1007/978-3-319-10599-4_12
URL : https://hal.archives-ouvertes.fr/hal-01020149
Online Object Tracking with Proposal Selection, 2015 IEEE International Conference on Computer Vision (ICCV), p.168, 2015. ,
DOI : 10.1109/ICCV.2015.354
URL : https://hal.archives-ouvertes.fr/hal-01207196
Learning to find occlusion regions, CVPR 2011, p.99, 2011. ,
DOI : 10.1109/CVPR.2011.5995517
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.473.7843
Learning actions from the Web, 2009 IEEE 12th International Conference on Computer Vision, p.119, 2009. ,
DOI : 10.1109/ICCV.2009.5459368
MoDeep: A Deep Learning Framework Using Motion Features for Human Pose Estimation, ACCV, p.167, 2014. ,
DOI : 10.1007/978-3-319-16808-1_21
Action Localization with Tubelets from Motion, 2014 IEEE Conference on Computer Vision and Pattern Recognition, p.149 ,
DOI : 10.1109/CVPR.2014.100
URL : https://hal.archives-ouvertes.fr/hal-00996844
On the analysis of accumulative difference pictures from image sequences of real world scenes, IEEE Trans. PAMI, issue.7, 1979. ,
Separating non-stationary from stationary scene components in a sequence of real world TV-images, 1977. ,
Aggregating Local Image Descriptors into Compact Codes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.9, p.120, 2012. ,
DOI : 10.1109/TPAMI.2011.235
Towards Understanding Action Recognition, 2013 IEEE International Conference on Computer Vision, pp.125-174 ,
DOI : 10.1109/ICCV.2013.396
URL : https://hal.archives-ouvertes.fr/hal-00906902
3D Convolutional Neural Networks for Human Action Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.1, p.121, 2013. ,
DOI : 10.1109/TPAMI.2012.59
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.169.4046
Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014. ,
Face-TLD: Tracking-Learning-Detection applied to faces, 2010 IEEE International Conference on Image Processing, p.135, 2010. ,
DOI : 10.1109/ICIP.2010.5653525
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.231.4326
Tracking-Learning-Detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.7, pp.129-149, 2012. ,
DOI : 10.1109/TPAMI.2011.239
Large-scale video classification with convolutional neural networks, CVPR, 2014. 105, p.121 ,
Optical flow with geometric occlusion estimation and fusion of multiple frames, EMMCVPR, 2015. 74, pp.75-93 ,
Deformation Models for Image Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.29, issue.8, p.37, 2007. ,
DOI : 10.1109/TPAMI.2007.1153
Improved image boundaries for better video segmentation, p.168, 2016. ,
Deformable Spatial Pyramid Matching for Fast Dense Correspondences, 2013 IEEE Conference on Computer Vision and Pattern Recognition, p.65, 2013. ,
DOI : 10.1109/CVPR.2013.299
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.362.8285
A Spatio-Temporal Descriptor Based on 3D-Gradients, Proceedings of the British Machine Vision Conference 2008, pp.123-148, 2008. ,
DOI : 10.5244/C.22.99
Human Focused Action Localization in Video, International Workshop on Sign, Gesture, and Activity (SGA), pp.123-129, 2010. ,
DOI : 10.1007/978-3-642-35749-7_17
Seed, Expand and Constrain: Three Principles for Weakly-Supervised Image Segmentation, ECCV, p.174, 2016. ,
DOI : 10.1007/978-3-319-46493-0_42
Coherency sensitive hashing, ICCV, p.37, 2011. ,
DOI : 10.1109/iccv.2011.6126421
Geodesic object proposals, ECCV, p.82, 2014. ,
ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems 25, p.121, 2012. ,
HMDB: A large video database for human motion recognition, 2011 International Conference on Computer Vision, p.126, 2011. ,
DOI : 10.1109/ICCV.2011.6126543
Efficient subwindow search: A branch and bound framework for object localization, IEEE Trans. PAMI, p.122, 2009. ,
Discriminative figure-centric models for joint action localization and recognition, ICCV, p.125, 2011. ,
Action Recognition by Hierarchical Mid-Level Action Elements, 2015 IEEE International Conference on Computer Vision (ICCV), p.124, 2015. ,
DOI : 10.1109/ICCV.2015.517
On space-time interest points. IJCV, p.149, 2005. ,
DOI : 10.1007/s11263-005-1838-7
Retrieving actions in movies, 2007 IEEE 11th International Conference on Computer Vision, p.147, 2007. ,
DOI : 10.1109/ICCV.2007.4409105
Learning realistic human actions from movies, 2008 IEEE Conference on Computer Vision and Pattern Recognition, p.121, 2008. ,
DOI : 10.1109/CVPR.2008.4587756
URL : https://hal.archives-ouvertes.fr/inria-00548659
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), p.121, 2006. ,
DOI : 10.1109/CVPR.2006.68
URL : https://hal.archives-ouvertes.fr/inria-00548585
Dense Rigid Reconstruction from Unstructured Discontinuous Video, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), p.167, 2015. ,
DOI : 10.1109/ICCVW.2015.110
Gradient-based learning applied to document recognition, Proceedings of the IEEE, pp.37-51, 1998. ,
DOI : 10.1109/5.726791
Efficient backprop, Neural Networks: Tricks of the trade, p.45, 1998. ,
FusionFlow: Discrete-continuous optimization for optical flow estimation, CVPR, p.30, 2008. ,
Locally affine sparse-to-dense matching for motion and occlusion estimation, ICCV, 2013. 72, pp.79-89 ,
DOI : 10.1109/iccv.2013.216
Track to the future: Spatio-temporal video segmentation with long-range motion cues, CVPR 2011, p.119, 2011. ,
DOI : 10.1109/CVPR.2011.6044588
URL : https://hal.archives-ouvertes.fr/hal-00817961
Unsupervised Learning of Edges, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.172, 2016. ,
DOI : 10.1109/CVPR.2016.179
Analysis of contour motions, Advances in Neural Information Processing Systems, p.99, 2006. ,
SIFT flow: Dense correspondence across scenes and its applications, IEEE Trans. PAMI, p.65, 2011. ,
Recognizing realistic actions from videos in the wild, CVPR, p.119, 2009. ,
Distinctive image features from scale-invariant keypoints. IJCV, pp.41-62, 2004. ,
DOI : 10.1023/b:visi.0000029664.99615.94
Patch match filter: Efficient edge-aware filtering meets randomized search for fast correspondence field estimation, CVPR, p.30, 2013. ,
DOI : 10.1109/cvpr.2013.242
An iterative image registration technique with an application to stereo vision, IJCAI, p.120, 1981. ,
Action Recognition and Localization by Hierarchical Space-Time Segments, 2013 IEEE International Conference on Computer Vision, p.162, 2013. ,
DOI : 10.1109/ICCV.2013.341
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.663.1492
Do less and achieve more: Training CNNs for action recognition utilizing action images from the Web, Pattern Recognition, p.172, 2015. ,
DOI : 10.1016/j.patcog.2017.01.027
Object tracking in cluttered background based on optical flow and edges, ICVPR, 1996. ,
Preattentive texture discrimination with early vision mechanisms, Journal of the Optical Society of America A, vol.7, issue.5, p.45, 1990. ,
DOI : 10.1364/JOSAA.7.000923
Unsupervised Tube Extraction Using Transductive Learning and Dense Trajectories, 2015 IEEE International Conference on Computer Vision (ICCV), pp.123-149 ,
DOI : 10.1109/ICCV.2015.193
Multispectral constraints for optical flow computation, ICCV, p.22, 1990. ,
DOI : 10.1109/iccv.1990.139488
Actions in context, 2009 IEEE Conference on Computer Vision and Pattern Recognition, p.119, 2009. ,
DOI : 10.1109/CVPR.2009.5206557
URL : https://hal.archives-ouvertes.fr/inria-00548645
A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, p.106, 2001. ,
DOI : 10.1109/ICCV.2001.937655
Dynamic scene analysis, Computer Graphics and Image Processing, vol.7, issue.3, 1977. ,
DOI : 10.1016/S0146-664X(78)80003-3
Trajectons: Action recognition through the motion analysis of tracked features, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, p.120, 2009. ,
DOI : 10.1109/ICCVW.2009.5457659
Discrete Optimization for Optical Flow, GCPR, 2015. 28, pp.75-93 ,
DOI : 10.1007/978-3-319-24947-6_2
Spot On: Action Localization from Pointly-Supervised Proposals, ECCV, p.149, 2016. ,
DOI : 10.1007/978-3-319-46454-1_27
Estimation and interpretation of discontinuities in optical flow fields, ICCV, p.97, 2001. ,
A comparison of affine region detectors, IJCV, vol.29, issue.53, p.57, 2005. ,
A survey of advances in vision-based human motion capture and analysis, Computer Vision and Image Understanding, issue.10, 2006. ,
Multi-label Discriminative Weakly-Supervised Human Activity Recognition and Localization, ACCV, p.162, 2014. ,
DOI : 10.1007/978-3-319-16814-2_16
Fast approximate nearest neighbors with automatic algorithm configuration, International Conference on Computer Vision Theory and Application VISAPP'09, p.62, 2009. ,
Illumination-robust dense optical flow using census signatures, Pattern Recognition, p.22, 2011. ,
An investigation of smoothness constraints for the estimation of displacement vector fields from image sequences ,
Optical Velocity Patterns, Velocity-Sensitive Neurons, and Space Perception: A Hypothesis, Perception, vol.225, issue.1, 1974. ,
DOI : 10.1068/p030063
Revised definition of optical flow: Integration of radiometric and geometric cues for dynamic scene analysis, IEEE Trans. PAMI, p.23, 1998. ,
The open world of micro-videos, p.172, 2016. ,
Modeling temporal structure of decomposable motion segments for activity classification, ECCV, p.173, 2010. ,
Over-parameterized variational optical flow, IJCV, vol.25, p.26, 2008. ,
DOI : 10.1007/s11263-007-0051-2
Spatio-temporal Object Detection Proposals, ECCV, 2014a. 10, p.149 ,
DOI : 10.1007/978-3-319-10578-9_48
URL : https://hal.archives-ouvertes.fr/hal-01021902
The LEAR submission at Thumos 2014, 2014b. URL https ,
Efficient Action Localization with Approximately Normalized Fisher Vectors, CVPR, p.122, 2014. ,
DOI : 10.1109/cvpr.2014.326
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.634.9146
Weakly- and semi-supervised learning of a DCNN for semantic image segmentation, ICCV, p.174, 2015. ,
Fast Object Segmentation in Unconstrained Video, 2013 IEEE International Conference on Computer Vision, pp.98-168 ,
DOI : 10.1109/ICCV.2013.223
High five: Recognising human interactions in TV shows, BMVC, p.119, 2010. ,
Action recognition with stacked Fisher vectors, ECCV, p.121, 2014. ,
DOI : 10.1007/978-3-319-10602-1_38
Flowing convnets for human pose estimation in videos, ICCV, p.167, 2015. ,
Descriptor learning for efficient retrieval, ECCV, p.29, 2010. ,
A survey on vision-based human action recognition, Image and Vision Computing, vol.28, issue.6, p.118, 2010. ,
DOI : 10.1016/j.imavis.2009.11.014
Explicit Modeling of Human-Object Interactions in Realistic Videos, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.4, p.173, 2012. ,
DOI : 10.1109/TPAMI.2012.175
URL : https://hal.archives-ouvertes.fr/hal-00720847
Learning object class detectors from weakly annotated video, 2012 IEEE Conference on Computer Vision and Pattern Recognition, p.106 ,
DOI : 10.1109/CVPR.2012.6248065
URL : https://hal.archives-ouvertes.fr/hal-00695940
Non-local total generalized variation for optical flow estimation, ECCV, pp.75-93, 2014. ,
DOI : 10.1007/978-3-319-10590-1_29
Dynamic body VSLAM with semantic constraints, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), p.167, 2015. ,
DOI : 10.1109/IROS.2015.7353626
URL : http://arxiv.org/abs/1504.07269
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, NIPS, pp.2015-147 ,
DOI : 10.1109/TPAMI.2016.2577031
URL : http://arxiv.org/abs/1506.01497
Local grouping for optical flow, CVPR, p.79, 2008. ,
EpicFlow: Edge-preserving interpolation of correspondences for optical flow, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.72-73, 2015. ,
DOI : 10.1109/CVPR.2015.7298720
URL : https://hal.archives-ouvertes.fr/hal-01142656
DeepMatching: Hierarchical Deformable Dense Matching. IJCV, 2016, pp.76-77 ,
DOI : 10.1007/s11263-016-0908-3
URL : https://hal.archives-ouvertes.fr/hal-01148432
Action MACH a spatio-temporal Maximum Average Correlation Height filter for action recognition, 2008 IEEE Conference on Computer Vision and Pattern Recognition, p.125, 2008. ,
DOI : 10.1109/CVPR.2008.4587727
Towards model-based recognition of human movements in image sequences. CVGIP: Image understanding, 1994. ,
On the spatial statistics of optical flow, IJCV, vol.24, p.28, 2007. ,
Artistic style transfer for videos. arXiv preprint, p.167, 2016. ,
DOI : 10.1007/978-3-319-45886-1_3
Temporal constraints in large optical flow estimation, Computer Aided Systems Theory - EUROCAST, pp.709716-709741, 2007. ,
Image classification with the Fisher vector: Theory and practice. IJCV, pp.121-158, 2013. ,
Particle video: Long-range motion estimation using point trajectories. IJCV, p.119, 2008. ,
DOI : 10.1007/s11263-008-0136-6
Modeling the Temporal Extent of Actions, ECCV, p.122, 2010. ,
DOI : 10.1007/978-3-642-15549-9_39
Recognizing human actions: a local SVM approach, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004., p.126, 2004. ,
DOI : 10.1109/ICPR.2004.1334462
Filter flow, ICCV, p.24, 2009. ,
Integrated recognition, localization and detection using CNN, ICLR, p.134, 2014. ,
Optical flow with semantic segmentation and localized layers, CVPR, p.171, 2016. ,
DOI : 10.1109/cvpr.2016.422
URL : http://arxiv.org/abs/1603.03911
Similarity Constrained Latent Support Vector Machine: An Application to Weakly Supervised Action Classification, ECCV, p.124, 2012. ,
DOI : 10.1007/978-3-642-33786-4_5
Optical flow-based real-time object tracking using non-prior training active feature model. Real-Time Imaging, 2005. ,
DOI : 10.1016/j.rti.2005.03.006
Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.173, 2016. ,
DOI : 10.1109/CVPR.2016.119
Two-stream convolutional networks for action recognition in videos, NIPS, p.132, 2014. ,
Very deep convolutional networks for large-scale image recognition, ICLR, p.152, 2015. ,
Weakly Supervised Action Detection, Proceedings of the British Machine Vision Conference 2011, p.149, 2011. ,
DOI : 10.5244/C.25.65
Video Google: a text retrieval approach to object matching in videos, Proceedings Ninth IEEE International Conference on Computer Vision, p.120, 2003. ,
DOI : 10.1109/ICCV.2003.1238663
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild, CRCV-TR-12-01, p.133, 2012. ,
The Early Detection of Motion Boundaries, p.98, 1991. ,
Occlusion Boundaries from Motion: Low-Level Detection and Mid-Level Reasoning, International Journal of Computer Vision, vol.14, issue.7, p.99, 2009. ,
DOI : 10.1007/s11263-008-0203-z
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.193.4976
Efficient computation of optical flow using the census transform, Pattern recognition, p.170, 2004. ,
Ecient Computation of Optical Flow Using the Census Transform, Proceedings of the 26th DAGM Symposium, p.22, 2004. ,
Adaptive integration of feature matches into variational optical flow methods, ACCV, 2012. 54, p.172 ,
Learning optical flow, ECCV, p.28, 2008. ,
Layered image motion with explicit occlusions, temporal consistency, and depth ordering, NIPS, p.31, 2010. ,
A fully-connected layered model of foreground and background flow, CVPR, p.99, 2013. ,
Local Layering for Joint Motion Estimation and Occlusion Detection, 2014 IEEE Conference on Computer Vision and Pattern Recognition, p.74, 2014. ,
DOI : 10.1109/CVPR.2014.144
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.665.5541
A quantitative analysis of current practices in optical flow estimation and the principles behind them. IJCV, 2014b, pp.93-110 ,
Computing nearest-neighbor fields via propagation-assisted kd-trees, CVPR, 2012. 29, p.63 ,
On-road vehicle detection using optical sensors: a review, Proceedings. The 7th International IEEE Conference on Intelligent Transportation Systems (IEEE Cat. No.04TH8749), 2004. ,
DOI : 10.1109/ITSC.2004.1398966
Dense point trajectories by GPU-accelerated large displacement optical flow, ECCV, p.203, 2010. ,
DOI : 10.1007/978-3-642-15549-9_32
Occlusion boundary detection and figure/ground assignment from optical flow, CVPR, p.99, 2011. ,
DOI : 10.1109/cvpr.2011.5995364
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.221.202
Going deeper with convolutions, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.121, 2015. ,
DOI : 10.1109/CVPR.2015.7298594
URL : http://arxiv.org/abs/1409.4842
Computer Vision: Algorithms and Applications, p.53, 2010. ,
DOI : 10.1007/978-1-84882-935-0
Motion Words for Videos, ECCV, p.121, 2014. ,
DOI : 10.1007/978-3-319-10590-1_47
Learning to extract motion from videos in convolutional neural networks. arXiv preprint, p.171, 2016. ,
Spatiotemporal Deformable Part Models for Action Detection, 2013 IEEE Conference on Computer Vision and Pattern Recognition, p.123, 2013. ,
DOI : 10.1109/CVPR.2013.341
Sparse flow: Sparse matching for small to large displacement optical flow, Applications of Computer Vision (WACV), pp.72-167, 2015. ,
DOI : 10.1109/wacv.2015.151
A fast local descriptor for dense matching, 2008 IEEE Conference on Computer Vision and Pattern Recognition, p.29, 2008. ,
DOI : 10.1109/CVPR.2008.4587673
DAISY: An Efficient Dense Descriptor Applied to Wide-Baseline Stereo, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.5, p.65, 2010. ,
DOI : 10.1109/TPAMI.2009.77
Learning Spatiotemporal Features with 3D Convolutional Networks, 2015 IEEE International Conference on Computer Vision (ICCV), p.121, 2015. ,
DOI : 10.1109/ICCV.2015.510
URL : http://arxiv.org/abs/1412.0767
An Unbiased Second-Order Prior for High-Accuracy Motion Estimation, Pattern Recognition, vol.25, p.27, 2008. ,
DOI : 10.1007/978-3-540-69321-5_40
A monotonic and continuous two-dimensional warping based on dynamic programming, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170), p.37, 1998. ,
DOI : 10.1109/ICPR.1998.711195
Feature Tracking and Motion Compensation for Action Recognition, Proceedings of the British Machine Vision Conference 2008, p.120, 2008. ,
DOI : 10.5244/C.22.30
Selective Search for Object Recognition, International Journal of Computer Vision, vol.57, issue.1, p.139, 2013. ,
DOI : 10.1007/s11263-013-0620-5
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.361.3382
Joint motion estimation and segmentation of complex scenes with label costs and occlusion modeling, 2012 IEEE Conference on Computer Vision and Pattern Recognition, p.99, 2012. ,
DOI : 10.1109/CVPR.2012.6247887
Visual Word Ambiguity, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.7, p.120, 2010. ,
DOI : 10.1109/TPAMI.2009.132
APT: Action localization proposals from dense trajectories, Proceedings of the British Machine Vision Conference 2015, pp.147-149 ,
DOI : 10.5244/C.29.177
Long-term Temporal Convolutions for Action Recognition, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01241518
An evaluation of data costs for optical flow, GCPR, p.170, 2013. ,
Piecewise rigid scene flow, ICCV, 2013b. 75, p.102 ,
DOI : 10.1109/iccv.2013.174
Modeling temporal coherence for optical flow, ICCV, p.25, 2011. ,
DOI : 10.1109/iccv.2011.6126359
URL : http://hdl.handle.net/11858/00-001M-0000-0024-513E-5
Dense Trajectories and Motion Boundary Descriptors for Action Recognition, International Journal of Computer Vision, vol.103, issue.1, pp.131-136, 2013. ,
DOI : 10.1007/s11263-012-0594-8
URL : https://hal.archives-ouvertes.fr/hal-00725627
A robust and efficient video representation for action recognition. IJCV, 2015, pp.131-136 ,
DOI : 10.1007/s11263-015-0846-5
URL : http://arxiv.org/abs/1504.05524
Representing moving images with layers, IEEE Transactions on Image Processing, vol.3, issue.5, p.99, 1994. ,
DOI : 10.1109/83.334981
Video Action Detection with Relational Dynamic-Poselets, ECCV, p.173, 2014. ,
DOI : 10.1007/978-3-319-10602-1_37
All of Statistics: A Concise Course in Statistical Inference, p.80, 2010. ,
DOI : 10.1007/978-0-387-21736-9
Parallel algorithms for approximation of distance maps on parametric surfaces, ACM Transactions on Graphics, vol.27, issue.4, p.82, 2008. ,
DOI : 10.1145/1409625.1409626
Structure- and motion-adaptive regularization for high accuracy optic flow, ICCV, p.85, 2009. ,
DOI : 10.1109/iccv.2009.5459375
An improved algorithm for TV-L1 optical flow, Statistical and Geometrical Approaches to Visual Motion Analysis, p.27, 2009. ,
A survey of vision-based methods for action representation, segmentation and recognition, Computer Vision and Image Understanding, vol.115, issue.2, p.118, 2011. ,
DOI : 10.1016/j.cviu.2010.10.002
URL : https://hal.archives-ouvertes.fr/inria-00459653
DeepFlow: Large displacement optical flow with deep matching, ICCV, pp.79-86, 2013. ,
DOI : 10.1109/iccv.2013.175
Learning to track for spatiotemporal action localization, ICCV, 2015a. 14, pp.149-159 ,
URL : https://hal.archives-ouvertes.fr/hal-01159941
Learning to detect Motion Boundaries, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.13, 2015. ,
DOI : 10.1109/CVPR.2015.7298873
URL : https://hal.archives-ouvertes.fr/hal-01142653
Towards Weakly-Supervised Action Localization. arXiv preprint, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01317558
An efficient dense and scale-invariant spatio-temporal interest point detector, ECCV, p.119, 2008. ,
DOI : 10.1007/978-3-540-88688-4_48
A Feature-based Approach for Dense Segmentation and Estimation of Large Disparity Motion, International Journal of Computer Vision, vol.II, issue.12, p.37, 2006. ,
DOI : 10.1007/s11263-006-6660-3
Efficient sparse-to-dense optical flow estimation using a learned basis and layers, CVPR, p.28, 2015. ,
Can humans fly? Action understanding with multiple classes of actors, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.122, 2015. ,
DOI : 10.1109/CVPR.2015.7298839
Motion detail preserving optical flow estimation, IEEE Trans. PAMI, vol.8, issue.93, pp.76-78, 2012. ,
DAISY filter flow: A generalized discrete approach to dense correspondences, CVPR, 2014. 29, p.65 ,
DOI : 10.1109/cvpr.2014.435
Iterative solution of large linear systems, p.203, 1971. ,
Fast action proposals for human action detection and search, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.147-149 ,
DOI : 10.1109/CVPR.2015.7298735
Discriminative subvolume search for efficient action detection, CVPR, p.122, 2009. ,
Non-parametric local transforms for computing visual correspondence, ECCV, p.22, 1994. ,
DOI : 10.1007/BFb0028345
A duality based approach for realtime TV-L1 optical flow, Pattern Recognition, p.110, 2007. ,
Image classification using super-vector coding of local image descriptors, ECCV, p.120, 2010. ,
Complementary optic flow, EMMCVPR, p.26, 2009. ,
DOI : 10.1007/978-3-642-03641-5_16
Optic flow in harmony, IJCV, vol.54, issue.55, p.85, 2011. ,
Edge Boxes: Locating Object Proposals from Edges, ECCV, pp.130-139, 2014. ,
DOI : 10.1007/978-3-319-10602-1_26
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.453.5208