Y. Jiang, C. Ngo, and J. Yang, Towards optimal bag-of-features for object categorization and semantic video retrieval, Proceedings of the 6th ACM international conference on Image and video retrieval, CIVR '07, pp.494-501, 2007.
DOI : 10.1145/1282280.1282352

W. Hu, T. Tan, L. Wang, and S. Maybank, A Survey on Visual Surveillance of Object Motion and Behaviors, IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), vol.34, issue.3, pp.334-352, 2004.
DOI : 10.1109/TSMCC.2004.829274

J. K. Aggarwal and Q. Cai, Human motion analysis: A review, Nonrigid and Articulated Motion Workshop, pp.90-102, 1997.

D. M. Gavrila, The visual analysis of human movement: A survey, Computer Vision and Image Understanding, vol.73, issue.1, pp.82-98, 1999.

J. J. Wang and S. Singh, Video analysis of human dynamics: a survey, Real-Time Imaging, vol.9, issue.5, pp.321-346, 2003.

H. Buxton, Learning and understanding dynamic scene activity: a review, Image and Vision Computing, vol.21, issue.1, pp.125-136, 2003.
DOI : 10.1016/S0262-8856(02)00127-0

J. K. Aggarwal and S. Park, Human motion: Modeling and recognition of actions and interactions, Proceedings of the 2nd International Symposium on 3D Data Processing, Visualization and Transmission, IEEE, pp.640-647, 2004.

P. Turaga, R. Chellappa, V. S. Subrahmanian, and O. Udrea, Machine recognition of human activities: A survey, IEEE Transactions on Circuits and Systems for Video Technology, vol.18, issue.11, pp.1473-1488, 2008.

J. K. Aggarwal and M. S. Ryoo, Human activity analysis, ACM Computing Surveys, vol.43, issue.3, p.16, 2011.
DOI : 10.1145/1922649.1922653

T. Guha and R. K. Ward, Learning sparse representations for human action recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.8, pp.1576-1588, 2012.

K. Schindler and L. Van-gool, Action snippets: How many frames does human action recognition require?, 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2008.
DOI : 10.1109/CVPR.2008.4587730

URL : http://mplab.ucsd.edu/wp-content/uploads/CVPR2008/Conference/data/papers/390.pdf

A. Noguchi and K. Yanai, A SURF-based spatio-temporal feature for feature-fusion-based action recognition, Trends and Topics in Computer Vision, pp.153-167, 2012.

I. Laptev and P. Pérez, Retrieving actions in movies, 2007 IEEE 11th International Conference on Computer Vision, pp.1-8, 2007.
DOI : 10.1109/ICCV.2007.4409105

D. Dementhon and D. Doermann, Video retrieval using spatio-temporal descriptors, Proceedings of the eleventh ACM international conference on Multimedia, MULTIMEDIA '03, pp.508-517, 2003.
DOI : 10.1145/957013.957124

K. E. A. van de Sande, J. R. R. Uijlings, T. Gevers, and A. W. M. Smeulders, Segmentation as selective search for object recognition, 2011 IEEE International Conference on Computer Vision (ICCV), pp.1879-1886, 2011.

J. Willamowski, D. Arregui, G. Csurka, C. R. Dance, and L. Fan, Categorizing nine visual classes using local appearance descriptors, p.21, 2004.

P. Viola and M. Jones, Rapid object detection using a boosted cascade of simple features, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001, p.511, 2001.
DOI : 10.1109/CVPR.2001.990517

O. Maron and T. Lozano-pérez, A framework for multiple-instance learning, Advances in Neural Information Processing Systems, pp.570-576, 1998.

T. Kim, S. Wong, and R. Cipolla, Tensor Canonical Correlation Analysis for Action Classification, 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2007.
DOI : 10.1109/CVPR.2007.383137

Z. Lin, Z. Jiang, and L. S. Davis, Recognizing actions by shape-motion prototype trees, IEEE 12th International Conference on Computer Vision, pp.444-451, 2009.

K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas et al., A Comparison of Affine Region Detectors, International Journal of Computer Vision, vol.65, issue.1-2, pp.43-72, 2005.
DOI : 10.1007/s11263-005-3848-x

URL : https://hal.archives-ouvertes.fr/inria-00548528

H. Wang, M. M. Ullah, A. Klaser, I. Laptev, and C. Schmid, Evaluation of local spatio-temporal features for action recognition, Proceedings of the British Machine Vision Conference 2009, 2009.
DOI : 10.5244/C.23.124

URL : https://hal.archives-ouvertes.fr/inria-00439769

H. Bay, T. Tuytelaars, and L. Van-gool, SURF: Speeded up robust features, Computer Vision - ECCV 2006, pp.404-417, 2006.
DOI : 10.1007/11744023_32

M. Blank, L. Gorelick, E. Shechtman, M. Irani, and R. Basri, Actions as space-time shapes, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1, pp.1395-1402, 2005.
DOI : 10.1109/ICCV.2005.28

URL : http://www.wisdom.weizmann.ac.il/~yelenag/spaceTimeActionsTPAMI2007.pdf

A. F. Bobick and J. W. Davis, The recognition of human movement using temporal templates, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.23, issue.3, pp.257-267, 2001.

D. M. Gavrila and L. S. Davis, Towards 3-D model-based tracking and recognition of human movement: a multi-view approach, International Workshop on Automatic Face- and Gesture-Recognition, pp.272-277, 1995.

H. Fujiyoshi, A. J. Lipton, and T. Kanade, Real-time human motion analysis by image skeletonization, Proceedings Fourth IEEE Workshop on Applications of Computer Vision. WACV'98 (Cat. No.98EX201), pp.113-120, 1998.
DOI : 10.1109/ACV.1998.732852

M. Andriluka, S. Roth, and B. Schiele, Pictorial structures revisited: People detection and articulated pose estimation, 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp.1014-1021, 2009.
DOI : 10.1109/CVPR.2009.5206754

V. Parameswaran and R. Chellappa, View Invariance for Human Action Recognition, International Journal of Computer Vision, vol.66, issue.1, pp.83-101, 2006.

F. Lv and R. Nevatia, Single View Human Action Recognition using Key Pose Matching and Viterbi Path Searching, 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2007.
DOI : 10.1109/CVPR.2007.383131

URL : http://iris.usc.edu/Outlines/papers/2007/lv-nev-cvpr07.pdf

H. Wang and C. Schmid, LEAR-INRIA submission for the THUMOS workshop, ICCV Workshop on Action Recognition with a Large Number of Classes, 2013.

H. Wang, A. Kläser, C. Schmid, and C. Liu, Dense Trajectories and Motion Boundary Descriptors for Action Recognition, International Journal of Computer Vision, vol.103, issue.1, pp.60-79, 2013.
DOI : 10.1007/s11263-012-0594-8

URL : https://hal.archives-ouvertes.fr/hal-00803241

C. Schuldt, I. Laptev, and B. Caputo, Recognizing human actions: a local SVM approach, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004., pp.32-36, 2004.
DOI : 10.1109/ICPR.2004.1334462

J. C. Niebles, H. Wang, and L. Fei-fei, Unsupervised learning of human action categories using spatial-temporal words, International Journal of Computer Vision, vol.79, issue.3, pp.299-318, 2008.

P. Dollár, V. Rabaud, G. Cottrell, and S. Belongie, Behavior recognition via sparse spatio-temporal features, Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pp.65-72, 2005.

P. Scovanner, S. Ali, and M. Shah, A 3-dimensional SIFT descriptor and its application to action recognition, Proceedings of the 15th international conference on Multimedia, MULTIMEDIA '07, pp.357-360, 2007.
DOI : 10.1145/1291233.1291311

H. Jhuang, T. Serre, L. Wolf, and T. Poggio, A Biologically Inspired System for Action Recognition, 2007 IEEE 11th International Conference on Computer Vision, 2007.
DOI : 10.1109/ICCV.2007.4408988

I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld, Learning realistic human actions from movies, 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2008.
DOI : 10.1109/CVPR.2008.4587756

URL : https://hal.archives-ouvertes.fr/inria-00548659

G. Willems, T. Tuytelaars, and L. Van-gool, An efficient dense and scale-invariant spatio-temporal interest point detector, Computer Vision - ECCV 2008, pp.650-663, 2008.
DOI : 10.1007/978-3-540-88688-4_48

A. Klaser, M. Marszalek, and C. Schmid, A Spatio-Temporal Descriptor Based on 3D-Gradients, Proceedings of the British Machine Vision Conference 2008, 2008.
DOI : 10.5244/C.22.99

URL : https://hal.archives-ouvertes.fr/inria-00514853

L. Yeffet and L. Wolf, Local Trinary Patterns for human action recognition, 2009 IEEE 12th International Conference on Computer Vision, pp.492-497, 2009.
DOI : 10.1109/ICCV.2009.5459201

S. Megrhi, W. Souidene, and A. Beghdadi, Spatio-temporal salient feature extraction for perceptual content based video retrieval, 2013 Colour and Visual Computing Symposium (CVCS), pp.1-7, 2013.
DOI : 10.1109/CVCS.2013.6626272

I. Laptev, On Space-Time Interest Points, International Journal of Computer Vision, vol.64, issue.2-3, pp.107-123, 2005.
DOI : 10.1007/s11263-005-1838-7

N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp.886-893, 2005.
DOI : 10.1109/CVPR.2005.177

URL : https://hal.archives-ouvertes.fr/inria-00548512

I. Laptev and T. Lindeberg, Local descriptors for spatio-temporal recognition, Spatial Coherence for Visual Motion Analysis, pp.91-103, 2006.

D. G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, vol.60, issue.2, pp.91-110, 2004.

R. Jain and H. Nagel, On the Analysis of Accumulative Difference Pictures from Image Sequences of Real World Scenes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.1, issue.2, pp.206-214, 1979.
DOI : 10.1109/TPAMI.1979.4766907

A. Neri, S. Colonnese, G. Russo, and P. Talone, Automatic moving object and background separation, Signal Processing, vol.66, issue.2, pp.219-232, 1998.
DOI : 10.1016/S0165-1684(98)00007-3

C. R. Wren, A. Azarbayejani, T. Darrell, and A. P. Pentland, Pfinder: real-time tracking of the human body, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.19, issue.7, pp.780-785, 1997.
DOI : 10.1109/34.598236

C. Stauffer and W. E. L. Grimson, Learning patterns of activity using real-time tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.22, issue.8, pp.747-757, 2000.
DOI : 10.1109/34.868677

J. Rittscher, J. Kato, S. Joga, and A. Blake, A Probabilistic Background Model for Tracking, Computer Vision - ECCV 2000, pp.336-350, 2000.
DOI : 10.1007/3-540-45053-X_22

A. Monnet, A. Mittal, N. Paragios, and V. Ramesh, Background modeling and subtraction of dynamic scenes, Proceedings Ninth IEEE International Conference on Computer Vision, pp.1305-1312, 2003.
DOI : 10.1109/ICCV.2003.1238641

J. Zhong and S. Sclaroff, Segmenting foreground objects from a dynamic textured background via a robust Kalman filter, Proceedings Ninth IEEE International Conference on Computer Vision, pp.44-50, 2003.
DOI : 10.1109/ICCV.2003.1238312

J. C. Niebles, C. Chen, and L. Fei-fei, Modeling temporal structure of decomposable motion segments for activity classification, Computer Vision - ECCV 2010, pp.392-405, 2010.

P. Matikainen, M. Hebert, and R. Sukthankar, Trajectons: Action recognition through the motion analysis of tracked features, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, pp.514-521, 2009.
DOI : 10.1109/ICCVW.2009.5457659

R. Messing, C. Pal, and H. Kautz, Activity recognition using the velocity histories of tracked keypoints, 2009 IEEE 12th International Conference on Computer Vision, pp.104-111, 2009.
DOI : 10.1109/ICCV.2009.5459154

C. Fanti, L. Zelnik-manor, and P. Perona, Hybrid Models for Human Motion Recognition, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp.1166-1173, 2005.
DOI : 10.1109/CVPR.2005.179

O. Oreifej and Z. Liu, HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences, 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp.716-723, 2013.
DOI : 10.1109/CVPR.2013.98

N. Ikizler, R. Gokberk-cinbis, and P. Duygulu, Human action recognition with line and flow histograms, 2008 19th International Conference on Pattern Recognition, pp.1-4, 2008.
DOI : 10.1109/ICPR.2008.4761434

Y. Song, L. Goncalves, and P. Perona, Unsupervised learning of human motion models, Advances in Neural Information Processing Systems, 2003.

C. Rao, A. Yilmaz, and M. Shah, View-invariant representation and recognition of actions, International Journal of Computer Vision, vol.50, issue.2, pp.203-226, 2002.
DOI : 10.1023/A:1020350100748

N. P. Cuntoor and R. Chellappa, Epitomic representation of human activities, Computer Vision and Pattern Recognition, pp.1-8, 2007.

J. Sun, X. Wu, S. Yan, L. Cheong, T. Chua et al., Hierarchical spatio-temporal context modeling for action recognition, Computer Vision and Pattern Recognition, pp.2004-2011, 2009.

B. D. Lucas and T. Kanade, An iterative image registration technique with an application to stereo vision, IJCAI, pp.674-679, 1981.

H. Uemura, S. Ishikawa, and K. Mikolajczyk, Feature Tracking and Motion Compensation for Action Recognition, Proceedings of the British Machine Vision Conference 2008, pp.1-10, 2008.
DOI : 10.5244/C.22.30

J. Sun, Y. Mu, S. Yan, and L. Cheong, Activity recognition using dense long-duration trajectories, 2010 IEEE International Conference on Multimedia and Expo, pp.322-327, 2010.
DOI : 10.1109/ICME.2010.5583046

URL : http://www.ami-lab.org/uploads/Publications/Conference/WP4/Activity+recognition+using+dense+long-duration+trajectories.pdf

H. Wang, A. Klaser, C. Schmid, and C. Liu, Action recognition by dense trajectories, CVPR 2011, pp.3169-3176, 2011.
DOI : 10.1109/CVPR.2011.5995407

URL : https://hal.archives-ouvertes.fr/inria-00583818

N. Johnson and D. Hogg, Learning the distribution of object trajectories for event recognition, Image and Vision Computing, vol.14, issue.8, pp.609-615, 1996.
DOI : 10.1016/0262-8856(96)01101-8

W. Lu, Y. Wang, and C. Chen, Learning Dense Optical-Flow Trajectory Patterns for Video Object Extraction, 2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance, pp.315-322, 2010.
DOI : 10.1109/AVSS.2010.79

A. Kläser, M. Marszałek, C. Schmid, and A. Zisserman, Human Focused Action Localization in Video, Trends and Topics in Computer Vision, pp.219-233, 2012.
DOI : 10.1007/978-3-642-35749-7_17

N. Anjum and A. Cavallaro, Multifeature object trajectory clustering for video analysis, IEEE Transactions on Circuits and Systems for Video Technology, vol.18, issue.11, pp.1555-1564, 2008.

A. Hervieu, P. Bouthemy, and J.-P. Le Cadre, A statistical video content recognition method using invariant features on object trajectories, IEEE Transactions on Circuits and Systems for Video Technology, vol.18, issue.11, pp.1533-1543, 2008.

M. Sapienza, F. Cuzzolin, and P. Torr, Learning discriminative space-time actions from weakly labelled videos, Proceedings of the British Machine Vision Conference 2012, 2012.
DOI : 10.5244/c.26.123

URL : http://www.bmva.org/bmvc/2012/BMVC/paper123/paper123.pdf

D. J. Fleet and A. D. Jepson, Stability of phase information, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.15, issue.12, pp.1253-1268, 1993.

P. Viola, M. J. Jones, and D. Snow, Detecting pedestrians using patterns of motion and appearance, Proceedings Ninth IEEE International Conference on Computer Vision, pp.734-741, 2003.

O. Kliper-gross, Y. Gurovich, T. Hassner, and L. Wolf, Motion Interchange Patterns for Action Recognition in Unconstrained Videos, Computer Vision - ECCV 2012, pp.256-269, 2012.
DOI : 10.1007/978-3-642-33783-3_19

G. Piriou, P. Bouthemy, and J. Yao, Recognition of Dynamic Video Contents With Global Probabilistic Models of Visual Motion, IEEE Transactions on Image Processing, vol.15, issue.11, pp.3417-3430, 2006.
DOI : 10.1109/TIP.2006.881963

URL : https://hal.archives-ouvertes.fr/hal-00453197

S. Wu, O. Oreifej, and M. Shah, Action recognition in videos acquired by a moving camera using motion decomposition of Lagrangian particle trajectories, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126397

M. Jain, H. Jégou, and P. Bouthemy, Better Exploiting Motion for Better Action Recognition, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.330

URL : https://hal.archives-ouvertes.fr/hal-00813014

N. Dalal, B. Triggs, and C. Schmid, Human Detection Using Oriented Histograms of Flow and Appearance, Computer Vision - ECCV 2006, pp.428-441, 2006.
DOI : 10.1007/11744047_33

URL : https://hal.archives-ouvertes.fr/inria-00548587

D. Cunado, M. S. Nixon, and J. N. Carter, Automatic extraction and description of human gait models for recognition purposes, Computer Vision and Image Understanding, vol.90, issue.1, pp.1-41, 2003.
DOI : 10.1016/S1077-3142(03)00008-0

H. Seo and P. Milanfar, Action recognition from one example, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.5, pp.867-882, 2011.

J. Liu, S. Ali, and M. Shah, Recognizing human actions using multiple features, Computer Vision and Pattern Recognition, pp.1-8, 2008.

M. Baccouche, F. Mamalet, C. Wolf, C. Garcia, and A. Baskurt, Sequential Deep Learning for Human Action Recognition, Human Behavior Understanding, pp.29-39, 2011.
DOI : 10.1007/978-3-642-25446-8_4

URL : https://hal.archives-ouvertes.fr/hal-01354493

O. Perez-concha, R. Y. D. Xu, and M. Piccardi, Compressive sensing of time series for human action recognition, International Conference on Digital Image Computing: Techniques and Applications (DICTA), pp.454-461, 2010.

J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, Discriminative learned dictionaries for local image analysis, 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2008.
DOI : 10.1109/CVPR.2008.4587652

X. Mei and H. Ling, Robust visual tracking using ℓ1 minimization, International Conference on Computer Vision, pp.1436-1443, 2009.

R. Fernández and E. Viennet, Face identification using support vector machines, 1999.

M. Hoai, Z. Lan, and F. De-la-torre, Joint segmentation and classification of human actions in video, CVPR 2011, pp.3265-3272, 2011.
DOI : 10.1109/CVPR.2011.5995470

T. Zhang, J. Liu, S. Liu, Y. Ouyang, and H. Lu, Boosted exemplar learning for human action recognition, Computer Vision Workshops IEEE 12th International Conference on, pp.538-545, 2009.

A. Fathi and G. Mori, Action recognition by learning mid-level motion features, 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2008.
DOI : 10.1109/CVPR.2008.4587735

M. Ahmad, I. Parvin, and S. Lee, Silhouette History and Energy Image Information for Human Movement Recognition, Journal of Multimedia, vol.5, issue.1, pp.12-21, 2010.
DOI : 10.4304/jmm.5.1.12-21

F. Moosmann, B. Triggs, and F. Jurie, Fast discriminative visual codebooks using randomized clustering forests, NIPS, p.4, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00203734

T. Yu, T. Kim, and R. Cipolla, Real-time Action Recognition by Spatiotemporal Semantic and Structural Forests, Proceedings of the British Machine Vision Conference 2010, p.56, 2010.
DOI : 10.5244/C.24.52

K. N. Tran, I. A. Kakadiaris, and S. K. Shah, Modeling motion of body parts for action recognition, BMVC, pp.1-12, 2011.

A. Gaidon, Z. Harchaoui, and C. Schmid, Actom sequence models for efficient action detection, CVPR 2011, pp.3201-3208, 2011.
DOI : 10.1109/CVPR.2011.5995646

URL : https://hal.archives-ouvertes.fr/inria-00575217

G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, Visual categorization with bags of keypoints, Workshop on statistical learning in computer vision, ECCV, p.22, 2004.

A. P. B. Lopes, E. Alves-do-valle-jr, J. Marques-de-almeida, and A. Albuquerque-de-araújo, Action recognition in videos: from motion capture labs to the web, 2010.

J. Liu, J. Luo, and M. Shah, Recognizing realistic actions from videos in the wild, Computer Vision and Pattern Recognition, pp.1996-2003, 2009.

S. Lazebnik, C. Schmid, and J. Ponce, Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), pp.2169-2178, 2006.
DOI : 10.1109/CVPR.2006.68

URL : https://hal.archives-ouvertes.fr/inria-00548585

K. Soomro, A. R. Zamir, and M. Shah, UCF101: A dataset of 101 human actions classes from videos in the wild, arXiv preprint arXiv:1212.0402, 2012.

M. Mojarrad, M. A. Dezfouli, and A. Rahmani, Feature extraction of human body composition in images by segmentation method, World Academy of Science, Engineering and Technology, 2008.

D. Sun, S. Roth, and M. J. Black, Secrets of optical flow estimation and their principles, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.2432-2439, 2010.
DOI : 10.1109/CVPR.2010.5539939

J. J. Koenderink, The structure of images, Biological Cybernetics, vol.50, issue.5, pp.363-370, 1984.

P. R. Beaudet, Rotationally invariant image operators, Proceedings of the International Joint Conference on Pattern Recognition, pp.579-583, 1978.

R. Tapu and T. Zaharia, High Level Video Temporal Segmentation, Advances in Visual Computing, pp.224-235, 2011.

URL : https://hal.archives-ouvertes.fr/hal-00625885

D. Dementhon and D. Doermann, Video retrieval of near-duplicates using k-nearest neighbor retrieval of spatio-temporal descriptors, Multimedia Tools and Applications, pp.229-253, 2006.

R. Poppe, A survey on vision-based human action recognition, Image and Vision Computing, vol.28, issue.6, pp.976-990, 2010.
DOI : 10.1016/j.imavis.2009.11.014

T. Yubing, F. A. Cheikh, F. F. E. Guraya, H. Konik, and A. Trémeau, A Spatiotemporal Saliency Model for Video Surveillance, Cognitive Computation, vol.3, issue.1, pp.241-263, 2011.

Z. Jiang, Z. Lin, and L. S. Davis, Recognizing human actions by learning and matching shape-motion prototype trees, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.3, pp.533-547, 2012.

E. Vig, M. Dorr, and D. Cox, Space-Variant Descriptor Sampling for Action Recognition Based on Saliency and Eye Movements, Computer Vision - ECCV 2012, pp.84-97, 2012.
DOI : 10.1007/978-3-642-33786-4_7

M. D. Rodriguez, J. Ahmed, and M. Shah, Action MACH a spatio-temporal Maximum Average Correlation Height filter for action recognition, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587727

M. Marszalek, I. Laptev, and C. Schmid, Actions in context, Computer Vision and Pattern Recognition, pp.2929-2936, 2009.

W. Li, Z. Zhang, and Z. Liu, Expandable data-driven graphical modeling of human actions based on salient postures, IEEE Transactions on Circuits and Systems for Video Technology, vol.18, issue.11, pp.1499-1510, 2008.

L. Gorelick, M. Blank, E. Shechtman, M. Irani, and R. Basri, Actions as space-time shapes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.29, issue.12, pp.2247-2253, 2007.

F. Martínez, A. Manzanera, and E. Romero, A motion descriptor based on statistics of optical flow orientations for action classification in video-surveillance, Multimedia and Signal Processing, pp.267-274, 2012.

Y. Ke, R. Sukthankar, and M. Hebert, Efficient visual event detection using volumetric features, Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on. IEEE, pp.166-173, 2005.

A. A. Efros, A. C. Berg, G. Mori, and J. Malik, Recognizing action at a distance, Proceedings Ninth IEEE International Conference on Computer Vision, pp.726-733, 2003.

S. Wu, Y. Li, and J. Zhang, A hierarchical motion trajectory signature descriptor, Robotics and Automation ICRA 2008. IEEE International Conference on, pp.3070-3075, 2008.

J. Yang, Y. F. Li, and K. Wang, A new descriptor for 3D trajectory recognition via modified CDTW, 2010 IEEE International Conference on Automation and Logistics (ICAL), pp.37-42, 2010.

J. Sivic and A. Zisserman, Video Google: a text retrieval approach to object matching in videos, Proceedings Ninth IEEE International Conference on Computer Vision, pp.1470-1477, 2003.
DOI : 10.1109/ICCV.2003.1238663

J. Zhang, M. Marszałek, S. Lazebnik, and C. Schmid, Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study, International Journal of Computer Vision, vol.73, issue.2, pp.213-238, 2007.
DOI : 10.1007/s11263-006-9794-4

URL : https://hal.archives-ouvertes.fr/inria-00548574

J. R. R. Uijlings, A. W. M. Smeulders, and R. J. H. Scha, Real-time visual concept classification, IEEE Transactions on Multimedia, vol.12, issue.7, pp.665-681, 2010.

M. Vrigkas, V. Karavasilis, C. Nikou, and I. A. Kakadiaris, Matching mixtures of curves for human action recognition, Computer Vision and Image Understanding, vol.119, pp.27-40, 2014.
DOI : 10.1016/j.cviu.2013.11.007

A. Kovashka and K. Grauman, Learning a hierarchy of discriminative space-time neighborhood features for human action recognition, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.2046-2053, 2010.
DOI : 10.1109/CVPR.2010.5539881

Q. V. Le, W. Y. Zou, S. Y. Yeung, and A. Y. Ng, Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis, Computer Vision and Pattern Recognition (CVPR), pp.3361-3368, 2011.

N. Ikizler-cinbis and S. Sclaroff, Object, Scene and Actions: Combining Multiple Features for Human Action Recognition, Computer Vision - ECCV 2010, pp.494-507, 2010.
DOI : 10.1007/978-3-642-15549-9_36

T. Brox and J. Malik, Object Segmentation by Long Term Analysis of Point Trajectories, Computer Vision - ECCV 2010, pp.282-295, 2010.
DOI : 10.1007/978-3-642-15555-0_21

A. Gaidon, Z. Harchaoui, and C. Schmid, Recognizing activities with cluster-trees of tracklets, Proceedings of the British Machine Vision Conference 2012, 2012.
DOI : 10.5244/C.26.30

URL : https://hal.archives-ouvertes.fr/hal-00722955

L. Shao, L. Ji, Y. Liu, and J. Zhang, Human action segmentation and recognition via motion and shape analysis, Pattern Recognition Letters, vol.33, issue.4, pp.438-445, 2012.
DOI : 10.1016/j.patrec.2011.05.015

URL : http://hdl.handle.net/10397/9173

L. Zappella, X. Lladó, and J. Salvi, Motion segmentation: A review, Proceedings of the 2008 conference on Artificial Intelligence Research and Development: Proceedings of the 11th International Conference of the Catalan Association for Artificial Intelligence, pp.398-407, 2008.

L. Zappella, X. Lladó, and J. Salvi, New Trends in Motion Segmentation, Pattern Recognition, pp.31-46, 2009.
DOI : 10.5772/7551

S. Ali and M. Shah, A Lagrangian Particle Dynamics Approach for Crowd Flow Segmentation and Stability Analysis, 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-6, 2007.
DOI : 10.1109/CVPR.2007.382977

D. Cremers and S. Soatto, Motion Competition: A Variational Approach to Piecewise Parametric Motion Segmentation, International Journal of Computer Vision, vol.62, issue.3, pp.249-265, 2005.

T. Brox, M. Rousson, R. Deriche, and J. Weickert, Colour, texture, and motion in level set based segmentation and tracking, Image and Vision Computing, vol.28, issue.3, pp.376-390, 2010.
DOI : 10.1016/j.imavis.2009.06.009

URL : https://hal.archives-ouvertes.fr/hal-00531465

B. K. P. Horn and B. G. Schunck, Determining optical flow, Technical Symposium East. International Society for Optics and Photonics, pp.319-331, 1981.

J. Bouguet, Pyramidal implementation of the affine lucas kanade feature tracker description of the algorithm, Intel Corporation, vol.5, 2001.

F. Jurie and B. Triggs, Creating efficient codebooks for visual recognition, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1, pp.604-610, 2005.
DOI : 10.1109/ICCV.2005.66

URL : https://hal.archives-ouvertes.fr/inria-00548511

L. Fei-fei and P. Perona, A Bayesian Hierarchical Model for Learning Natural Scene Categories, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp.524-531, 2005.
DOI : 10.1109/CVPR.2005.16

E. Nowak, F. Jurie, and B. Triggs, Sampling Strategies for Bag-of-Features Image Classification, Computer Vision - ECCV 2006, pp.490-503, 2006.
DOI : 10.1007/11744085_38

URL : https://hal.archives-ouvertes.fr/hal-00203752

J. Odobez and P. Bouthemy, Robust Multiresolution Estimation of Parametric Motion Models, Journal of Visual Communication and Image Representation, vol.6, issue.4, pp.348-365, 1995.
DOI : 10.1006/jvci.1995.1029

D. H. Nga and K. Yanai, A spatio-temporal feature based on triangulation of dense SURF, 2013 IEEE International Conference on Computer Vision Workshops (ICCVW), pp.420-427, 2013.

J. Canny, A computational approach to edge detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.8, issue.6, pp.679-698, 1986.

R. Brunelli, Template matching techniques in computer vision, 2008.
DOI : 10.1002/9780470744055

O. Murthy and R. Goecke, Ordered Trajectories for Large Scale Human Action Recognition, 2013 IEEE International Conference on Computer Vision Workshops, 2013.
DOI : 10.1109/ICCVW.2013.61

F. Shi, R. Laganiere, E. Petriu, and H. Zhen, LPM for fast action recognition with large number of classes, THUMOS: ICCV Workshop on Action Recognition with a Large Number of Classes. Notebook paper, 2013.

S. Karaman, L. Seidenari, A. D. Bagdanov, and A. Del Bimbo, L1-regularized logistic regression stacking and transductive CRF smoothing for action recognition in video, ICCV Workshop on Action Recognition with a Large Number of Classes, 2013.

A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar et al., Large-Scale Video Classification with Convolutional Neural Networks, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.223

K. Simonyan and A. Zisserman, Two-stream convolutional networks for action recognition in videos, 2014.