Joint Audio-Visual Words for Violent Scenes Detection in Movies, ACM International Conference on Multimedia Retrieval (ICMR) ,
Production d'annotations par plan pour l'indexation des vidéos, Rencontres Jeunes Chercheurs (RJC) ,
LIG at MediaEval 2013 Affect Task : Use of a Generic Method and Joint Audio-Visual Words, Workshop, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00953091
LIG at MediaEval 2012 affect task : use of a generic method, Workshop, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00770536
Good Practice in Large-Scale Learning for Image Classification, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.36, issue.3, p.35, 2013. ,
DOI : 10.1109/TPAMI.2013.146
URL : https://hal.archives-ouvertes.fr/hal-00690014
FREAK: Fast Retina Keypoint, 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp.510-517, 2012. ,
DOI : 10.1109/CVPR.2012.6247715
URL : https://infoscience.epfl.ch/record/175537/files/2069.pdf
POP: Patchwork of Parts Models for Object Recognition, International Journal of Computer Vision, vol.7, issue.2, pp.267-282, 2007. ,
DOI : 10.1007/978-1-4757-2440-0
Multimodal fusion for multimedia analysis: a survey, Multimedia systems, pp.345-379, 2010. ,
DOI : 10.1115/1.3662552
Indexation de documents multimédia par réseaux d'opérateurs, pp.385-400, 2007. ,
Classifier fusion for SVM-based multimedia semantic indexing, Advances in Information Retrieval, pp.494-504, 2007. ,
IRIM at TRECVID 2012 : Semantic Indexing and Instance Search, Proceedings of the workshop on TREC Video Retrieval Evaluation (TRECVID) ,
IRIM at TRECVID 2013 : Semantic Indexing and Instance Search, Proceedings of the workshop on TREC Video Retrieval Evaluation (TRECVID) ,
Space-time shapelets for action recognition, Motion and video Computing, pp.1-6, 2008. ,
DOI : 10.1109/wmvc.2008.4544051
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.142.1567
Speeded-up robust features (SURF) ,
A graphical model for audiovisual object tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.25, issue.7, pp.828-836, 2003. ,
DOI : 10.1109/tpami.2003.1206512
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.7327
Violence Detection in Video Using Computer Vision Techniques, Computer Analysis of Images and Patterns, pp.332-339, 2011. ,
DOI : 10.1109/AVSS.2007.4425310
Part-Based Statistical Models for Object Classification and Detection, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp.734-740, 2005. ,
DOI : 10.1109/CVPR.2005.270
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.297.3354
Recognition-by-components: A theory of human image understanding., Psychological Review, vol.94, issue.2, pp.115-135, 1987. ,
DOI : 10.1037/0033-295X.94.2.115
Actions as space-time shapes, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1, pp.1395-1402, 2005. ,
DOI : 10.1109/ICCV.2005.28
The recognition of human movement using temporal templates, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.23, issue.3, pp.257-267, 2001. ,
Comparison of video shot boundary detection techniques, Journal of Electronic Imaging, vol.5, issue.2, pp.122-128, 1996. ,
DOI : 10.1117/12.238675
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.8.2179
Representing shape with a spatial pyramid kernel, Proceedings of the 6th ACM international conference on Image and video retrieval, CIVR '07, pp.401-408, 2007. ,
DOI : 10.1145/1282280.1282340
Scene classification using a hybrid generative/discriminative approach, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.30, issue.4, pp.712-727, 2008. ,
DOI : 10.1109/tpami.2007.70716
The Tradeoffs of Large Scale Learning, p.36, 2007. ,
LIBSVM : a library for support vector machines, p.31, 2001. ,
LIBSVM : a library for support vector machines, ACM Transactions on Intelligent Systems and Technology (TIST), vol.2, issue.3, pp.27-36, 2011. ,
SMOTE : synthetic minority over-sampling technique, arXiv preprint arXiv:1106.1813 ,
Learning a similarity metric discriminatively, with application to face verification, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), pp.36-539, 2005. ,
Stratification approach to modeling video, Multimedia Tools and Applications, pp.79-97, 2002. ,
An exemplar model for learning object classes, IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07), pp.1-8, 2007. ,
Nearest neighbor pattern classification, IEEE Transactions on Information Theory, vol.13, issue.1, pp.21-27, 1967. ,
DOI : 10.1109/tit.1967.1053964
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.68.2616
On the algorithmic implementation of multiclass kernel-based vector machines, The Journal of Machine Learning Research, vol.2, pp.265-292, 2002. ,
Weakly Supervised Learning of Part-Based Spatial Models for Visual Object Recognition, Computer Vision - ECCV 2006, pp.16-29, 2006. ,
DOI : 10.1007/11744023_2
Audio-Visual Event Recognition in Surveillance Video Sequences, Multimedia, pp.257-267, 2007. ,
DOI : 10.1109/TMM.2006.886263
Visual categorization with bags of keypoints, Workshop on statistical learning in computer vision, ECCV, pp.1-2, 2004. ,
Human detection using oriented histograms of flow and appearance, Computer Vision - ECCV 2006, pp.428-441, 2006. ,
URL : https://hal.archives-ouvertes.fr/inria-00548587
Action recognition for surveillance applications using optic flow and SVM, Computer Vision - ACCV 2007, pp.457-466, 2007. ,
Person-on-person violence detection in video data, Object recognition supported by user interaction for service robots, pp.433-438, 2002. ,
DOI : 10.1109/ICPR.2002.1044748
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.5.8417
Cinematic primitives for multimedia, IEEE Computer Graphics and Applications, vol.11, issue.4, pp.67-74, 1991. ,
DOI : 10.1109/38.126883
Revisiting the VLAD image representation, Proceedings of the 21st ACM international conference on Multimedia, MM '13, pp.653-656, 2013. ,
DOI : 10.1145/2502081.2502171
URL : https://hal.archives-ouvertes.fr/hal-00840653
The MediaEval 2013 Affect Task : Violent Scenes Detection, p.54, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00932551
Benchmarking Violent Scenes Detection in movies, 2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI), p.59, 2014. ,
DOI : 10.1109/CBMI.2014.6849827
URL : https://hal.archives-ouvertes.fr/hal-00767036
LIG at MediaEval 2012 Affect Task : Use of a Generic Method, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00770536
LIG at MediaEval 2013 Affect Task : Use of a Generic Method and Joint Audio-Visual Words, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00953091
Solving the multiple instance problem with axis-parallel rectangles, Artificial Intelligence, vol.89, issue.1-2, pp.31-71, 1997. ,
DOI : 10.1016/S0004-3702(96)00034-3
URL : http://doi.org/10.1016/s0004-3702(96)00034-3
Behavior Recognition via Sparse Spatio-Temporal Features, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pp.65-72, 2005. ,
DOI : 10.1109/VSPETS.2005.1570899
C4.5, class imbalance, and cost sensitivity : why under-sampling beats over-sampling, Workshop on Learning from Imbalanced Datasets II, Citeseer, p.36, 2003. ,
A Local Temporal Context-Based Approach for TV News Story Segmentation, 2012 IEEE International Conference on Multimedia and Expo, pp.973-978, 2012. ,
DOI : 10.1109/ICME.2012.3
URL : https://hal.archives-ouvertes.fr/hal-00767396
Recognizing action at a distance, Proceedings Ninth IEEE International Conference on Computer Vision, pp.726-733, 2003. ,
DOI : 10.1109/ICCV.2003.1238420
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.331.921
The pascal visual object classes (voc) challenge, International journal of computer vision, vol.88, issue.2, pp.303-338, 2010. ,
Learning hierarchical features for scene labeling, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, pp.1915-1929, 2013. ,
Two-frame motion estimation based on polynomial expansion, Image Analysis, pp.363-370, 2003. ,
Recognizing and learning object categories, CVPR Short Course ,
Object detection with discriminatively trained part-based models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.9, pp.1627-1645, 2010. ,
Multi-modal information fusion for news story segmentation in broadcast video, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.1417-1420, 2012. ,
DOI : 10.1109/ICASSP.2012.6288156
Weakly Supervised Scale-Invariant Learning of Models for Visual Recognition, International Journal of Computer Vision, vol.20, issue.1, pp.273-303, 2007. ,
DOI : 10.1109/34.655647
Groups of adjacent contour segments for object detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.30, issue.1, pp.36-51, 2008. ,
URL : https://hal.archives-ouvertes.fr/hal-00203719
Learning joint statistical models for audio-visual fusion and segregation, pp.772-778, 2000. ,
Optimized cutting plane algorithm for support vector machines, Proceedings of the 25th international conference on Machine learning, pp.320-327, 2008. ,
Object localization/segmentation using generic shape priors, 18th International Conference on, pp.41-44, 2006. ,
Weakly Supervised Object Localization with Stable Segmentations, Computer Vision - ECCV, pp.193-207, 2008. ,
DOI : 10.1007/978-3-540-88682-2_16
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.210.4047
Kernel Codebooks for Scene Categorization, Computer Vision - ECCV, pp.696-709, 2008. ,
DOI : 10.1007/978-3-540-88690-7_52
Supervised Nonlinear Dimensionality Reduction for Visualization and Classification, Systems, Man, and Cybernetics, pp.1098-1107, 2005. ,
DOI : 10.1109/TSMCB.2005.850151
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.678.4764
Color-based object recognition, Pattern recognition, vol.32, issue.3, pp.453-464, 1999. ,
Violence Content Classification Using Audio Features, pp.502-507, 2006. ,
DOI : 10.1017/CBO9780511801389
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.74.3830
Audio-Visual Fusion for Detecting Violent Scenes in Videos, Artificial Intelligence : Theories, Models and Applications, pp.91-100, 2010. ,
DOI : 10.1007/978-3-642-12842-4_13
Detecting Violent Scenes in Movies by Auditory and Visual Cues, Advances in Multimedia Information Processing - PCM 2008, pp.317-326, 2008. ,
Actions as space-time shapes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.29, issue.12, pp.2247-2253, 2007. ,
DOI : 10.1109/tpami.2007.70711
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.100.8218
IRIM at TRECVID 2010 : High level feature extraction and instance search, p.107, 2010. ,
URL : https://hal.archives-ouvertes.fr/hal-00953839
A discriminative kernel-based approach to rank images from text queries, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.30, issue.8, pp.1371-1384, 2008. ,
DOI : 10.1109/tpami.2007.70791
Multi-layer multi-instance learning for video concept detection, IEEE Transactions on Multimedia, vol.10, issue.8, pp.1605-1616, 2008. ,
DOI : 10.1109/tmm.2008.2007290
Composite Concept Discovery for Zero-Shot Video Event Detection, p.30, 2014. ,
Stop-Frame Removal Improves Web Video Classification, Proceedings of International Conference on Multimedia Retrieval, pp.499-83, 2014. ,
Two-layers re-ranking approach based on contextual information for visual concepts detection in videos, 2012 10th International Workshop on Content-Based Multimedia Indexing (CBMI), pp.1-6, 2012. ,
DOI : 10.1109/CBMI.2012.6269837
URL : https://hal.archives-ouvertes.fr/hal-00767172
Quaero at TRECVID 2013 : Semantic Indexing and Instance Search, Proc. TRECVID Workshop, p.74, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00953086
On independent component analysis for multimedia signals, Multimedia Image and Video Processing, pp.175-199, 2000. ,
Distributional structure, p.23, 1954. ,
Combining efficient object localization and image classification, 2009 IEEE 12th International Conference on Computer Vision, pp.237-244, 2009. ,
DOI : 10.1109/ICCV.2009.5459257
URL : https://hal.archives-ouvertes.fr/inria-00439516
Image indexing using color correlograms, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.762-768, 1997. ,
DOI : 10.1109/CVPR.1997.609412
Improving Bag-of-Features for Large Scale Image Search, International Journal of Computer Vision, vol.42, issue.3, pp.316-336, 2010. ,
DOI : 10.1007/s11263-009-0285-2
Aggregating local descriptors into a compact image representation, 2010 IEEE Conference on, pp.3304-3311, 2010. ,
Negative evidences and co-occurences in image retrieval : The benefit of PCA and whitening, Computer Vision - ECCV 2012, pp.774-787, 2012. ,
Aggregating local image descriptors into compact codes, IEEE Transactions on Pattern Analysis and Machine Intelligence ,
Caffe : An Open Source Convolutional Architecture for Fast Feature Embedding ,
Domain adaptive semantic diffusion for large scale context-based video annotation, IEEE 12th International Conference on Computer Vision, pp.1420-1427, 2009. ,
Audio-visual grouplet, Proceedings of the 19th ACM international conference on Multimedia, MM '11, pp.123-132, 2011. ,
DOI : 10.1145/2072298.2072316
Towards Efficient Learning of Optimal Spatial Bag-of-Words Representations, Proceedings of International Conference on Multimedia Retrieval, ICMR '14, p.25, 2014. ,
DOI : 10.1145/2578726.2578739
Video modeling using strata-based annotation, IEEE Multimedia, vol.7, issue.1, pp.68-74, 2000. ,
DOI : 10.1109/93.839313
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.22.8878
Fast texture database retrieval using extended fractal features, Photonics West'98 Electronic Imaging International Society for Optics and Photonics, pp.162-173, 1997. ,
DOI : 10.1117/12.298440
Human activity recognition using a dynamic texture based method, pp.1-10, 2008. ,
Audiovisual diarization of people in video content, Multimedia Tools and Applications, vol.13, issue.4, pp.747-775, 2014. ,
DOI : 10.1007/978-3-540-68585-2_49
Independent component analysis for understanding multimedia content, Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, pp.757-766, 2002. ,
DOI : 10.1109/NNSP.2002.1030096
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.650.5965
ImageNet classification with deep convolutional neural networks, Communications of the ACM, vol.60, issue.6, pp.1097-1105, 2012. ,
DOI : 10.1162/neco.2009.10-08-881
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.299.205
Addressing the curse of imbalanced training sets : one-sided selection, pp.179-186, 1997. ,
HMDB: A large video database for human motion recognition, 2011 International Conference on Computer Vision, pp.2556-2563, 2011. ,
DOI : 10.1109/ICCV.2011.6126543
URL : http://cbcl.mit.edu/publications/ps/Kuehne_etal_iccv11.pdf
Combining textual and visual cues for content-based image retrieval on the world wide web, Proceedings of the IEEE Workshop on Content-Based Access of Image and Video Libraries, pp.24-28, 1998. ,
Beyond sliding windows: Object localization by efficient subwindow search, 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2008. ,
DOI : 10.1109/CVPR.2008.4587586
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.4517
Multimedia classification and event detection using double fusion, Multimedia Tools and Applications, pp.1-15, 2013. ,
DOI : 10.1109/TMM.2008.917359
On Space-Time Interest Points, International Journal of Computer Vision, vol.17, issue.8, pp.107-123, 2005. ,
DOI : 10.1007/BFb0017862
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.4.4359
A sparse texture representation using local affine regions, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.27, issue.8, pp.1265-1278, 2005. ,
DOI : 10.1109/tpami.2005.151
URL : https://hal.archives-ouvertes.fr/inria-00548530
Beyond bags of features : Spatial pyramid matching for recognizing natural scene categories, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.2169-2178, 2006. ,
URL : https://hal.archives-ouvertes.fr/inria-00548585
Convolutional networks and applications in vision, Proceedings of 2010 IEEE International Symposium on Circuits and Systems, pp.253-256, 2010. ,
DOI : 10.1109/ISCAS.2010.5537907
Gradient-based learning applied to document recognition, Proceedings of the IEEE, pp.2278-2324, 1998. ,
Multicategory support vector machines : Theory and application to the classification of microarray data and satellite radiance data ,
An Implicit Shape Model for Combined Object Categorization and Segmentation, Workshop on Statistical Learning in Computer Vision, ECCV, pp.7-66, 2004. ,
DOI : 10.1007/11957959_26
Rapid natural scene categorization in the near absence of attention, Proceedings of the National Academy of Sciences, pp.9596-9601, 2002. ,
DOI : 10.1126/science.287.5456.1273
Object Bank : A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification, pp.5-30, 2010. ,
DOI : 10.1007/s11263-013-0660-x
Weakly-Supervised Violence Detection in Movies with Audio and Video Based Co-training, Advances in Multimedia Information Processing - PCM 2009, pp.930-935, 2009. ,
DOI : 10.1007/978-3-642-10467-1_84
Large-scale image classification: Fast feature extraction and SVM training, CVPR 2011, pp.1689-1696, 2011. ,
DOI : 10.1109/CVPR.2011.5995477
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.225.3736
Integrated feature selection and higher-order spatial feature extraction for object categorization, 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2008. ,
DOI : 10.1109/CVPR.2008.4587403
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.319.2362
Exploratory undersampling for class-imbalance learning, Systems, Man, and Cybernetics, pp.539-550, 2009. ,
Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, vol.60, issue.2, pp.91-110, 2004. ,
DOI : 10.1023/B:VISI.0000029664.99615.94
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.4931
Object recognition from local scale-invariant features, Proceedings of the Seventh IEEE International Conference on Computer Vision, pp.1150-1157, 1999. ,
DOI : 10.1109/ICCV.1999.790410
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.121.4065
Multimodal feature integration for story boundary detection in broadcast news, 7th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp.420-425, 2010. ,
An iterative image registration technique with an application to stereo vision, pp.674-679, 1981. ,
Texture features for browsing and retrieval of image data, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.18, issue.8, pp.837-842, 1996. ,
Texture classification and segmentation using multiresolution simultaneous autoregressive models, Pattern Recognition, vol.25, issue.2, pp.173-188, 1992. ,
DOI : 10.1016/0031-3203(92)90099-5
Random Subwindows for Robust Image Classification, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp.34-40, 2005. ,
DOI : 10.1109/CVPR.2005.287
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.69.8683
Actions in context, 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp.2929-2936, 2009. ,
DOI : 10.1109/CVPR.2009.5206557
URL : https://hal.archives-ouvertes.fr/inria-00548645
Searching informative concept banks for video event detection, Proceedings of the 3rd ACM conference on International conference on multimedia retrieval, ICMR '13, pp.255-262, 2013. ,
DOI : 10.1145/2461466.2461507
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.310.8951
2013 internet trends, 2013. ,
Semantic model vectors for complex video event recognition, IEEE Transactions on Multimedia, vol.14, issue.1, pp.88-101, 2012. ,
Scale & affine invariant interest point detectors, International journal of computer vision, vol.60, issue.1, pp.63-86, 2004. ,
URL : https://hal.archives-ouvertes.fr/inria-00548554
Features for content-based audio retrieval, Advances in Computers, pp.71-150, 2010. ,
Multimodal video concept detection via bag of auditory words and multiple kernel learning, Advances in Multimedia Modeling ,
On the detection of semantic concepts at TRECVID, Proceedings of the 12th annual ACM international conference on Multimedia, MULTIMEDIA '04, pp.660-667, 2004. ,
DOI : 10.1145/1027527.1027680
Weakly supervised discriminative localization and classification: a joint learning process, 2009 IEEE 12th International Conference on Computer Vision, pp.1925-1932, 2009. ,
DOI : 10.1109/ICCV.2009.5459426
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.153.2127
Sampling strategies for bag-of-features image classification, Computer Vision - ECCV 2006, pp.490-503, 2006. ,
DOI : 10.1007/11744085_38
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.9956
Multiresolution grayscale and rotation invariant texture classification with local binary patterns, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.24, issue.7, pp.971-987, 2002. ,
A comparative study of texture measures with classification based on featured distributions, Pattern recognition, vol.29, issue.1, pp.51-59, 1996. ,
Modeling the shape of the scene : A holistic representation of the spatial envelope, International Journal of Computer Vision, vol.42, issue.3, pp.145-175, 2001. ,
DOI : 10.1023/A:1011139631724
Object Localization with Boosting and Weak Supervision for Generic Object Recognition, Image Analysis, pp.862-871, 2005. ,
DOI : 10.1007/11499145_87
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.211.7741
An overview of the goals, tasks, data, evaluation mechanisms and metrics, TRECVID 2013-TREC Video Retrieval Evaluation Online, pp.45-74, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-01230444
Scene recognition and weakly supervised object localization with deformable part-based models, 2011 International Conference on Computer Vision, pp.1307-1314, 2011. ,
DOI : 10.1109/ICCV.2011.6126383
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.300.7841
The role of features, algorithms and data in visual recognition, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.2328-2335, 2010. ,
DOI : 10.1109/CVPR.2010.5539920
Audio event detection in movies using multiple audio words and contextual Bayesian networks, 2013 11th International Workshop on Content-Based Multimedia Indexing (CBMI), pp.17-22, 2013. ,
DOI : 10.1109/CBMI.2013.6576546
URL : https://hal.archives-ouvertes.fr/hal-00822022
Fisher Kernels on Visual Vocabularies for Image Categorization, 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2007. ,
DOI : 10.1109/CVPR.2007.383266
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.7388
Large-scale image categorization with explicit data embedding, 2010 IEEE Conference on, pp.2297-2304, 2010. ,
Improving the fisher kernel for large-scale image classification, Computer Vision - ECCV 2010, pp.143-156, 2010. ,
Fast training of support vector machines using sequential minimal optimization, p.36, 1999. ,
Weakly supervised learning of interactions between humans and objects, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.3, pp.601-614, 2012. ,
DOI : 10.1109/tpami.2011.158
URL : https://hal.archives-ouvertes.fr/inria-00516477
Efficient mining of frequent and distinctive feature configurations, IEEE 11th International Conference on Computer Vision (ICCV 2007), pp.1-8, 2007. ,
Reclassement d'images par le contenu, CORIA 2012, p.96 ,
Training Deformable Models for Localization, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.206-213, 2006. ,
Saliency moments for image categorization, Proceedings of the 1st ACM International Conference on Multimedia Retrieval, ICMR '11, pp.39-107, 2011. ,
DOI : 10.1145/1991996.1992035
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.673.8189
Deriving a discriminative color model for a given object class from weakly labeled training data, Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, ICMR '12, pp.44-68, 2012. ,
DOI : 10.1145/2324796.2324848
Evaluating knowledge transfer and zero-shot learning in a large-scale setting, CVPR 2011, pp.1641-1648, 2011. ,
DOI : 10.1109/CVPR.2011.5995627
Nonlinear Dimensionality Reduction by Locally Linear Embedding, Science, vol.290, issue.5500, pp.2323-2326, 2000. ,
DOI : 10.1126/science.290.5500.2323
URL : http://astro.temple.edu/~msobel/courses_files/saulmds.pdf
Human face detection in visual scenes, p.66, 1995. ,
Exploring Video Structure Beyond The Shots, Proceedings of the IEEE International Conference on Multimedia Computing and Systems, pp.237-249, 1998. ,
Using Multiple Segmentations to Discover Objects and their Extent in Image Collections, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), pp.1605-1614, 2006. ,
DOI : 10.1109/CVPR.2006.326
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.184.2856
Action bank: A high-level representation of activity in video, 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp.1234-1241, 2012. ,
DOI : 10.1109/CVPR.2012.6247806
Evaluations of multi-learner approaches for concept indexing in video documents, Adaptivity, Personalization and Fusion of Heterogeneous Information, pp.88-91, 2010. ,
Quaero at TRECVID 2011 : Semantic Indexing and Multimedia Event Detection, p.32, 2011. ,
Re-ranking for multimedia indexing and retrieval, Advances in Information Retrieval, pp.708-711, 2011. ,
Active learning with multiple classifiers for multimedia indexing, Multimedia Tools and Applications, pp.403-417, 2012. ,
DOI : 10.1007/s11042-010-0599-7
URL : https://hal.archives-ouvertes.fr/hal-00953838
High-dimensional signature compression for large-scale image classification, CVPR 2011, pp.1665-1672, 2011. ,
DOI : 10.1109/CVPR.2011.5995504
Image Classification with the Fisher Vector: Theory and Practice, International Journal of Computer Vision, vol.73, issue.2, pp.222-245, 2013. ,
DOI : 10.1007/s11263-006-9794-4
Multimodal Speaker Identification Using Canonical Correlation Analysis, 2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, p.51, 2006. ,
DOI : 10.1109/ICASSP.2006.1660095
Audiovisual Synchronization and Fusion Using Canonical Correlation Analysis, IEEE Transactions on Multimedia, vol.9, issue.7, pp.1396-1403, 2007. ,
DOI : 10.1109/TMM.2007.906583
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.118.2660
Recognizing human actions: a local SVM approach, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004., pp.32-36, 2004. ,
DOI : 10.1109/ICPR.2004.1334462
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.173.6790
SVM optimization, Proceedings of the 25th international conference on Machine learning, ICML '08, pp.928-935, 2008. ,
DOI : 10.1145/1390156.1390273
Video Google: a text retrieval approach to object matching in videos, Proceedings Ninth IEEE International Conference on Computer Vision, pp.1470-1477, 2003. ,
DOI : 10.1109/ICCV.2003.1238663
Video shot boundary detection: Seven years of TRECVid activity, Computer Vision and Image Understanding, vol.114, issue.4, pp.411-418, 2010. ,
DOI : 10.1016/j.cviu.2009.03.011
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.148.2826
Multimedia semantic indexing using model vectors, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698), p.445, 2003. ,
DOI : 10.1109/ICME.2003.1221649
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.454.4023
The stratification system : a design environment for random access video, Network and Operating System Support for Digital Audio and Video, pp.250-261, 1993. ,
DOI : 10.1007/3-540-57183-3_22
Early versus late fusion in semantic video analysis, Proceedings of the 13th annual ACM international conference on Multimedia, MULTIMEDIA '05, pp.399-402, 2005. ,
DOI : 10.1145/1101149.1101236
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.78.5928
MediaMill at TRECVID 2013 : Searching concepts, objects, instances and events in video, p.79, 2013. ,
DOI : 10.1145/1873951.1874212
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.381.3359
Violence Detection in Video Using Spatio-Temporal Features, 2010 23rd SIBGRAPI Conference on Graphics, Patterns and Images, pp.224-230, 2010. ,
DOI : 10.1109/SIBGRAPI.2010.38
Retina enhanced SURF descriptors for spatio-temporal concept detection, Multimedia tools and applications, pp.443-469, 2014. ,
DOI : 10.1145/1390334.1390437
URL : https://hal.archives-ouvertes.fr/hal-00760192
Similarity of color images, IS&T/SPIE's Symposium on Electronic Imaging : Science & Technology International Society for Optics and Photonics, pp.381-392, 1995. ,
DOI : 10.1117/12.205308
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.2789
Color indexing, International journal of computer vision, vol.7, issue.1, pp.11-32, 1991. ,
Visual category recognition using Spectral Regression and Kernel Discriminant Analysis, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, pp.178-185, 2009. ,
DOI : 10.1109/ICCVW.2009.5457703
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.175.926
Pose primitive based human action recognition in videos or still images, 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2008. ,
DOI : 10.1109/CVPR.2008.4587721
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.324.989
An Empirical Study of MetaCost Using Boosting Algorithms, p.36, 2000. ,
DOI : 10.1007/3-540-45164-1_42
Extracting Subimages of an Unknown Category from a Set of Images, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 1 (CVPR'06), pp.927-934, 2006. ,
DOI : 10.1109/CVPR.2006.116
Efficient Object Category Recognition Using Classemes, Computer Vision - ECCV 2010, pp.776-789, 2010. ,
DOI : 10.1007/978-3-642-15549-9_56
Texture discrimination by Gabor functions, Biological Cybernetics, vol.55, issue.2-3, pp.71-82, 1986. ,
Identifying relevant frames in weakly labeled videos for training concept detectors, Proceedings of the 2008 international conference on Content-based image and video retrieval, CIVR '08, pp.9-16, 2008. ,
DOI : 10.1145/1386352.1386358
Boosting color saliency in image feature detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.28, issue.1, pp.150-156, 2006. ,
URL : https://hal.archives-ouvertes.fr/inria-00548615
Evaluating color descriptors for object and scene recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.9, pp.1582-1596, 2010. ,
Learning the semantics of multimedia content with application to web image retrieval and classification, p.32, 2003. ,
A new approach to image retrieval with hierarchical color clustering, IEEE Transactions on Circuits and Systems for Video Technology, vol.8, issue.5, pp.628-643, 1998. ,
Action recognition by dense trajectories, CVPR 2011, pp.3169-3176, 2011. ,
DOI : 10.1109/CVPR.2011.5995407
URL : https://hal.archives-ouvertes.fr/inria-00583818
Free viewpoint action recognition using motion history volumes, Computer Vision and Image Understanding, vol.104, issue.2-3, pp.249-257, 2006. ,
DOI : 10.1016/j.cviu.2006.07.013
URL : https://hal.archives-ouvertes.fr/inria-00544629
Distance metric learning for large margin nearest neighbor classification, The Journal of Machine Learning Research, vol.10, pp.207-244, 2009. ,
Composition and search with a video algebra, IEEE Multimedia, vol.2, issue.1, pp.12-25, 1995. ,
DOI : 10.1109/93.368596
Support vector machines for multi-class pattern recognition, pp.61-72, 1999. ,
Object categorization by learned universal visual dictionary, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1, pp.1800-1807, 2005. ,
DOI : 10.1109/ICCV.2005.171
LOCUS: learning object classes with unsupervised segmentation, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1, pp.756-763, 2005. ,
DOI : 10.1109/ICCV.2005.148
Efficient Highly Over-Complete Sparse Coding Using a Mixture Model, Computer Vision - ECCV 2010, pp.113-126, 2010. ,
DOI : 10.1007/978-3-642-15555-0_9
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.175.1478
Joint audio-visual bi-modal codewords for video event detection, Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, ICMR '12, 2012. ,
DOI : 10.1145/2324796.2324843
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.394.1190
A differential geometric approach to representing the human actions, Computer Vision and Image Understanding, vol.109, issue.3, pp.335-351, 2008. ,
Weakly Supervised Object Recognition and Localization with Invariant High Order Features, Proceedings of the British Machine Vision Conference 2010, pp.1-11, 2010. ,
DOI : 10.5244/C.24.47
Training cost-sensitive neural networks with methods addressing the class imbalance problem, IEEE Transactions on Knowledge and Data Engineering, vol.18, issue.1, pp.63-77, 2006. ,
DOI : 10.1109/TKDE.2006.17