R. G. Cinbis, J. Verbeek, and C. Schmid, Image categorization using Fisher kernels of non-iid image models, IEEE Conference on Computer Vision & Pattern Recognition (CVPR), 2012.

R. G. Cinbis, J. Verbeek, and C. Schmid, Unsupervised Metric Learning for Face Identification in TV Video, IEEE International Conference on Computer Vision (ICCV), 2011.

Other publications

R. G. Cinbis and S. Sclaroff, Contextual Object Detection using Set-based Classification, European Conference on Computer Vision (ECCV), 2012.

R. G. Cinbis and S. Aksoy, Relative Position-Based Spatial Relationships Using Mathematical Morphology, IEEE International Conference on Image Processing (ICIP), 2007.

B. U. Toreyin, R. G. Cinbis, Y. Dedeoglu, and A. E. Cetin, Fire Detection in Infrared Video Using Wavelet Analysis, SPIE Optical Engineering, 2007.

A. Agarwal and B. Triggs, Hyperfeatures - Multilevel Local Coding for Visual Recognition, European Conference on Computer Vision, pp.30-43, 2006.
DOI : 10.1007/11744023_3

URL : https://hal.archives-ouvertes.fr/inria-00548592

S. Agarwal and D. Roth, Learning a Sparse Representation for Object Detection, European Conference on Computer Vision, pp.97-101, 2002.
DOI : 10.1007/3-540-47979-1_8

T. Ahonen, A. Hadid, and M. Pietikainen, Face Description with Local Binary Patterns: Application to Face Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.28, issue.12, pp.2037-2041, 2006.
DOI : 10.1109/TPAMI.2006.244

B. Alexe, T. Deselaers, and V. Ferrari, What is an object?, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, p.39, 2010.
DOI : 10.1109/CVPR.2010.5540226

B. Alexe, T. Deselaers, and V. Ferrari, Measuring the Objectness of Image Windows, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.11, pp.2189-2202, 2012.
DOI : 10.1109/TPAMI.2012.28

B. Alexe, N. Heess, Y. W. Teh, and V. Ferrari, Searching for objects driven by context, Advances in Neural Information Processing Systems 25, pp.890-898, 2012.

F. Alted, Why Modern CPUs Are Starving and What Can Be Done about It, Computing in Science & Engineering, vol.12, issue.2, pp.68-71
DOI : 10.1109/MCSE.2010.51

S. An, P. Peursum, W. Liu, and S. Venkatesh, Efficient algorithms for subwindow search in object detection and localization, IEEE Conference on Computer Vision and Pattern Recognition, pp.264-271, 2009.

A. Andreopoulos and J. K. Tsotsos, 50 Years of object recognition: Directions forward, Computer Vision and Image Understanding, vol.117, issue.8, pp.827-891, 2013.
DOI : 10.1016/j.cviu.2013.04.005

S. Andrews, I. Tsochantaridis, and T. Hofmann, Support vector machines for multiple-instance learning, Advances in Neural Information Processing Systems, p.99, 2002.

P. Arbeláez, M. Maire, C. Fowlkes, and J. Malik, From contours to regions: An empirical evaluation, 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp.2294-2301, 2009.
DOI : 10.1109/CVPR.2009.5206707

P. Arbeláez, M. Maire, C. Fowlkes, and J. Malik, Contour Detection and Hierarchical Image Segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.5, pp.898-916
DOI : 10.1109/TPAMI.2010.161

P. Arbeláez, B. Hariharan, C. Gu, S. Gupta, L. Bourdev et al., Semantic segmentation using regions and parts, 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp.3378-3385
DOI : 10.1109/CVPR.2012.6248077

F. R. Bach and M. I. Jordan, Predictive low-rank decomposition for kernel methods, Proceedings of the 22nd international conference on Machine learning, ICML '05, p.31, 2005.
DOI : 10.1145/1102351.1102356

S. Bagon, O. Brostovski, M. Galun, and M. Irani, Detecting and sketching the common, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5540233

D. H. Ballard, Generalizing the Hough transform to detect arbitrary shapes, Pattern Recognition, vol.13, issue.2, pp.111-122, 1981.
DOI : 10.1016/0031-3203(81)90009-1

A. Barla, F. Odone, and A. Verri, Histogram intersection kernel for image classification, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429), pp.513-542, 2003.
DOI : 10.1109/ICIP.2003.1247294

H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, Speeded-Up Robust Features (SURF), Computer Vision and Image Understanding, vol.110, issue.3, pp.346-359, 2008.
DOI : 10.1016/j.cviu.2007.09.014

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.205.738

S. Belongie, J. Malik, and J. Puzicha, Shape matching and object recognition using shape contexts, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.24, issue.4, pp.509-522, 2002.
DOI : 10.1109/34.993558

T. Berg, A. Berg, J. Edwards, M. Maire, R. White et al., Names and faces in the news, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), p.38, 2004.
DOI : 10.1109/CVPR.2004.1315253

A. Bergamo and L. Torresani, Meta-class features for large-scale object categorization on a budget, 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp.3085-3092
DOI : 10.1109/CVPR.2012.6248040

I. Biederman, Recognition-by-components: A theory of human image understanding., Psychological Review, vol.94, issue.2, pp.115-147
DOI : 10.1037/0033-295X.94.2.115

T. O. Binford, Visual perception by computer, Proceedings of the IEEE Conference on Systems and Control, 1971.

C. Bishop, Pattern recognition and machine learning, Springer-Verlag, pp.14-28, 2006.

M. Blaschko, A. Vedaldi, and A. Zisserman, Simultaneous object detection and ranking with weak supervision, Advances in Neural Information Processing Systems, pp.36-41, 2010.

M. B. Blaschko and C. H. Lampert, Learning to Localize Objects with Structured Output Regression, European Conference on Computer Vision, pp.2-15, 2008.
DOI : 10.1007/978-3-540-88682-2_2

M. B. Blaschko and C. H. Lampert, Object Localization with Global and Local Context Kernels, Proceedings of the British Machine Vision Conference 2009, p.37, 2009.
DOI : 10.5244/C.23.63

D. Blei, A. Ng, and M. Jordan, Latent Dirichlet allocation, Journal of Machine Learning Research, vol.3, issue.113, pp.993-1022, 2003.

O. Boiman, E. Shechtman, and M. Irani, In defense of Nearest-Neighbor based image classification, 2008 IEEE Conference on Computer Vision and Pattern Recognition, p.14, 2008.
DOI : 10.1109/CVPR.2008.4587598

A. Bosch, A. Zisserman, and X. Munoz, Representing shape with a spatial pyramid kernel, Proceedings of the 6th ACM international conference on Image and video retrieval, CIVR '07, p.19, 2007.
DOI : 10.1145/1282280.1282340

L. Bottou, Large-scale machine learning with stochastic gradient descent, COMPSTAT, p.28, 2010.

G. Bouchard and B. Triggs, The tradeoff between generative and discriminative classifiers, IASC International Symposium on Computational Statistics (COMPSTAT), pp.721-728, 2004.
URL : https://hal.archives-ouvertes.fr/inria-00548546

L. Bourdev and J. Brandt, Robust Object Detection via Soft Cascade, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp.236-243, 2005.
DOI : 10.1109/CVPR.2005.310

L. Bourdev and J. Malik, Poselets: Body part detectors trained using 3D human pose annotations, 2009 IEEE 12th International Conference on Computer Vision, pp.1365-1372, 2009.
DOI : 10.1109/ICCV.2009.5459303

Y. Boureau, F. Bach, Y. LeCun, and J. Ponce, Learning mid-level features for recognition, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, p.16, 2010.
DOI : 10.1109/CVPR.2010.5539963

S. Branson, C. Wah, F. Schroff, B. Babenko, P. Welinder et al., Visual Recognition with Humans in the Loop, European Conference on Computer Vision, 2010.
DOI : 10.1007/978-3-642-15561-1_32

L. Breiman, Random forests, Machine Learning, pp.5-32, 2001.

R. A. Brooks, Symbolic reasoning among 3-D models and 2-D images, Artificial Intelligence, vol.17, issue.1-3, pp.285-348, 1981.
DOI : 10.1016/0004-3702(81)90028-X

M. Brown, G. Hua, and S. Winder, Discriminative Learning of Local Image Descriptors, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.1, pp.43-57
DOI : 10.1109/TPAMI.2010.54

J. Carreira and C. Sminchisescu, Constrained parametric min-cuts for automatic object segmentation, release 1, pp.2011-79

J. Carreira and C. Sminchisescu, CPMC: Automatic object segmentation using constrained parametric min-cuts, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.7, pp.1312-1328, 2012.
DOI : 10.1109/tpami.2011.231

J. Carreira, R. Caseiro, J. Batista, and C. Sminchisescu, Semantic Segmentation with Second-Order Pooling, European Conference on Computer Vision, pp.18-67, 2012.
DOI : 10.1007/978-3-642-33786-4_32

H. Cevikalp and B. Triggs, Face recognition based on image sets, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.2011-120
DOI : 10.1109/CVPR.2010.5539965

URL : https://hal.archives-ouvertes.fr/hal-00564979

K. W. Chang and D. Roth, Selective block minimization for faster convergence of limited memory large-scale linear models, Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '11, pp.699-707
DOI : 10.1145/2020408.2020517

J. C. Chappelier and E. Eckard, PLSI: The True Fisher Kernel and beyond, Machine Learning and Knowledge Discovery in Databases, pp.195-210, 2009.
DOI : 10.1561/1500000008

K. Chatfield, V. Lempitsky, A. Vedaldi, and A. Zisserman, The devil is in the details: an evaluation of recent feature encoding methods, Proceedings of the British Machine Vision Conference 2011, pp.45-58, 2011.
DOI : 10.5244/C.25.76

G. Chen, Y. Ding, J. Xiao, and T. X. Han, Detection Evolution with Multi-order Contextual Co-occurrence, 2013 IEEE Conference on Computer Vision and Pattern Recognition, p.37, 2013.
DOI : 10.1109/CVPR.2013.235

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.308.6409

Q. Chen, Z. Song, R. Feris, A. Datta, L. Cao et al., Efficient Maximum Appearance Search for Large-Scale Object Detection, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.410

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.662.808

N. Cherniavsky, I. Laptev, J. Sivic, and A. Zisserman, Semi-supervised Learning of Facial Attributes in Video, The first international workshop on parts and attributes, 2010.
DOI : 10.1007/978-3-642-35749-7_4

M. Choi, J. Lim, A. Torralba, and A. Willsky, Exploiting hierarchical context on a large database of object categories, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5540221

O. Chum and A. Zisserman, An Exemplar Model for Learning Object Classes, 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp.33-40, 2007.
DOI : 10.1109/CVPR.2007.383050

R. G. Cinbis and S. Sclaroff, Contextual Object Detection Using Set-Based Classification, European Conference on Computer Vision, pp.43-57, 2012.
DOI : 10.1007/978-3-642-33783-3_4

URL : https://hal.archives-ouvertes.fr/hal-00756638

R. G. Cinbis, J. Verbeek, and C. Schmid, Unsupervised metric learning for face identification in TV video, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126415

URL : https://hal.archives-ouvertes.fr/inria-00611682

R. G. Cinbis, J. Verbeek, and C. Schmid, Image categorization using Fisher kernels of non-iid image models, 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012.
DOI : 10.1109/CVPR.2012.6247926

URL : https://hal.archives-ouvertes.fr/hal-00685943

R. G. Cinbis, J. Verbeek, and C. Schmid, Segmentation Driven Object Detection with Fisher Vectors, 2013 IEEE International Conference on Computer Vision, 2013.
DOI : 10.1109/ICCV.2013.369

URL : https://hal.archives-ouvertes.fr/hal-00873134

R. G. Cinbis, J. Verbeek, and C. Schmid, Multi-fold MIL Training for Weakly Supervised Object Localization, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.309

URL : https://hal.archives-ouvertes.fr/hal-00975746

S. Clinchant, G. Csurka, F. Perronnin, and J. Renders, XRCE's participation to ImagEval, ImagEval workshop at CIVR, p.13, 2007.

D. Comaniciu and P. Meer, Mean shift: A robust approach toward feature space analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.24, issue.5, pp.603-619, 2002.

T. Cour, B. Sapp, A. Nagle, and B. Taskar, Talking pictures: Temporal grouping and dialog-supervised person recognition, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.120-122, 2010.
DOI : 10.1109/CVPR.2010.5540106

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.192.6518

T. Cour, B. Sapp, and B. Taskar, Learning from partial labels, Journal of Machine Learning Research, 2011.

D. Crandall and D. Huttenlocher, Weakly Supervised Learning of Part-Based Spatial Models for Visual Object Recognition, European Conference on Computer Vision, p.38, 2006.
DOI : 10.1007/11744023_2

G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, Visual categorization with bags of keypoints, ECCV Int. Workshop on Stat. Learning in Computer Vision, p.10, 2004.

Q. Dai and D. Hoiem, Learning to localize detected objects, IEEE Conference on Computer Vision and Pattern Recognition, p.36, 2012.

N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp.5-32, 2005.
DOI : 10.1109/CVPR.2005.177

URL : https://hal.archives-ouvertes.fr/inria-00548512

T. Dean, M. A. Ruzon, M. Segal, J. Shlens, S. Vijayanarasimhan et al., Fast, Accurate Detection of 100,000 Object Classes on a Single Machine, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.237

J. Deng, W. Dong, R. Socher, L. Li, K. Li et al., ImageNet: A large-scale hierarchical image database, IEEE Conference on Computer Vision and Pattern Recognition, p.80, 2009.

C. Desai, D. Ramanan, and C. Fowlkes, Discriminative models for multi-class object layout, International Conference on Computer Vision, p.37, 2009.
DOI : 10.1007/s11263-011-0439-x

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.161.8585

T. Deselaers, B. Alexe, and V. Ferrari, Localizing Objects While Learning Their Appearance, European Conference on Computer Vision, p.22, 2010.
DOI : 10.1007/978-3-642-15561-1_33

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.308.8826

T. Deselaers, B. Alexe, and V. Ferrari, Weakly Supervised Localization and Learning with Generic Knowledge, International Journal of Computer Vision, vol.73, issue.2, pp.257-293, 2012.
DOI : 10.1007/s11263-012-0538-3

T. Dietterich, R. Lathrop, and T. Lozano-Pérez, Solving the multiple instance problem with axis-parallel rectangles, Artificial Intelligence, vol.89, issue.1-2, pp.31-71, 1997.
DOI : 10.1016/S0004-3702(96)00034-3

S. K. Divvala, A. A. Efros, and M. Hebert, How Important Are "Deformable Parts" in the Deformable Parts Model?, European Conference on Computer Vision Workshops, pp.31-40
DOI : 10.1007/978-3-642-33885-4_4

C. Dubout and F. Fleuret, Exact Acceleration of Linear Object Detectors, European Conference on Computer Vision, pp.301-311
DOI : 10.1007/978-3-642-33712-3_22

R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, p.10, 2001.

N. M. Elfiky, F. Shahbaz Khan, J. van de Weijer, and J. Gonzalez, Discriminative compact pyramids for object and scene recognition, Pattern Recognition, vol.45, issue.4, pp.1627-1636
DOI : 10.1016/j.patcog.2011.09.020

I. Endres and D. Hoiem, Category Independent Object Proposals, European Conference on Computer Vision, p.66, 2010.
DOI : 10.1007/978-3-642-15555-0_42

I. Endres and D. Hoiem, Category-Independent Object Proposals with Diverse Ranking, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.36, issue.2, 2014.
DOI : 10.1109/TPAMI.2013.122

M. Everingham, J. Sivic, and A. Zisserman, "Hello! My name is... Buffy" - Automatic Naming of Characters in TV Video, Proceedings of the British Machine Vision Conference 2006, pp.120-122, 2006.
DOI : 10.5244/C.20.92

M. Everingham, L. Van Gool, C. Williams, J. Winn, and A. Zisserman, The Pascal Visual Object Classes (VOC) Challenge, International Journal of Computer Vision, vol.73, issue.2, p.58, 2007.
DOI : 10.1007/s11263-009-0275-4

M. Everingham, J. Sivic, and A. Zisserman, Taking the bite out of automated naming of characters in TV video, Image and Vision Computing, vol.27, issue.5, pp.545-559, 2009.
DOI : 10.1016/j.imavis.2008.04.018

M. Everingham, L. Van Gool, C. Williams, J. Winn, and A. Zisserman, The Pascal Visual Object Classes (VOC) Challenge, International Journal of Computer Vision, vol.73, issue.2, pp.303-338, 2010.
DOI : 10.1007/s11263-009-0275-4

H. Fan, Z. Cao, Y. Jiang, Q. Yin, and C. Doudou, Learning deep face representation, 2014.

R. E. Fan, P. H. Chen, and C. J. Lin, Working set selection using second order information for training support vector machines, Journal of Machine Learning Research, vol.6, pp.1889-1918, 2005.

R. Fan, K. Chang, C. Hsieh, X. Wang, and C. Lin, LIBLINEAR: a library for large linear classification, Journal of Machine Learning Research, vol.9, pp.1871-1874, 2008.

A. Farhadi, I. Endres, D. Hoiem, and D. Forsyth, Describing objects by their attributes, 2009 IEEE Conference on Computer Vision and Pattern Recognition, p.23, 2009.
DOI : 10.1109/CVPR.2009.5206772

J. Farquhar, S. Szedmak, H. Meng, and J. Shawe-Taylor, Improving "bag-of-keypoints" image categorisation: Generative models and pdf-kernels, pp.13-15, 2005.

L. Fei-Fei, R. Fergus, and P. Perona, Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories, CVPR 2004 Workshop on Generative-Model Based Vision, 2004.
DOI : 10.1016/j.cviu.2005.09.012

P. Felzenszwalb and D. Huttenlocher, Efficient Graph-Based Image Segmentation, International Journal of Computer Vision, vol.59, issue.2, pp.167-181, 2004.
DOI : 10.1023/B:VISI.0000022288.19776.77

P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan, Object Detection with Discriminatively Trained Part-Based Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.9, 2010.
DOI : 10.1109/TPAMI.2009.167

P. F. Felzenszwalb, R. B. Girshick, and D. McAllester, Cascade object detection with deformable part models, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5539906

B. Fernando, E. Fromont, and T. Tuytelaars, Effective Use of Frequent Itemset Mining for Image Classification, European Conference on Computer Vision, 2012.
DOI : 10.1007/978-3-642-33718-5_16

V. Ferrari and A. Zisserman, Learning visual attributes, Advances in Neural Information Processing Systems, pp.433-440, 2007.

S. Fidler, R. Mottaghi, A. Yuille, and R. Urtasun, Bottom-up segmentation for top-down detection, IEEE Conference on Computer Vision and Pattern Recognition, pp.37-81, 2013.

K. Fukushima, Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, Biological Cybernetics, vol.40, issue.4, pp.193-202, 1980.
DOI : 10.1007/BF00344251

B. Fulkerson, A. Vedaldi, and S. Soatto, Localizing Objects with Smart Dictionaries, European Conference on Computer Vision, p.16, 2008.
DOI : 10.1007/978-3-540-88682-2_15

J. Gall and V. Lempitsky, Class-specific hough forests for object detection, IEEE Conference on Computer Vision and Pattern Recognition, p.33, 2009.

C. Galleguillos and S. Belongie, Context based object categorization: A critical survey, Computer Vision and Image Understanding, vol.114, issue.6, 2010.
DOI : 10.1016/j.cviu.2010.02.004

C. Galleguillos, A. Rabinovich, and S. Belongie, Object categorization using co-occurrence, location and appearance, IEEE Conference on Computer Vision and Pattern Recognition, p.37, 2008.
DOI : 10.1109/cvpr.2008.4587799

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.150.1109

Z. Ghahramani and G. E. Hinton, The EM algorithm for mixtures of factor analyzers, p.116, 1996.

R. Girshick, J. Donahue, T. Darrell, and J. Malik, Rich feature hierarchies for accurate object detection and semantic segmentation. arXiv preprint, pp.24-81

R. B. Girshick, P. F. Felzenszwalb, and D. McAllester, Discriminatively trained deformable part models, release 5, pp.74-82

J. Goldberger, S. Gordon, and H. Greenspan, An efficient image similarity measure based on approximations of KL-divergence between two gaussian mixtures, Proceedings Ninth IEEE International Conference on Computer Vision, pp.487-493, 2003.
DOI : 10.1109/ICCV.2003.1238387

A. Graf and S. Borer, Normalization in Support Vector Machines, DAGM Pattern Recognition, pp.277-282, 2001.
DOI : 10.1007/3-540-45404-7_37

C. Gu, J. Lim, P. Arbeláez, and J. Malik, Recognition using regions, IEEE Conference on Computer Vision and Pattern Recognition, p.37, 2009.

C. Gu, P. Arbeláez, Y. Lin, K. Yu, and J. Malik, Multi-component Models for Object Detection, European Conference on Computer Vision, pp.34-66, 2012.
DOI : 10.1007/978-3-642-33765-9_32

M. Guillaumin, T. Mensink, J. Verbeek, and C. Schmid, Automatic face naming with caption-based supervision, 2008 IEEE Conference on Computer Vision and Pattern Recognition, p.124, 2008.
DOI : 10.1109/CVPR.2008.4587603

URL : https://hal.archives-ouvertes.fr/inria-00321048

M. Guillaumin, J. Verbeek, and C. Schmid, Is that you? Metric learning approaches for face identification, 2009 IEEE 12th International Conference on Computer Vision, p.120, 2009.
DOI : 10.1109/ICCV.2009.5459197

URL : https://hal.archives-ouvertes.fr/inria-00439290

M. Guillaumin, J. Verbeek, and C. Schmid, Multimodal semi-supervised learning for image classification, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5540120

URL : https://hal.archives-ouvertes.fr/inria-00548640

M. Guillaumin, J. Verbeek, and C. Schmid, Multiple Instance Metric Learning from Automatically Labeled Bags of Faces, European Conference on Computer Vision, p.120, 2010.
DOI : 10.1007/978-3-642-15549-9_46

URL : https://hal.archives-ouvertes.fr/inria-00548639

M. Guillaumin, T. Mensink, J. Verbeek, and C. Schmid, Face Recognition from Caption-Based Supervision, International Journal of Computer Vision, vol.57, issue.2, pp.64-82
DOI : 10.1007/s11263-011-0447-x

URL : https://hal.archives-ouvertes.fr/inria-00522185

A. Halevy, P. Norvig, and F. Pereira, The Unreasonable Effectiveness of Data, IEEE Intelligent Systems, vol.24, issue.2, pp.8-12, 2009.
DOI : 10.1109/MIS.2009.36

B. Hariharan, J. Malik, and D. Ramanan, Discriminative Decorrelation for Clustering and Classification, European Conference on Computer Vision, 2012.
DOI : 10.1007/978-3-642-33765-9_33

C. Harris and M. Stephens, A Combined Corner and Edge Detector, Proceedings of the Alvey Vision Conference 1988, pp.147-151, 1988.
DOI : 10.5244/C.2.23

H. Harzallah, F. Jurie, and C. Schmid, Combining efficient object localization and image classification, 2009 IEEE 12th International Conference on Computer Vision, p.33, 2009.
DOI : 10.1109/ICCV.2009.5459257

URL : https://hal.archives-ouvertes.fr/inria-00439516

G. Heitz and D. Koller, Learning Spatial Context: Using Stuff to Find Things, European Conference on Computer Vision, p.37, 2008.
DOI : 10.1007/978-3-540-88682-2_4

R. Herbrich and T. Graepel, A PAC-Bayesian margin bound for linear classifiers, IEEE Transactions on Information Theory, vol.48, issue.12, pp.3140-3150, 2002.

T. Hofmann, Learning the similarity of documents: An information-geometric approach to document retrieval and categorization, Advances in Neural Information Processing Systems, pp.914-920, 1999.

T. Hofmann, Unsupervised learning by probabilistic latent semantic analysis, Machine Learning, pp.177-196, 2001.

D. Hoiem, A. A. Efros, and M. Hebert, Putting objects in perspective, IJCV, p.38, 2008.

C. J. Hsieh, K. W. Chang, C. J. Lin, S. S. Keerthi, and S. Sundararajan, A dual coordinate descent method for large-scale linear SVM, Proceedings of the 25th international conference on Machine learning, ICML '08, pp.408-415, 2008.
DOI : 10.1145/1390156.1390208

M. Hu, Visual pattern recognition by moment invariants, IRE Transactions on Information Theory, vol.8, issue.2, pp.179-187, 1962.

G. Huang, M. Jones, and E. Learned-Miller, LFW results using a combined Nowak plus MERL recognizer, Workshop on Faces in Real-Life Images at European Conference on Computer Vision, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00326728

N. Ikizler-Cinbis, R. G. Cinbis, and S. Sclaroff, Learning actions from the Web, 2009 IEEE 12th International Conference on Computer Vision, pp.995-1002, 2009.
DOI : 10.1109/ICCV.2009.5459368

T. Jaakkola and D. Haussler, Exploiting generative models in discriminative classifiers, Advances in Neural Information Processing Systems, pp.25-48, 1999.

H. Jégou, M. Douze, and C. Schmid, Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search, European Conference on Computer Vision, pp.304-317, 2008.
DOI : 10.1007/978-3-540-88682-2_24

H. Jégou, M. Douze, and C. Schmid, On the burstiness of visual elements, 2009 IEEE Conference on Computer Vision and Pattern Recognition, p.30, 2009.
DOI : 10.1109/CVPR.2009.5206609

H. Jégou, M. Douze, C. Schmid, and P. Pérez, Aggregating local descriptors into a compact image representation, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.3304-3311
DOI : 10.1109/CVPR.2010.5540039

H. Jégou, M. Douze, and C. Schmid, Product Quantization for Nearest Neighbor Search, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.1, pp.117-128, 2011.
DOI : 10.1109/TPAMI.2010.57

H. Jégou, F. Perronnin, M. Douze, J. Sánchez, P. Pérez et al., Aggregating Local Image Descriptors into Compact Codes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.9, p.44, 2012.
DOI : 10.1109/TPAMI.2011.235

Y. G. Jiang, C. W. Ngo, and J. Yang, Towards optimal bag-of-features for object categorization and semantic video retrieval, Proceedings of the 6th ACM international conference on Image and video retrieval, CIVR '07, pp.494-501, 2007.
DOI : 10.1145/1282280.1282352

A. Kapoor, G. Hua, A. Akbarzadeh, and S. Baker, Which faces to tag: Adding prior constraints into active learning, 2009 IEEE 12th International Conference on Computer Vision, p.122, 2009.
DOI : 10.1109/ICCV.2009.5459392

F. Khan, R. Anwer, J. van de Weijer, A. Bagdanov, M. Vanrell et al., Color attributes for object detection, IEEE Conference on Computer Vision and Pattern Recognition, 2012.

F. S. Khan, J. van de Weijer, and M. Vanrell, Top-down color attention for object recognition, International Conference on Computer Vision, pp.979-986, 2009.

G. Kim and A. Torralba, Unsupervised detection of regions of interest using iterative link analysis, Advances in Neural Information Processing Systems, pp.4-6, 2009.

A. Kläser, M. Marszałek, C. Schmid, and A. Zisserman, Human Focused Action Localization in Video, ECCV Workshop on Sign, Gesture, and Activity, p.123, 2010.
DOI : 10.1007/978-3-642-35749-7_17

J. Krapac, J. Verbeek, and F. Jurie, Learning tree-structured descriptor quantizers for image categorization, BMVC, p.16, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00613118

J. Krapac, J. Verbeek, and F. Jurie, Modeling spatial layout with fisher vectors for image categorization, 2011 International Conference on Computer Vision, p.20, 2011.
DOI : 10.1109/ICCV.2011.6126406

URL : https://hal.archives-ouvertes.fr/inria-00612277

A. Krizhevsky, I. Sutskever, and G. Hinton, Imagenet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems, pp.1106-1114, 2012.

C. Lampert, M. Blaschko, and T. Hofmann, Efficient Subwindow Search: A Branch and Bound Framework for Object Localization, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.31, issue.12, pp.2129-2142, 2009.
DOI : 10.1109/TPAMI.2009.144

C. Lampert, H. Nickisch, and S. Harmeling, Learning to detect unseen object classes by between-class attribute transfer, 2009 IEEE Conference on Computer Vision and Pattern Recognition, p.23, 2009.
DOI : 10.1109/CVPR.2009.5206594

C. H. Lampert, M. B. Blaschko, and T. Hofmann, Beyond sliding windows: Object localization by efficient subwindow search, 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2008.
DOI : 10.1109/CVPR.2008.4587586

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.4517

D. Larlus and F. Jurie, Latent mixture vocabularies for object categorization and segmentation, Image and Vision Computing, vol.27, issue.5, pp.523-534, 2009.
DOI : 10.1016/j.imavis.2008.04.022

URL : https://hal.archives-ouvertes.fr/inria-00548649

S. Lazebnik and M. Raginsky, Supervised learning of quantizer codebooks by information loss minimization, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.31, issue.7, pp.1294-1309, 2009.

S. Lazebnik, C. Schmid, and J. Ponce, Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), pp.19-70, 2006.
DOI : 10.1109/CVPR.2006.68

URL : https://hal.archives-ouvertes.fr/inria-00548585

Y. LeCun, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard et al., Handwritten digit recognition with a back-propagation network, Advances in Neural Information Processing Systems, p.23, 1990.

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, pp.2278-2324, 1998.
DOI : 10.1109/5.726791

A. Lehmann, B. Leibe, and L. Van Gool, PRISM: PRincipled Implicit Shape Model, Proceedings of the British Machine Vision Conference 2009, p.35, 2009.
DOI : 10.5244/C.23.64

A. Lehmann, B. Leibe, and L. Van Gool, Feature-centric Efficient Subwindow Search, 2009 IEEE 12th International Conference on Computer Vision, p.34, 2009.
DOI : 10.1109/ICCV.2009.5459341

B. Leibe, A. Leonardis, and B. Schiele, Robust Object Detection with Interleaved Categorization and Segmentation, International Journal of Computer Vision, vol.73, issue.2, pp.259-289, 2008.
DOI : 10.1007/s11263-007-0095-3

C. Li, D. Parikh, and T. Chen, Extracting adaptive contextual cues from unlabeled regions, International Conference on Computer Vision, pp.511-518

L. J. Li, H. Su, E. P. Xing, and L. Fei-Fei, Object bank: A high-level image representation for scene classification & semantic feature sparsification, Advances in Neural Information Processing Systems, 2010.

X. C. Lian, Z. Li, B. L. Lu, and L. Zhang, Max-Margin Dictionary Learning for Multiclass Image Categorization, European Conference on Computer Vision, pp.157-170
DOI : 10.1007/978-3-642-15561-1_12

C. Liu, J. Yuen, and A. Torralba, Nonparametric scene parsing: Label transfer via dense scene alignment, IEEE Conference on Computer Vision and Pattern Recognition, p.37, 2009.

Y. Liu and F. Perronnin, A similarity measure between unordered vector sets with application to image categorization, IEEE Conference on Computer Vision and Pattern Recognition, p.19, 2008.
URL : https://hal.archives-ouvertes.fr/hal-01507191

D. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, vol.60, issue.2, pp.91-110, 2004.
DOI : 10.1023/B:VISI.0000029664.99615.94

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.4931

R. Madsen, D. Kauchak, and C. Elkan, Modeling word burstiness using the Dirichlet distribution, Proceedings of the 22nd international conference on Machine learning, ICML '05, p.46, 2005.
DOI : 10.1145/1102351.1102420

J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, Supervised dictionary learning, Advances in Neural Information Processing Systems, p.16, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00322431

S. Maji and A. Berg, Max-margin additive models for detection, International Conference on Computer Vision, p.30, 2009.

S. Maji and J. Malik, Object detection using a max-margin Hough transform, IEEE Conference on Computer Vision and Pattern Recognition, p.35, 2009.

T. Malisiewicz, A. Gupta, and A. Efros, Ensemble of exemplar-SVMs for object detection and beyond, 2011 International Conference on Computer Vision, 2011.
DOI : 10.1109/ICCV.2011.6126229

S. Manen, M. Guillaumin, and L. Van Gool, Prime Object Proposals with Randomized Prim's Algorithm, 2013 IEEE International Conference on Computer Vision, p.79
DOI : 10.1109/ICCV.2013.315

URL : https://lirias.kuleuven.be/bitstream/123456789/450464/1/3716_final_OA.pdf

S. Manen, M. Guillaumin, and L. Van Gool, Prime object proposals with randomized Prim's algorithm, release 2013-12-17, 2013.
DOI : 10.1109/iccv.2013.315

URL : https://lirias.kuleuven.be/bitstream/123456789/450464/1/3716_final_OA.pdf

D. R. Martin, C. C. Fowlkes, and J. Malik, Learning to detect natural image boundaries using local brightness, color, and texture cues, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.26, issue.5, pp.530-549, 2004.
DOI : 10.1109/TPAMI.2004.1273918

J. Matas, O. Chum, M. Urban, and T. Pajdla, Robust wide-baseline stereo from maximally stable extremal regions, Image and Vision Computing, vol.22, issue.10, pp.761-767, 2004.
DOI : 10.1016/j.imavis.2004.02.006

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.671.8241

T. Mensink, J. Verbeek, and G. Csurka, Tree-Structured CRF Models for Interactive Image Labeling, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.2, p.25, 2012.
DOI : 10.1109/TPAMI.2012.100

URL : https://hal.archives-ouvertes.fr/hal-00688143

K. Mikolajczyk and C. Schmid, Scale & Affine Invariant Interest Point Detectors, International Journal of Computer Vision, vol.60, issue.1, pp.63-86, 2004.
DOI : 10.1023/B:VISI.0000027790.02288.f2

URL : https://hal.archives-ouvertes.fr/inria-00548554

M. Minsky, The emotion machine, Proceedings of the third conference on Creativity & cognition, C&C '99, p.1, 2007.
DOI : 10.1145/317561.317563

F. Moosmann, E. Nowak, and F. Jurie, Randomized Clustering Forests for Image Classification, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.30, issue.9, p.16, 2008.
DOI : 10.1109/TPAMI.2007.70822

URL : https://hal.archives-ouvertes.fr/inria-00548666

J. L. Mundy, Object Recognition in the Geometric Era: A Retrospective, Toward category-level object recognition, pp.1-4, 2006.
DOI : 10.1007/11957959_1

H. Murase and S. K. Nayar, Learning and recognition of 3D objects from appearance, Proceedings IEEE Workshop on Qualitative Vision, pp.39-50, 1993.
DOI : 10.1109/WQV.1993.262951

K. Murphy, A. Torralba, and W. Freeman, Using the forest to see the trees: a graphical model relating features, objects and scenes, Advances in Neural Information Processing Systems, p.37, 2003.

M. Nguyen, L. Torresani, F. De la Torre, and C. Rother, Weakly supervised discriminative localization and classification: a joint learning process, 2009 IEEE 12th International Conference on Computer Vision, pp.41-89, 2009.
DOI : 10.1109/ICCV.2009.5459426

D. Nistér and H. Stewénius, Scalable Recognition with a Vocabulary Tree, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), p.14, 2006.
DOI : 10.1109/CVPR.2006.264

E. Nowak, F. Jurie, and B. Triggs, Sampling Strategies for Bag-of-Features Image Classification, European Conference on Computer Vision, p.11, 2006.
DOI : 10.1007/11744085_38

URL : https://hal.archives-ouvertes.fr/hal-00203752

T. Ojala, M. Pietikäinen, and T. Mäenpää, Multiresolution gray-scale and rotation invariant texture classification with local binary patterns, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.24, issue.7, pp.971-987, 2002.
DOI : 10.1109/TPAMI.2002.1017623

R. Okada, Discriminative generalized Hough transform for object detection, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459441

A. Oliva and A. Torralba, Modeling the shape of the scene: a holistic representation of the spatial envelope, International Journal of Computer Vision, vol.42, issue.3, pp.145-175, 2001.
DOI : 10.1023/A:1011139631724

B. A. Olshausen and D. J. Field, Sparse coding with an overcomplete basis set: A strategy employed by V1?, Vision Research, p.14, 1997.

D. Oneata, J. Verbeek, and C. Schmid, Action and Event Recognition with Fisher Vectors on a Compact Feature Set, 2013 IEEE International Conference on Computer Vision, 2013.
DOI : 10.1109/ICCV.2013.228

URL : https://hal.archives-ouvertes.fr/hal-00873662

D. Oneata, J. Verbeek, and C. Schmid, Efficient Action Localization with Approximately Normalized Fisher Vectors, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.326

URL : https://hal.archives-ouvertes.fr/hal-00979594

D. Ozkan and P. Duygulu, A Graph Based Approach for Naming Faces in News Photos, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), pp.1477-1482, 2006.
DOI : 10.1109/CVPR.2006.29

M. Pandey and S. Lazebnik, Scene recognition and weakly supervised object localization with deformable part-based models, 2011 International Conference on Computer Vision, pp.94-99, 2011.
DOI : 10.1109/ICCV.2011.6126383

O. Parkhi, A. Vedaldi, C. Jawahar, and A. Zisserman, The truth about cats and dogs, 2011 International Conference on Computer Vision, pp.36-96, 2011.
DOI : 10.1109/ICCV.2011.6126398

A. Perina, M. Cristani, U. Castellani, V. Murino, and N. Jojic, Free energy score space, Advances in Neural Information Processing Systems, p.47, 2009.

R. Perko and A. Leonardis, A framework for visual-context-aware object detection in still images, Computer Vision and Image Understanding, vol.114, issue.6, p.37, 2010.
DOI : 10.1016/j.cviu.2010.03.005

F. Perronnin, Universal and Adapted Vocabularies for Generic Visual Categorization, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.30, issue.7, pp.1243-1256, 2007.
DOI : 10.1109/TPAMI.2007.70755

F. Perronnin and C. Dance, Fisher Kernels on Visual Vocabularies for Image Categorization, 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp.16-47, 2007.
DOI : 10.1109/CVPR.2007.383266

F. Perronnin, C. Dance, G. Csurka, and M. Bressan, Adapted Vocabularies for Generic Visual Categorization, European Conference on Computer Vision, p.15, 2006.
DOI : 10.1007/11744085_36

F. Perronnin, Y. Liu, J. Sánchez, and H. Poirier, Large-scale image retrieval with compressed Fisher vectors, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5540009

F. Perronnin, J. Sánchez, and Y. Liu, Large-scale image categorization with explicit data embedding, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.30-45, 2010.
DOI : 10.1109/CVPR.2010.5539914

F. Perronnin, J. Sánchez, and T. Mensink, Improving the Fisher kernel for large-scale image classification, European Conference on Computer Vision, pp.30-45, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00548630

P. Pham, M. Moens, and T. Tuytelaars, Cross-Media Alignment of Names and Faces, IEEE Transactions on Multimedia, vol.12, issue.1, pp.13-27, 2010.
DOI : 10.1109/TMM.2009.2036232

J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, Lost in quantization: Improving particular object retrieval in large scale image databases, 2008 IEEE Conference on Computer Vision and Pattern Recognition, p.14, 2008.
DOI : 10.1109/CVPR.2008.4587635

J. Philbin, M. Isard, J. Sivic, and A. Zisserman, Descriptor Learning for Efficient Retrieval, European Conference on Computer Vision, pp.677-691
DOI : 10.1007/978-3-642-15558-1_49

J. C. Platt, Sequential minimal optimization: A fast algorithm for training support vector machines, Advances in kernel methods, p.28, 1998.

J. Ponce, T. L. Berg, M. Everingham, D. A. Forsyth, M. Hebert et al., Dataset Issues in Object Recognition, Toward category-level object recognition, 2006.
DOI : 10.1007/11957959_2

URL : https://hal.archives-ouvertes.fr/inria-00548595

A. Prest, C. Leistner, J. Civera, C. Schmid, and V. Ferrari, Learning object class detectors from weakly annotated video, 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp.88-99, 2012.
DOI : 10.1109/CVPR.2012.6248065

URL : https://hal.archives-ouvertes.fr/hal-00695940

J. Puzicha, J. M. Buhmann, Y. Rubner, and C. Tomasi, Empirical evaluation of dissimilarity measures for color and texture, Proceedings of the Seventh IEEE International Conference on Computer Vision, pp.1165-1172, 1999.
DOI : 10.1109/ICCV.1999.790412

A. Quattoni and A. Torralba, Recognizing indoor scenes, 2009 IEEE Conference on Computer Vision and Pattern Recognition, p.21, 2009.
DOI : 10.1109/CVPR.2009.5206537

P. Quelhas, F. Monay, J. Odobez, D. Gatica-Perez, T. Tuytelaars et al., Modeling scenes with local descriptors and latent aspects, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1, pp.883-890, 2005.
DOI : 10.1109/ICCV.2005.152

A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora, and S. Belongie, Objects in Context, 2007 IEEE 11th International Conference on Computer Vision, p.37, 2007.
DOI : 10.1109/ICCV.2007.4408986

A. Rahimi and B. Recht, Random features for large-scale kernel machines, Advances in Neural Information Processing Systems, p.31, 2007.

R. Raina, A. Battle, H. Lee, B. Packer, and A. Y. Ng, Self-taught learning, Proceedings of the 24th international conference on Machine learning, ICML '07, pp.766-781, 2007.
DOI : 10.1145/1273496.1273592

D. Ramanan, Using Segmentation to Verify Object Hypotheses, 2007 IEEE Conference on Computer Vision and Pattern Recognition, p.36, 2007.
DOI : 10.1109/CVPR.2007.383271

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.330.1573

L. G. Roberts, Pattern recognition with an adaptive network, IRE International Convention Record, pp.66-70, 1960.

F. Rosenblatt, The perceptron-a perceiving and recognizing automaton, p.26, 1957.

C. A. Rothwell, D. A. Forsyth, A. Zisserman, and J. L. Mundy, Extracting projective structure from single perspective views of 3D point sets, 1993 (4th) International Conference on Computer Vision, pp.573-582, 1993.
DOI : 10.1109/ICCV.1993.378159

M. Rubinstein, A. Joulin, J. Kopf, and C. Liu, Unsupervised Joint Object Discovery and Segmentation in Internet Images, 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp.1939-1946
DOI : 10.1109/CVPR.2013.253

URL : https://hal.archives-ouvertes.fr/hal-01064227

O. Russakovsky, Y. Lin, K. Yu, and L. Fei-Fei, Object-Centric Spatial Pooling for Image Classification, European Conference on Computer Vision, 2012.
DOI : 10.1007/978-3-642-33709-3_1

B. Russell, W. Freeman, A. Efros, J. Sivic, and A. Zisserman, Using Multiple Segmentations to Discover Objects and their Extent in Image Collections, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), p.38, 2006.
DOI : 10.1109/CVPR.2006.326

G. Salton and M. McGill, Introduction to Modern Information Retrieval, p.10, 1983.

J. Sánchez and F. Perronnin, High-dimensional signature compression for large-scale image classification, IEEE Conference on Computer Vision and Pattern Recognition, pp.29-71, 2011.

J. Sánchez, F. Perronnin, and T. de Campos, Modeling the spatial layout of images beyond spatial pyramids, Pattern Recognition Letters, vol.33, issue.16, pp.2216-2223, 2012.
DOI : 10.1016/j.patrec.2012.07.019

J. Sánchez, F. Perronnin, T. Mensink, and J. Verbeek, Image Classification with the Fisher Vector: Theory and Practice, International Journal of Computer Vision, vol.105, issue.3, pp.222-245, 2013.
DOI : 10.1007/s11263-013-0636-x

S. Savarese, J. Winn, and A. Criminisi, Discriminative Object Class Models of Appearance and Shape by Correlatons, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), 2006.
DOI : 10.1109/CVPR.2006.102

C. Schmid and R. Mohr, Local grayvalue invariants for image retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.19, issue.5, pp.530-535, 1997.
DOI : 10.1109/34.589215

URL : https://hal.archives-ouvertes.fr/inria-00548358

C. Schmid, P. Bobet, B. Lamiroy, and R. Mohr, An image oriented CAD approach, Workshop on Object Representation in Computer Vision (ECCV '96), pp.221-245, 1996.
DOI : 10.1007/3-540-61750-7_31

URL : https://hal.archives-ouvertes.fr/inria-00548370

F. Schroff, A. Criminisi, and A. Zisserman, Harvesting image databases from the web, International Conference on Computer Vision, p.25, 2007.

B. Settles, Active learning literature survey, p.25, 2009.

S. Shalev-Shwartz, Y. Singer, and N. Srebro, Pegasos: Primal Estimated sub-GrAdient SOlver for SVM, Proceedings of the 24th international conference on Machine learning, ICML '07, pp.807-814, 2007.
DOI : 10.1145/1273496.1273598

G. Sharma and F. Jurie, Learning discriminative spatial representation for image classification, Proceedings of the British Machine Vision Conference 2011, p.19, 2011.
DOI : 10.5244/C.25.6

URL : https://hal.archives-ouvertes.fr/hal-00722820

G. Sharma, F. Jurie, and C. Schmid, Discriminative spatial saliency for image classification, 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp.3506-3513
DOI : 10.1109/CVPR.2012.6248093

URL : https://hal.archives-ouvertes.fr/hal-00714311

E. Shechtman and M. Irani, Matching Local Self-Similarities across Images and Videos, 2007 IEEE Conference on Computer Vision and Pattern Recognition, p.12, 2007.
DOI : 10.1109/CVPR.2007.383198

J. Shi and C. Tomasi, Good features to track, IEEE Conference on Computer Vision and Pattern Recognition, pp.593-600, 1994.

Z. Shi, P. Siva, and T. Xiang, Transfer Learning by Ranking for Weakly Supervised Object Annotation, Proceedings of the British Machine Vision Conference 2012, pp.1-11, 2012.
DOI : 10.5244/C.26.78

Z. Shi, T. Hospedales, and T. Xiang, Bayesian Joint Topic Modelling for Weakly Supervised Object Localisation, 2013 IEEE International Conference on Computer Vision, pp.40-94, 2013.
DOI : 10.1109/ICCV.2013.371

K. Simonyan, A. Vedaldi, and A. Zisserman, Descriptor Learning Using Convex Optimisation, European Conference on Computer Vision, 2012.
DOI : 10.1007/978-3-642-33718-5_18

K. Simonyan, A. Vedaldi, and A. Zisserman, Deep Fisher networks for large-scale image classification, Advances in Neural Information Processing Systems, p.21, 2013.

S. Singh, A. Gupta, and A. Efros, Unsupervised Discovery of Mid-Level Discriminative Patches, European Conference on Computer Vision, p.21, 2012.
DOI : 10.1007/978-3-642-33709-3_6

P. Siva and T. Xiang, Weakly supervised object detector learning with model drift detection, 2011 International Conference on Computer Vision, pp.41-89, 2011.
DOI : 10.1109/ICCV.2011.6126261

P. Siva, C. Russell, and T. Xiang, In Defence of Negative Mining for Annotating Weakly Labelled Data, European Conference on Computer Vision, pp.39-41, 2012.
DOI : 10.1007/978-3-642-33712-3_43

P. Siva, C. Russell, T. Xiang, and L. Agapito, Looking Beyond the Image: Unsupervised Learning for Object Saliency and Detection, 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp.39-41, 2013.
DOI : 10.1109/CVPR.2013.416

J. Sivic and A. Zisserman, Video Google: a text retrieval approach to object matching in videos, Proceedings Ninth IEEE International Conference on Computer Vision, p.10, 2003.
DOI : 10.1109/ICCV.2003.1238663

J. Sivic and A. Zisserman, Efficient Visual Search of Videos Cast as Text Retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.31, issue.4, pp.591-606, 2009.
DOI : 10.1109/TPAMI.2008.111

J. Sivic, M. Everingham, and A. Zisserman, "Who are you?": Learning person specific classifiers from video, IEEE Conference on Computer Vision and Pattern Recognition, p.123, 2009.

J. Sochman and J. Matas, WaldBoost - Learning for Time Constrained Sequential Detection, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp.150-156, 2005.
DOI : 10.1109/CVPR.2005.373

X. Song, T. Wu, Y. Jia, and S. Zhu, Discriminatively Trained And-Or Tree Models for Object Detection, 2013 IEEE Conference on Computer Vision and Pattern Recognition, p.35, 2013.
DOI : 10.1109/CVPR.2013.421

Z. Song, Q. Chen, Z. Huang, Y. Hua, and S. Yan, Contextualizing object detection and classification, CVPR 2011, pp.37-83, 2011.
DOI : 10.1109/CVPR.2011.5995330

V. Sreekanth, A. Vedaldi, A. Zisserman, and C. Jawahar, Generalized RBF feature maps for efficient detection, BMVC, p.31, 2010.

M. J. Swain and D. H. Ballard, Color indexing, International Journal of Computer Vision, vol.7, issue.1, pp.11-32, 1991.
DOI : 10.1007/BF00130487

R. Sznitman, C. Becker, F. Fleuret, and P. Fua, Fast object detection with entropy-driven evaluation, IEEE Conference on Computer Vision and Pattern Recognition, 2013.

Y. Taigman, L. Wolf, and T. Hassner, Multiple One-Shots for Utilizing Class Label Information, Proceedings of the British Machine Vision Conference 2009, 2009.
DOI : 10.5244/C.23.77

Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face Verification, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
DOI : 10.1109/CVPR.2014.220

M. E. Tipping and C. M. Bishop, Mixtures of Probabilistic Principal Component Analyzers, Neural Computation, vol.11, issue.2, pp.443-482, 1999.
DOI : 10.1007/BF00162527

A. Torralba, Contextual priming for object detection, International Journal of Computer Vision, vol.53, issue.2, p.37, 2003.

A. Torralba and A. A. Efros, Unbiased look at dataset bias, CVPR 2011, pp.1521-1528
DOI : 10.1109/CVPR.2011.5995347

L. Torresani, M. Szummer, and A. Fitzgibbon, Efficient Object Category Recognition Using Classemes, European Conference on Computer Vision, pp.776-789
DOI : 10.1007/978-3-642-15549-9_56

I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun, Large margin methods for structured and interdependent output variables, Journal of Machine Learning Research, vol.6, pp.1453-1484, 2005.

T. Tuytelaars, Dense interest points, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.2281-2288, 2010.
DOI : 10.1109/CVPR.2010.5539911

T. Tuytelaars and C. Schmid, Vector Quantizing Feature Space with a Regular Lattice, 2007 IEEE 11th International Conference on Computer Vision, pp.1-8, 2007.
DOI : 10.1109/ICCV.2007.4408924

URL : https://hal.archives-ouvertes.fr/inria-00548675

O. Tuzel, F. Porikli, and P. Meer, Human Detection via Classification on Riemannian Manifolds, 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2007.
DOI : 10.1109/CVPR.2007.383197

J. Uijlings, K. van de Sande, T. Gevers, and A. Smeulders, Selective Search for Object Recognition, International Journal of Computer Vision, vol.104, issue.2, pp.154-171, 2013.
DOI : 10.1007/s11263-013-0620-5

K. van de Sande, J. Uijlings, T. Gevers, and A. Smeulders, Segmentation as selective search for object recognition, 2011 International Conference on Computer Vision, pp.79-81, 2011.
DOI : 10.1109/ICCV.2011.6126456

K. E. van de Sande, T. Gevers, and C. G. Snoek, Evaluating Color Descriptors for Object and Scene Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.9, pp.1582-1596, 2010.
DOI : 10.1109/TPAMI.2009.154

J. van Gemert, C. Veenman, A. Smeulders, and J. Geusebroek, Visual Word Ambiguity, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.7, pp.1271-1283, 2010.
DOI : 10.1109/TPAMI.2009.132

V. Vapnik, The Nature of Statistical Learning Theory, p.24, 1995.

N. Vasconcelos, On the Efficient Evaluation of Probabilistic Similarity Functions for Image Retrieval, IEEE Transactions on Information Theory, vol.50, issue.7, pp.1482-1496, 2004.
DOI : 10.1109/TIT.2004.830760

A. Vedaldi and A. Zisserman, Efficient additive kernels via explicit feature maps, IEEE Conference on Computer Vision and Pattern Recognition, p.44, 2010.
DOI : 10.1109/cvpr.2010.5539949

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.167.7024

A. Vedaldi and A. Zisserman, Sparse kernel approximations for efficient classification and detection, 2012 IEEE Conference on Computer Vision and Pattern Recognition, p.72, 2012.
DOI : 10.1109/CVPR.2012.6247943

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.363.5902

A. Vedaldi and A. Zisserman, Efficient Additive Kernels via Explicit Feature Maps, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.3, pp.480-492, 2012.
DOI : 10.1109/TPAMI.2011.153

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.167.7024

A. Vedaldi, V. Gulshan, M. Varma, and A. Zisserman, Multiple kernels for object detection, 2009 IEEE 12th International Conference on Computer Vision, p.33, 2009.
DOI : 10.1109/ICCV.2009.5459183

J. Verbeek and B. Triggs, Region Classification with Markov Field Aspect Models, 2007 IEEE Conference on Computer Vision and Pattern Recognition, p.88, 2007.
DOI : 10.1109/CVPR.2007.383098

URL : https://hal.archives-ouvertes.fr/inria-00321129

J. Verbeek, J. Nunnink, and N. Vlassis, Accelerated EM-based clustering of large data sets, Data Mining and Knowledge Discovery, vol.13, issue.3, pp.291-307, 2006.
DOI : 10.1007/s10618-005-0033-3

URL : https://hal.archives-ouvertes.fr/inria-00321022

S. Vijayanarasimhan and K. Grauman, Large-scale live active learning: Training object detectors with crawled data and crowds, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995430

P. Viola and M. Jones, Robust Real-Time Face Detection, International Journal of Computer Vision, vol.57, issue.2, pp.137-154, 2004.
DOI : 10.1023/B:VISI.0000013087.49260.fb

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.9805

J. Wang, J. Yang, K. Yu, F. Lv, T. Huang et al., Locality-constrained Linear Coding for image classification, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, p.15, 2010.
DOI : 10.1109/CVPR.2010.5540018

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.175.2312

L. Wang, J. Shi, G. Song, and I. Shen, Object Detection Combining Recognition and Segmentation, Asian Conf. on Computer Vision, p.36, 2007.
DOI : 10.1007/978-3-540-76386-4_17

X. Wang, M. Yang, S. Zhu, and Y. Lin, Regionlets for generic object detection, International Conference on Computer Vision, p.36, 2013.

Y. Wei and L. Tao, Efficient histogram-based sliding window, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, p.33, 2010.
DOI : 10.1109/CVPR.2010.5540049

S. Winder, G. Hua, and M. Brown, Picking the best DAISY, 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp.178-185, 2009.
DOI : 10.1109/CVPR.2009.5206839

J. Winn, A. Criminisi, and T. Minka, Object categorization by learned universal visual dictionary, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1, p.16, 2005.
DOI : 10.1109/ICCV.2005.171

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.93.8714

L. Wolf and S. Bileschi, A Critical View of Context, International Journal of Computer Vision, vol.69, issue.2, pp.251-261, 2006.
DOI : 10.1007/s11263-006-7538-0

W. Xia, C. Domokos, J. Dong, L. Cheong, and S. Yan, Semantic Segmentation without Annotating Segments, 2013 IEEE International Conference on Computer Vision, 2013.
DOI : 10.1109/ICCV.2013.271

R. Yan, J. Zhang, J. Yang, and A. Hauptmann, A discriminative learning framework with pairwise constraints for video object classification, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.28, issue.4, 2006.

J. Yang, K. Yu, Y. Gong, and T. Huang, Linear spatial pyramid matching using sparse coding for image classification, IEEE Conference on Computer Vision and Pattern Recognition, p.15, 2009.

J. Yang, K. Yu, and T. Huang, Supervised translation-invariant sparse coding, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.3517-3524
DOI : 10.1109/CVPR.2010.5539958

L. Yang, R. Jin, C. Pantofaru, and R. Sukthankar, Discriminative Cluster Refinement: Improving Object Category Recognition Given Limited Training Data, 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2007.
DOI : 10.1109/CVPR.2007.383270

L. Yang, R. Jin, R. Sukthankar, and F. Jurie, Unifying discriminative visual codebook generation with classifier training for object category recognition, 2008 IEEE Conference on Computer Vision and Pattern Recognition, p.16, 2008.
DOI : 10.1109/CVPR.2008.4587504

URL : https://hal.archives-ouvertes.fr/inria-00548653

B. Yao and L. Fei-Fei, Grouplet: A structured image representation for recognizing human and object interactions, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
DOI : 10.1109/CVPR.2010.5540234

T. Yeh, J. J. Lee, and T. Darrell, Fast concurrent object localization and recognition, 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp.280-287, 2009.
DOI : 10.1109/CVPR.2009.5206805

H. F. Yu, C. J. Hsieh, K. W. Chang, and C. J. Lin, Large linear classification when data cannot fit in memory, Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.833-842, 2010.

Y. Yue, T. Finley, F. Radlinski, and T. Joachims, A support vector method for optimizing average precision, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '07, pp.271-278, 2007.
DOI : 10.1145/1277741.1277790

C. Zhang and P. A. Viola, Multiple-instance pruning for learning efficient cascade detectors, Advances in Neural Information Processing Systems, pp.1681-1688, 2007.

J. Zhang, M. Marszałek, S. Lazebnik, and C. Schmid, Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study, International Journal of Computer Vision, vol.73, issue.2, pp.213-238, 2007.
DOI : 10.1007/s11263-006-9794-4

URL : https://hal.archives-ouvertes.fr/inria-00548574

T. Zhang, Solving large scale linear prediction problems using stochastic gradient descent algorithms, Twenty-first international conference on Machine learning, ICML '04, pp.116-144, 2004.
DOI : 10.1145/1015330.1015332

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.7377

W. Zhang, A. Surve, X. Fern, and T. Dietterich, Learning non-redundant codebooks for classifying complex objects, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pp.1241-1248, 2009.
DOI : 10.1145/1553374.1553533

Y. Zhang and T. Chen, Implicit shape kernel for discriminative learning of the Hough transform detector, British Machine Vision Association, BMVC, pp.105-106, 2010.

X. Zhou, K. Yu, T. Zhang, and T. S. Huang, Image Classification Using Super-Vector Coding of Local Image Descriptors, European Conference on Computer Vision, pp.141-154
DOI : 10.1007/978-3-642-15555-0_11