B. Huet and E. R. Hancock, Structural Sensitivity for Large-Scale Line-Pattern Recognition, Third International Conference on Visual Information Systems (VISUAL99), pp.711-718, 1999.
DOI : 10.1007/3-540-48762-X_88

B. Huet, A. D. Cross, and E. R. Hancock, Graph Matching for Shape Retrieval, Advances in Neural Information Processing Systems 11, 1999.

P. Worthington, B. Huet, and E. R. Hancock, Appearance-based object recognition using shape-from-shading, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170), pp.412-416, 1998.
DOI : 10.1109/ICPR.1998.711169

B. Huet and E. R. Hancock, Relational histograms for shape indexing, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271), pp.563-569, 1998.
DOI : 10.1109/ICCV.1998.710773

B. Huet and E. R. Hancock, Fuzzy Relational Distance for Large-scale Object Recognition, IEEE Conference on Computer Vision and Pattern Recognition (CVPR'98), pp.138-143, 1998.

B. Huet and E. R. Hancock, Pairwise representation for image database indexing, 6th International Conference on Image Processing and its Applications, pp.15-17, 1997.
DOI : 10.1049/cp:19970942

B. Huet and E. R. Hancock, Cartographic indexing into a database of remotely sensed images, Proceedings Third IEEE Workshop on Applications of Computer Vision. WACV'96, pp.8-14, 1996.
DOI : 10.1109/ACV.1996.571987

B. Huet and E. R. Hancock, Structural Indexing of Infra-red Images using Statistical Histogram Comparison, Third International Workshop on Image and Signal Processing (IWISP'96), pp.4-7, 1996.
DOI : 10.1016/B978-044482587-2/50143-9

P. Charlton and B. Huet, Intelligent Agents for Image Retrieval, Research and Technology Advances in Digital Libraries, 1995.

P. Charlton and B. Huet, Using Multiple Agents For Content-Based Image Retrieval, European Research Seminar on Advances in Distributed Systems, L'Alpe D'Huez (France), 1995.

Y. Li, J. Sun, and H. Shum, Video object cut and paste, SIGGRAPH, 2005.

H. Greenspan, J. Goldberger, and A. Mayer, Probabilistic space-time video modeling via piecewise GMM, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.26, issue.3, pp.384-396, 2004.
DOI : 10.1109/TPAMI.2004.1262334

D. Dementhon and D. Doermann, Video retrieval using spatio-temporal descriptors, Proceedings of the eleventh ACM international conference on Multimedia, MULTIMEDIA '03, pp.508-517, 2003.
DOI : 10.1145/957013.957124

E. Galmar and B. Huet, Graph-Based Spatio-temporal Region Extraction, ICIAR, pp.236-247, 2006.
DOI : 10.1007/11867586_23

T. Athanasiadis, V. Tzouvaras, V. Petridis, F. Precioso, Y. Avrithis et al., Using a multimedia ontology infrastructure for semantic annotation of multimedia content, 5th Int'l Workshop on Knowledge Markup and Semantic Annotation, 2005.

E. Borenstein, E. Sharon, and S. Ullman, Combining top-down and bottom-up segmentation, 8th Conference on Computer Vision and Pattern Recognition Workshop, 2004.

T. Athanasiadis, P. Mylonas, Y. Avrithis, and S. Kollias, Semantic Image Segmentation and Object Labeling, IEEE Transactions on Circuits and Systems for Video Technology, vol.17, issue.3, 2007.
DOI : 10.1109/TCSVT.2007.890636

S. Berretti, A. Del Bimbo, and E. Vicario, Efficient matching and indexing of graph models in content-based retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.23, issue.10, pp.1089-1105, 2001.
DOI : 10.1109/34.954600

F. Ge, S. Wang, and T. Liu, Image-segmentation evaluation from the perspective of salient object extraction, CVPR, pp.1146-1153, 2006.

C. H. Chan and G. J. F. Jones, Affect-based indexing and retrieval of films, Proceedings of ACM Multimedia '05, pp.427-430, 2005.

F. Kuo, M. Chiang, M. Shan, and S. Lee, Emotion-based music recommendation by association discovery from film music, Proceedings of the 13th annual ACM international conference on Multimedia, MULTIMEDIA '05, pp.507-510, 2005.
DOI : 10.1145/1101149.1101263

E. Y. Kim, S. Kim, H. Koo, K. Jeong, and J. Kim, Emotion-Based Textile Indexing Using Colors and Texture, Fuzzy Systems and Knowledge Discovery, pp.1077-1080, 2005.
DOI : 10.1007/11539506_133

M. Pantic and L. J. Rothkrantz, Toward an affect-sensitive multimodal human-computer interaction, Proceedings of the IEEE, pp.1370-1390, 2003.
DOI : 10.1109/JPROC.2003.817122

C. Busso, Z. Deng, S. Yildirim, M. Bulut, C. M. Lee et al., Analysis of emotion recognition using facial expressions, speech and multimodal information, Proceedings of the 6th international conference on Multimodal interfaces, ICMI '04, pp.205-211, 2004.
DOI : 10.1145/1027933.1027968

O. Martin, I. Kotsia, B. Macq, and I. Pitas, The eNTERFACE05 Audio-Visual Emotion Database, Proceedings of the 22nd International Conference on Data Engineering Workshops (ICDEW'06), 2006.

Intel Corporation, Open Source Computer Vision Library: Reference Manual, 2006.

B. D. Lucas and T. Kanade, An iterative image registration technique with an application to stereo vision, Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), pp.674-679, 1981.

P. Boersma and D. Weenink, Praat: doing phonetics by computer, 2008.

J. Noble, Spoken emotion recognition with support vector machines, 2003.

E. Galmar and B. Huet, Analysis of vector space model and spatiotemporal segmentation for video indexing and retrieval, Proceedings of the 6th ACM international conference on Image and video retrieval, CIVR '07, 2007.
DOI : 10.1145/1282280.1282344

R. Benmokhtar and B. Huet, Multi-level Fusion for Semantic Video Content Indexing and Retrieval, 5th International Workshop on Adaptive Multimedia Retrieval, LIP6, 2007.
DOI : 10.1007/978-3-540-79860-6_13

C. Hsu, C. Chang, and C. Lin, A practical guide to support vector classification, 2003.

P. Aigrain and P. Joly, The automatic real-time analysis of film editing and transition effects and its applications, Computers & Graphics, vol.18, issue.1, pp.93-103, 1994.
DOI : 10.1016/0097-8493(94)90120-1

S. Ayache and G. Quénot, TRECVid 2007 collaborative annotation using active learning, TRECVid, 11th international workshop on video retrieval evaluation, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00953889

R. Benmokhtar and B. Huet, Classifier Fusion: Combination Methods For Semantic Indexing in Video Content, International conference on artificial neural networks, pp.65-74, 2006.
DOI : 10.1007/11840930_7

R. Benmokhtar and B. Huet, Neural Network Combining Classifier Based on Dempster-Shafer Theory for Semantic Indexing in Video Content, International multimedia modeling conference, pp.196-205, 2007.
DOI : 10.1007/978-3-540-69429-8_20

R. Benmokhtar and B. Huet, Perplexity-based evidential neural network classifier fusion using MPEG-7 low-level visual features, Proceedings of the 1st ACM international conference on Multimedia information retrieval, MIR '08, pp.336-341, 2008.
DOI : 10.1145/1460096.1460151

R. Benmokhtar and B. Huet, Hierarchical ontology-based robust video shots indexing using global MPEG-7 visual descriptors, Proceedings of the international workshop on content-based multimedia indexing, pp.195-200, 2009.

T. Berners-Lee, J. Hendler, and O. Lassila, The Semantic Web, Scientific American, vol.284, issue.5, pp.29-37, 2001.
DOI : 10.1038/scientificamerican0501-34

S. Chang, W. Chen, H. Meng, H. Sundaram, and D. Zhong, A fully automated content-based video search engine supporting spatiotemporal queries, IEEE Transactions on Circuits and Systems for Video Technology, pp.602-615, 1998.

T. Denoeux, An evidence-theoretic neural network classifier, International conference on systems, man and cybernetics, pp.712-717, 1995.

N. Dimitrova, Multimedia Content Analysis: The Next Wave, International conference on image and video retrieval. Lecture notes in computer science, pp.8-17, 2003.
DOI : 10.1007/3-540-45113-7_2

R. Duin and D. Tax, Experiments with classifier combining rules, Proc. first int. workshop MCS, pp.16-29, 2000.

C. Faloutsos, R. Barber, M. Flickner, J. Hafner, W. Niblack et al., Efficient and effective Querying by Image Content, Journal of Intelligent Information Systems, vol.2, issue.6, pp.231-262, 1994.
DOI : 10.1007/BF00962238

J. Fan, Y. Gao, and H. Luo, Hierarchical classification for automatic image annotation, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '07, pp.111-118, 2007.
DOI : 10.1145/1277741.1277763

J. Gao, J. Goodman, M. Li, and K. Lee, Toward a unified approach to statistical language modeling for Chinese, ACM Transactions on Asian Language Information Processing, vol.1, issue.1, 2001.
DOI : 10.1145/595576.595578

ISO/IEC, Coding of moving pictures and associated audio information, Information Technology, pp.14496-14498, 2001.

A. Jain, R. Duin, and J. Mao, Statistical pattern recognition: a review, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.22, issue.1, pp.4-37, 2000.
DOI : 10.1109/34.824819

J. Jiang and D. Conrath, Semantic similarity based on corpus statistics and lexical taxonomy, 1997.

W. Jiang, C. Cotton, S. Chang, D. Ellis, and A. Loui, Short-term audio-visual atoms for generic video concept classification, Proceedings of the 17th ACM international conference on Multimedia, MM '09, pp.5-14, 2009.
DOI : 10.1145/1631272.1631277

E. Kasutani and A. Yamada, The MPEG-7 color layout descriptor: a compact image feature description for high-speed image/video segment retrieval, Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205), pp.674-677, 2001.
DOI : 10.1109/ICIP.2001.959135

M. Koskela and A. Smeaton, Clustering-Based Analysis of Semantic Concept Models for Video Shots, 2006 IEEE International Conference on Multimedia and Expo, pp.45-48, 2006.
DOI : 10.1109/ICME.2006.262546

M. Koskela, A. Smeaton, and J. Laaksonen, Measuring Concept Similarities in Multimedia Ontologies: Analysis and Evaluations, IEEE Transactions on Multimedia, vol.9, issue.5, pp.912-922, 2007.
DOI : 10.1109/TMM.2007.900137

S. Kotsiantis, Supervised machine learning: a review of classification techniques, Informatica, vol.31, pp.249-268, 2007.

L. Kuncheva, "Fuzzy" versus "nonfuzzy" in combining classifiers designed by boosting, IEEE Transactions on Fuzzy Systems, vol.11, issue.6, pp.729-741, 2003.
DOI : 10.1109/TFUZZ.2003.819842

L. Kuncheva, J. Bezdek, and R. Duin, Decision templates for multiple classifier fusion: an experimental comparison, Pattern Recognition, vol.34, issue.2, pp.299-314, 2001.
DOI : 10.1016/S0031-3203(99)00223-X

J. Laaksonen, M. Koskela, and E. Oja, Class distributions on SOM surfaces for feature extraction and object retrieval, Neural Networks, vol.17, issue.8-9, pp.1121-1133, 2004.
DOI : 10.1016/j.neunet.2004.07.007

B. Li and K. Goh, Confidence-based dynamic ensemble for image annotation and semantics discovery, Proceedings of the eleventh ACM international conference on Multimedia, MULTIMEDIA '03, pp.195-206, 2003.
DOI : 10.1145/957013.957051

Y. Li, Z. Bandar, and D. McLean, An approach for measuring semantic similarity between words using multiple information sources, IEEE Trans Knowl Data Eng, vol.15, issue.4, pp.871-882, 2003.

D. Lin, An information-theoretic definition of similarity, Proceedings of the 15th international conference on machine learning, pp.296-304, 1998.

B. Manjunath, P. Salembier, and T. Sikora, Introduction to MPEG-7: multimedia content description interface, 2002.

D. Messing, P. Beek, and J. Errico, The MPEG-7 color structure descriptor: image description using color and local spatial information, Proceedings of the IEEE international conference on image processing, pp.670-673, 2001.

M. Naphade, I. Kozintsev, and T. Huang, Probabilistic semantic video indexing, Proceedings of neural information processing systems, pp.967-973, 2000.

M. Naphade, T. Kristjansson, B. Frey, and T. Huang, Probabilistic multimedia objects (multijects): a novel approach to video indexing and retrieval in multimedia systems, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269), pp.536-540, 1998.
DOI : 10.1109/ICIP.1998.999041

M. Naphade, L. Kennedy, J. Kender, S. Chang, J. Smith et al., A light scale concept ontology for multimedia understanding for TRECVid, 2005.

D. Park, Y. Jeon, and C. Won, Efficient use of local edge histogram descriptor, Proceedings of the 2000 ACM workshops on Multimedia, MULTIMEDIA '00, pp.51-54, 2000.
DOI : 10.1145/357744.357758

A. Pentland, R. Picard, and S. Sclaroff, Photobook: content-based manipulation of image databases, Proceedings of SPIE conference on storage and retrieval for image and video databases.

R. Rada, H. Mili et al., Development and application of a metric on semantic nets, IEEE Trans Syst Man Cybern, vol.19, issue.1, pp.17-30, 1989.

P. Resnik, Using information content to evaluate semantic similarity in a taxonomy, Proceedings of the 14th international joint conference on artificial intelligence, pp.448-453, 1995.

N. Seco, T. Veale, and J. Hayes, An intrinsic information content metric for semantic similarity in WordNet, Proceedings of European conference on artificial intelligence, 2004.

T. Slimani, B. Benyaghlane, and K. Mellouli, Une extension de mesure de similarité entre les concepts d'une ontologie, International conference on sciences of electronic, technologies of information and telecommunications, pp.1-10, 2007.

J. Smith and S. Chang, VisualSEEk, Proceedings of the fourth ACM international conference on Multimedia, MULTIMEDIA '96, pp.87-98, 1996.
DOI : 10.1145/244130.244151

C. Snoek and M. Worring, Multimodal Video Indexing: A Review of the State-of-the-art, Multimedia Tools and Applications, vol.25, issue.1, pp.5-35, 2005.
DOI : 10.1023/B:MTAP.0000046380.27575.a5

C. Snoek, M. Worring, J. Geusebroek, D. Koelma, and F. Seinstra, The MediaMill TRECVid 2004 semantic video search engine, TREC video retrieval evaluation online proceedings, 2004.

F. Souvannavong, Indexation et recherche de plans vidéo par le contenu sémantique, 2005.

X. Sun, B. Manjunath, and A. Divakaran, Representation of motion activity in hierarchical levels for video indexing and filtering, Proceedings of the IEEE international conference on image processing, pp.149-152, 2002.

C. Tsinaraki, P. Polydoros, and S. Christodoulakis, Interoperability Support for Ontology-Based Video Retrieval Applications, Proceedings of the third international conference on image and video retrieval, 2004.
DOI : 10.1007/978-3-540-27814-6_68

V. Vapnik, The nature of statistical learning theory, 2000.

S. Vembu, M. Kiesel, M. Sintek, and S. Baumann, Towards bridging the semantic gap in multimedia annotation and retrieval, Proceedings of the 1st international workshop on semantic web annotations for multimedia, 2006.

H. Wactlar, T. Kanade, M. Smith, and S. Stevens, Intelligent access to digital video: Informedia project, Computer, vol.29, issue.5, 1996.
DOI : 10.1109/2.493456

Z. Wu and M. Palmer, Verb semantics and lexical selection, Proceedings of the 32nd annual meeting on Association for Computational Linguistics, pp.133-138, 1994.
DOI : 10.3115/981732.981751

Y. Wu, B. Tseng, and J. Smith, Ontology-based multi-classification learning for video concept detection, 2004 IEEE International Conference on Multimedia and Expo (ICME) (IEEE Cat. No.04TH8763), pp.1003-1006, 2004.
DOI : 10.1109/ICME.2004.1394372

F. Xu and Y. Zhang, Evaluation and comparison of texture descriptors proposed in MPEG-7, Journal of Visual Communication and Image Representation, vol.17, issue.4, pp.701-716, 2006.
DOI : 10.1016/j.jvcir.2005.10.002

L. Xu, A. Krzyzak, and C. Suen, Methods of combining multiple classifiers and their applications to handwriting recognition, IEEE Transactions on Systems, Man, and Cybernetics, vol.22, issue.3, pp.418-435, 1992.
DOI : 10.1109/21.155943

Y. Deng and B. Manjunath, NeTra-V: toward an object-based video representation, Proceedings of IEEE conference of multimedia and expo, pp.616-627, 1998.
DOI : 10.1109/76.718508

Dailymotion.

flickr.com/.

A. Amir, J. O. Argillander, M. Berg et al., IBM research TRECVID-2004 video retrieval system, NIST TRECVID 2004 Workshop, 2004.

J. Cao, Y. Zhang, Y. Song, Z. Chen, X. Zhang et al., MCG-WEBV: A benchmark dataset for web video analysis, 2009.

C. Chang and C. Lin, LIBSVM, ACM Transactions on Intelligent Systems and Technology, vol.2, issue.3, 2011.
DOI : 10.1145/1961189.1961199

T. Chua, J. Tang, R. Hong, H. Li, Z. Luo et al., NUS-WIDE, Proceedings of the ACM International Conference on Image and Video Retrieval, CIVR '09, 2009.
DOI : 10.1145/1646396.1646452

S. Feng, R. Manmatha, and V. Lavrenko, Multiple Bernoulli relevance models for image and video annotation, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004., pp.1002-1009, 2004.
DOI : 10.1109/CVPR.2004.1315274

G. Griffin, A. Holub, and P. Perona, Caltech-256 object category dataset, 2007.

R. Hong, G. Li, L. Nie, J. Tang, and T. Chua, Explore large scale data for multimedia QA, ACM conference on Image and Video Retrieval, 2010.

W. Jiang, C. Cotton, S. Chang, D. Ellis, and A. C. Loui, Short-term audio-visual atoms for generic video concept classification, Proceedings of the 17th ACM international conference on Multimedia, MM '09, 2009.
DOI : 10.1145/1631272.1631277

Y. Ke, R. Sukthankar, and M. Hebert, Event Detection in Crowded Videos, 2007 IEEE 11th International Conference on Computer Vision, 2007.
DOI : 10.1109/ICCV.2007.4409011

L. Li, G. Wang, and L. Fei-Fei, OPTIMOL: automatic online picture collection via incremental model learning, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.1-8, 2007.

X. Liu and B. Huet, Automatic Concept Detector Refinement for Large-Scale Video Semantic Annotation, 2010 IEEE Fourth International Conference on Semantic Computing, 2010.
DOI : 10.1109/ICSC.2010.15

C. D. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval, 2008.
DOI : 10.1017/CBO9780511809071

C. Schuldt, I. Laptev, and B. Caputo, Recognizing human actions: a local SVM approach, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004., 2004.
DOI : 10.1109/ICPR.2004.1334462

M. Chen, M. Christel, A. Hauptmann, and H. Wactlar, Putting active learning into multimedia applications: Dynamic definition and refinement of concept classifiers, Proceedings of ACM Multimedia, pp.902-911, 2005.

Z. Zha, T. Mei, J. Wang, Z. Wang, and X. Hua, Graph-based semi-supervised learning with multi-label, Journal of Visual Communication and Image Representation, vol.20, issue.5, pp.97-103, 2009.

X. Zhu, Semi-supervised learning literature survey, 2006.

Y. Arase, X. Xie, T. Hara, and S. Nishio, Mining people's trips from large scale geo-tagged photos, Proceedings of the international conference on Multimedia, MM '10, pp.133-142, 2010.
DOI : 10.1145/1873951.1873971

H. Becker, M. Naaman, and L. Gravano, Event Identification in Social Media, 12th International Workshop on the Web and Databases (WebDB'09), 2009.

H. Becker, M. Naaman, and L. Gravano, Learning similarity metrics for event identification in social media, Proceedings of the third ACM international conference on Web search and data mining, WSDM '10, pp.291-300, 2010.
DOI : 10.1145/1718487.1718524

R. Datta, D. Joshi, J. Li, and J. Z. Wang, Image retrieval, ACM Computing Surveys, vol.40, issue.2, 2008.
DOI : 10.1145/1348246.1348248

A. Fialho, R. Troncy, L. Hardman, C. Saathoff, and A. Scherp, What's on this evening? Designing User Support for Event-based Annotation and Exploration of Media, 1st International Workshop on EVENTS - Recognising and tracking events on the Web and in real life, pp.40-54, 2010.

M. Hearst, Search User Interfaces, 2009.
DOI : 10.1017/CBO9781139644082

J. Hobbs and F. Pan, Time Ontology in OWL, W3C Working Draft, 2006.

L. Kennedy and M. Naaman, Less talk, more rock, Proceedings of the 18th international conference on World wide web, WWW '09, pp.311-320, 2009.
DOI : 10.1145/1526709.1526752

D. Liu, X. Hua, M. Wang, and H. Zhang, Image retagging, Proceedings of the international conference on Multimedia, MM '10, pp.491-500, 2010.
DOI : 10.1145/1873951.1874031

Y. Raimond, S. Abdallah, M. Sandler, and F. Giasson, The Music Ontology, 8th International Conference on Music Information Retrieval (ISMIR'07), 2007.

R. Shaw, R. Troncy, and L. Hardman, LODE: Linking Open Descriptions of Events, 4th Asian Semantic Web Conference, 2009.
DOI : 10.1007/978-3-642-10871-6_11

J. Tang, S. Yan, R. Hong, G. Qi, and T. Chua, Inferring semantic concepts from community-contributed images and noisy tags, Proceedings of the 17th ACM international conference on Multimedia, MM '09, pp.223-232, 2009.
DOI : 10.1145/1631272.1631305

R. Troncy, A. Fialho, L. Hardman, and C. Saathoff, Experiencing Events through User-Generated Media, 1st International Workshop on Consuming Linked Data (COLD'10), 2010.

R. Troncy, B. Malocha, and A. Fialho, Linking events with media, Proceedings of the 6th International Conference on Semantic Systems, I-SEMANTICS '10, 2010.
DOI : 10.1145/1839707.1839759

U. Westermann and R. Jain, Toward a Common Event Model for Multimedia Applications, IEEE Multimedia, vol.14, issue.1, pp.19-29, 2007.
DOI : 10.1109/MMUL.2007.23

Y. Zheng, M. Zhao, Y. Song, H. Adam, U. Buddemeier et al., Tour the world, Proceedings of the 17th ACM international conference on Multimedia, MM '09, 2009.
DOI : 10.1145/1631272.1631468