A. E. Abduraman, Structuration intra-programme de contenus TV, 2013.
URL : https://hal.archives-ouvertes.fr/tel-01136587

A. E. Abduraman, S. Berrani, and B. Merialdo, An unsupervised approach for recurrent TV program structuring, Proceedings of the 9th international interactive conference on Interactive television, EuroITV '11, pp.123-126, 2011.
DOI : 10.1145/2000119.2000143

A. E. Abduraman, S. Berrani, and B. Merialdo, Audio/visual recurrences and decision trees for unsupervised TV program structuring, The 8th International Conference on Computer Vision Theory and Applications, pp.701-708, 2013.

A. V. Aho, Algorithms for finding patterns in strings, Algorithms and Complexity, vol.1, p.255, 2014.

N. Ancona, C. Cicirelli, A. Branca, and A. Distante, Goal detection in football by using support vector machines for classification, IJCNN'01. International Joint Conference on Neural Networks. Proceedings (Cat. No.01CH37222), pp.611-616, 2001.
DOI : 10.1109/IJCNN.2001.939092

A. Barjatya, Block matching algorithms for motion estimation, IEEE Transactions on Evolutionary Computation, vol.8, issue.3, pp.225-239, 2004.

M. Ben and G. Gravier, Unsupervised mining of audiovisually consistent segments in videos with application to structure analysis, 2011 IEEE International Conference on Multimedia and Expo, pp.1-6, 2011.
DOI : 10.1109/ICME.2011.6011951
URL : https://hal.archives-ouvertes.fr/hal-00646603

S. Berrani, G. Manson, and P. Lechat, A non-supervised approach for repeated sequence detection in TV broadcast streams, Signal Processing: Image Communication, pp.525-537, 2008.
DOI : 10.1016/j.image.2008.04.018

M. Bertini, A. D. Bimbo, and P. Pala, Content-based indexing and retrieval of TV news, Pattern Recognition Letters, vol.22, issue.5, pp.503-516, 2001.
DOI : 10.1016/S0167-8655(00)00113-6

Z. Botev, J. Grotowski, and D. Kroese, Kernel density estimation via diffusion, The Annals of Statistics, pp.2916-2957, 2010.

M. Broilo, A. Basso, and F. G. De Natale, Unsupervised anchorpersons differentiation in news video, 2011 9th International Workshop on Content-Based Multimedia Indexing (CBMI), pp.115-120, 2011.
DOI : 10.1109/CBMI.2011.5972531

R. Brunelli, O. Mich, and C. M. Modena, A Survey on the Automatic Indexing of Video Data, Journal of Visual Communication and Image Representation, vol.10, issue.2, pp.78-112, 1999.
DOI : 10.1006/jvci.1997.0404

V. Brunie, J. Carrive, and L. Vinet, Ingénierie des documents audiovisuels : le projet FERIA. Une approche centrée sur la description des contenus, Techniques et sciences informatiques, vol.25, issue.4, pp.469-496, 2006.
DOI : 10.3166/tsi.25.469-496

L. Catanese, N. Souviraà-Labastie, B. Qu, S. Campion, G. Gravier et al., MODIS: an audio motif discovery software, Annual Conference of the International Speech Communication Association, pp.2675-2677, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00931227

W. Chai, Semantic segmentation and summarization of music: methods based on tonality and recurrent structure, IEEE Signal Processing Magazine, vol.23, issue.2, pp.124-132, 2006.
DOI : 10.1109/MSP.2006.1598088

Y. Chang, P. Lin, S. Cheng, K. Chan, Y. Zeng et al., Robust anchorperson detection based on audio streams using a hybrid I-vector and DNN system, Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific, pp.1-4, 2014.
DOI : 10.1109/APSIPA.2014.7041717

H. Chen, S. S. Tsai, G. Schroth, D. M. Chen, R. Grzeszczuk et al., Robust text detection in natural images with edge-enhanced Maximally Stable Extremal Regions, 2011 18th IEEE International Conference on Image Processing, pp.2609-2612, 2011.
DOI : 10.1109/ICIP.2011.6116200

S. Chen, M. Shyu, M. Chen, and C. Zhang, A multimodal data mining framework for soccer goal detection based on decision tree logic, IEEE International Conference on Multimedia and Expo, pp.265-268, 2004.
DOI : 10.1504/IJCAT.2006.012001

K. Choroś, Automatic Fast Detection of Anchorperson Shots in Temporally Aggregated TV News Videos, Intelligent Information and Database Systems, pp.339-348, 2015.
DOI : 10.1007/978-3-319-15705-4_33

G. E. Crooks, G. Hon, J. Chandonia, and S. E. Brenner, WebLogo: A Sequence Logo Generator, Genome Research, vol.14, issue.6, pp.1188-1190, 2004.
DOI : 10.1101/gr.849004

M. Delakis, G. Gravier, and P. Gros, Audiovisual integration with Segment Models for tennis video parsing, Computer Vision and Image Understanding, vol.111, issue.2, pp.142-154, 2008.
DOI : 10.1016/j.cviu.2007.09.002
URL : https://hal.archives-ouvertes.fr/inria-00568073

A. Dielmann, Unsupervised detection of multimodal clusters in edited recordings, 2010 IEEE International Workshop on Multimedia Signal Processing, pp.177-182, 2010.
DOI : 10.1109/MMSP.2010.5662015

N. Dimitrova, H. Zhang, B. Shahraray, I. Sezan, T. Huang et al., Applications of video-content analysis and retrieval, IEEE Multimedia, vol.9, issue.3, pp.42-55, 2002.
DOI : 10.1109/MMUL.2002.1022858

Y. Dong, G. Qin, G. Xiao, S. Lian, and X. Chang, Advanced news video parsing via visual characteristics of anchorperson scenes, Telecommunication Systems, vol.24, issue.60, pp.247-263, 2013.
DOI : 10.1007/s11235-013-9731-0

E. Dumont and G. Quénot, Automatic Story Segmentation for TV News Video Using Multiple Modalities, International Journal of Digital Multimedia Broadcasting, vol.11, issue.1, 2012.
DOI : 10.1016/S0167-6393(01)00061-9
URL : https://hal.archives-ouvertes.fr/hal-00767035

S. Eickeler and S. Muller, Content-based video indexing of TV broadcast news using hidden Markov models, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258), pp.2997-3000, 1999.
DOI : 10.1109/ICASSP.1999.757471

S. Eickeler, F. Wallhoff, U. Lurgel, and G. , Content based indexing of images and video using face detection and recognition methods, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), pp.1505-1508, 2001.
DOI : 10.1109/ICASSP.2001.941217

A. Ekin, A. M. Tekalp, and R. Mehrotra, Automatic soccer video analysis and summarization, IEEE Transactions on Image Processing, vol.12, issue.7, pp.796-807, 2003.
DOI : 10.1109/TIP.2003.812758

M. A. Fischler and R. C. Bolles, Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Communications of the ACM, vol.24, issue.6, pp.381-395, 1981.
DOI : 10.1145/358669.358692

X. Gao and X. Tang, Unsupervised video-shot segmentation and model-free anchorperson detection for news video story parsing, IEEE Transactions on Circuits and Systems for Video Technology, pp.765-776, 2002.

J. M. Gauch and A. Shivadas, Identification of new commercials using repeated video sequence detection, IEEE International Conference on Image Processing 2005, p.1252, 2005.
DOI : 10.1109/ICIP.2005.1530626

B. Günsel, A. M. Ferman, and A. M. Tekalp, Temporal video segmentation using unsupervised clustering and semantic object tracking, Journal of Electronic Imaging, vol.7, issue.3, pp.592-604, 1998.

A. Guéziec, Tracking pitches for broadcast television, Computer, vol.35, issue.3, pp.38-43, 2002.
DOI : 10.1109/2.989928

A. Hanjalic, R. Lagendijk, and J. Biemond, Template-based detection of anchorperson shots in news programs, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269), pp.148-152, 1998.
DOI : 10.1109/ICIP.1998.727156

J. E. Hopcroft, Introduction to automata theory, languages, and computation, 1979.

A. Jacobs, Using Self-similarity Matrices for Structure Mining on News Video, Advances in Artificial Intelligence, pp.87-94, 2006.
DOI : 10.1007/11752912_11

G. Jaffré and P. Joly, Costume: A new feature for automatic video content indexing, Recherche d'Information Assistée par Ordinateur, pp.314-325, 2004.

D. B. Jayagopi, S. Ba, J. Odobez, and D. Gatica-Perez, Predicting two facets of social verticality in meetings from five-minute time slices and nonverbal cues, Proceedings of the 10th international conference on Multimodal interfaces, ICMI '08, pp.45-52, 2008.
DOI : 10.1145/1452392.1452403

P. Ji, L. Cao, X. Zhang, L. Zhang, and W. Wu, News videos anchor person detection by shot clustering, Neurocomputing, vol.123, pp.86-99, 2014.
DOI : 10.1016/j.neucom.2013.06.003

E. Kijak, G. Gravier, P. Gros, L. Oisel, and F. Bimbot, HMM based structuring of tennis videos using visual and audio cues, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698), p.309, 2003.
DOI : 10.1109/ICME.2003.1221310

E. Kijak, G. Gravier, L. Oisel, and P. Gros, Audiovisual integration for tennis broadcast structuring, Multimedia Tools and Applications, pp.289-311, 2006.
DOI : 10.1007/s11042-006-0031-5
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.107.3587

M. Larkin, G. Blackshields, N. Brown, R. Chenna, P. A. McGettigan et al., Clustal W and Clustal X version 2.0, Bioinformatics, vol.23, issue.21, pp.2947-2948, 2007.
DOI : 10.1093/bioinformatics/btm404
URL : https://hal.archives-ouvertes.fr/hal-00206210

P. Letessier, O. Buisson, and A. Joly, Scalable mining of small visual objects, Proceedings of the 20th ACM international conference on Multimedia, MM '12, 2012.
DOI : 10.1145/2393347.2393431
URL : https://hal.archives-ouvertes.fr/hal-00739735

H. Li, J. Tang, S. Wu, Y. Zhang, and S. Lin, Automatic detection and analysis of player action in moving background sports video sequences, IEEE Transactions on Circuits and Systems for Video Technology, vol.20, issue.3, pp.351-364, 2010.

Z. Li, B. Ding, J. Han, R. Kays, and P. Nye, Mining periodic behaviors for moving objects, Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '10, pp.1099-1108, 2010.
DOI : 10.1145/1835804.1835942

B. Logan and S. Chu, Music summarization using key phrases, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), pp.749-752, 2000.
DOI : 10.1109/ICASSP.2000.859068

L. Lu, M. Wang, and H. Zhang, Repeating pattern discovery and structure analysis from acoustic music data, Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval, MIR '04, pp.275-282, 2004.
DOI : 10.1145/1026711.1026756

A. Mittal, L. Cheong, and L. T. Sing, Robust identification of gradual shot-transition types, International Conference on Image Processing, p.413, 2002.

A. Muscariello, G. Gravier, and F. Bimbot, An efficient method for the unsupervised discovery of signalling motifs in large audio streams, 2011 9th International Workshop on Content-Based Multimedia Indexing (CBMI), pp.145-150, 2011.
DOI : 10.1109/CBMI.2011.5972536
URL : https://hal.archives-ouvertes.fr/inria-00572817

A. Muscariello, G. Gravier, and F. Bimbot, Unsupervised Motif Acquisition in Speech via Seeded Discovery and Template Matching Combination, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.7, pp.2031-2044, 2012.
DOI : 10.1109/TASL.2012.2194283
URL : https://hal.archives-ouvertes.fr/hal-00740978

X. Naturel, G. Gravier, and P. Gros, Fast Structuring of Large Television Streams Using Program Guides, Adaptive Multimedia Retrieval: User, Context, and Feedback, pp.222-231, 2007.
DOI : 10.1007/978-3-540-71545-0_17

C. Panagiotakis and G. Tziritas, A speech/music discriminator based on RMS and zero-crossings, IEEE Transactions on Multimedia, vol.7, issue.1, pp.155-166, 2005.
DOI : 10.1109/TMM.2004.840604

N. V. Patel and I. K. Sethi, Video shot detection and characterization for video databases, Pattern Recognition, vol.30, issue.4, pp.583-592, 1997.
DOI : 10.1016/S0031-3203(96)00114-8

J. Poli, Structuration automatique de flux télévisuels, 2007.

J. Poli and J. Carrive, Modeling Television Schedules for Television Stream Structuring, Advances in Multimedia Modeling, pp.680-689, 2006.
DOI : 10.1007/978-3-540-69423-6_66

K. M. Pua, J. M. Gauch, S. E. Gauch, and J. Z. Miadowicz, Real time repeated video sequence identification, Computer Vision and Image Understanding, vol.93, issue.3, pp.310-327, 2004.
DOI : 10.1016/j.cviu.2003.10.005

B. Qu, F. Vallet, J. Carrive, and G. Gravier, Content-based inference of hierarchical structural grammar for recurrent TV programs using multiple sequence alignment, 2014 IEEE International Conference on Multimedia and Expo (ICME), pp.1-6, 2014.
DOI : 10.1109/ICME.2014.6890295
URL : https://hal.archives-ouvertes.fr/hal-01026335

B. Qu, F. Vallet, J. Carrive, and G. Gravier, Using grammar induction to discover the structure of recurrent TV programs, International Conferences on Advances in Multimedia, pp.112-117, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01026331

B. Qu, F. Vallet, J. Carrive, and G. Gravier, Content-Based Discovery of Multiple Structures from Episodes of Recurrent TV Programs Based on Grammatical Inference, ACM International Conference on Multimedia Modeling, pp.140-154, 2015.
DOI : 10.1007/978-3-319-14445-0_13
URL : https://hal.archives-ouvertes.fr/hal-01089237

Z. Rasheed and M. Shah, Detection and representation of scenes in videos, IEEE Transactions on Multimedia, vol.7, issue.6, pp.1097-1105, 2005.
DOI : 10.1109/TMM.2005.858392

S. Ray and R. H. Turi, Determination of number of clusters in k-means clustering and application in colour image segmentation, The 4th International Conference on Advances in Pattern Recognition and Digital Techniques, pp.137-143, 1999.

N. Saitou and M. Nei, The neighbor-joining method: a new method for reconstructing phylogenetic trees, Molecular Biology and Evolution, vol.4, issue.4, pp.406-425, 1987.

M. Sipser, Introduction to the Theory of Computation, Cengage Learning, 2012.
DOI : 10.1145/230514.571645

C. G. Snoek and M. Worring, Multimodal Video Indexing: A Review of the State-of-the-art, Multimedia Tools and Applications, vol.25, issue.1, pp.5-35, 2005.
DOI : 10.1023/B:MTAP.0000046380.27575.a5

J. D. Thompson, D. G. Higgins, and T. J. Gibson, Improved sensitivity of profile searches through the use of sequence weights and gap excision, Bioinformatics, vol.10, issue.1, pp.19-29, 1994.
DOI : 10.1093/bioinformatics/10.1.19

J. D. Thompson, D. G. Higgins, and T. J. Gibson, CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice, Nucleic Acids Research, vol.22, issue.22, pp.4673-4680, 1994.
DOI : 10.1093/nar/22.22.4673

K. Thompson, Programming Techniques: Regular expression search algorithm, Communications of the ACM, vol.11, issue.6, pp.419-422, 1968.
DOI : 10.1145/363347.363387

F. Vallet, Structuration automatique de talk shows télévisés, 2011.

P. Viola and M. Jones, Robust real-time object detection, International Journal of Computer Vision, vol.4, pp.34-47, 2001.

J. Wang, L. Duan, Q. Liu, H. Lu, and J. S. Jin, A Multimodal Scheme for Program Segmentation and Representation in Broadcast Video Streams, IEEE Transactions on Multimedia, vol.10, issue.3, pp.393-408, 2008.
DOI : 10.1109/TMM.2008.917362

L. Xie, P. Xu, S. Chang, A. Divakaran, and H. Sun, Structure analysis of soccer video with domain knowledge and hidden Markov models, Pattern Recognition Letters, vol.25, issue.7, pp.767-775, 2004.
DOI : 10.1016/j.patrec.2004.01.005

X. Yang, Q. Tian, and P. Xue, Efficient Short Video Repeat Identification With Application to News Video Structure Analysis, IEEE Transactions on Multimedia, vol.9, issue.3, pp.600-609, 2007.
DOI : 10.1109/TMM.2006.889352

M. M. Yeung and B. Liu, Efficient matching and clustering of video shots, Proceedings., International Conference on Image Processing, pp.338-341, 1995.
DOI : 10.1109/ICIP.1995.529715

X. Yu, L. Li, and H. W. Leong, Interactive broadcast services for live soccer video based on instant semantics acquisition, Journal of Visual Communication and Image Representation, vol.20, issue.2, pp.117-130, 2009.
DOI : 10.1016/j.jvcir.2008.12.004

H. Zhang, Y. Gong, S. W. Smoliar, and S. Y. Tan, Automatic parsing of news video, The International Conference on Multimedia Computing and Systems, pp.45-54, 1994.

J. Zhang, J. Qiu, X. Wang, and L. Wu, Representation of the player action in sport videos, 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, pp.1-4, 2013.
DOI : 10.1109/APSIPA.2013.6694283

T. Zlitni, B. Bouaziz, and W. Mahdi, Automatic topics segmentation for TV news video using prior knowledge, Multimedia Tools and Applications, pp.1-28
DOI : 10.1007/s11042-015-2531-7
