Figure 5.3 - Example of manual annotation based on the transcription corresponding to Figure 5.2. Excerpt from Jérémie Bourdon's course.

Columns 5 and 6 give the number of key phrases annotated for the transcriptions and the slides, respectively. Column 7 gives the number of slides in each course, and column 8 the duration of each course. Finally, the last column indicates the source of the course.

The courses Introduction à l'informatique, Introduction à l'algorithmique, and Les fonctions are taught by the same teacher.

V. Bettenfeld, S. Mdhaffar, C. Choquet, and C. Piau-Toffolon, « Instrumentation of Classrooms Using Synchronous Speech Transcription », European Conference on Technology Enhanced Learning, pp.648-651, 2018.

V. Bettenfeld, S. Mdhaffar, C. Piau-Toffolon, and C. Choquet, « Instrumentation of learning situation using automated speech transcription : A prototyping approach », 11th International Conf on Computer, 2019.

S. Mdhaffar, A. Laurent, and Y. Estève, « Etude de performance des réseaux neuronaux récurrents dans le cadre de la campagne d'évaluation Multi-Genre Broadcast challenge 3 (MGB3) », XXXIIe Journées d'Etudes sur la Parole, 2018.

S. Mdhaffar, A. Laurent, and Y. Estève, « Le corpus PASTEL pour le traitement automatique de cours magistraux », 25e conférence sur le Traitement Automatique des Langues Naturelles, 2018.

S. Mdhaffar, Y. Estève, N. Hernandez, A. Laurent, and S. Quiniou, « Apport de l'adaptation automatique des modèles de langage pour la reconnaissance de la parole : évaluation qualitative extrinsèque dans un contexte de traitement de cours magistraux », 2019.

S. Mdhaffar, Y. Estève, N. Hernandez, A. Laurent, R. Dufour et al., « Qualitative evaluation of ASR adaptation in a lecture context : Application to the PASTEL corpus », Proc. Interspeech, pp.569-573, 2019.

S. Mdhaffar, Y. Estève, A. Laurent, N. Hernandez, R. Dufour et al., « Educational Corpus of Oral Courses : Annotation, Analysis and Case Study », 2020.

A. Masmoudi, S. Mdhaffar, R. Sellami, and L. Hadrich Belguith, « Automatic diacritics restoration for Tunisian dialect », ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), vol.18, no.3, p.28, 2019.

S. Mdhaffar, F. Bougares, Y. Estève, and L. Hadrich-Belguith, « Sentiment analysis of Tunisian dialects : Linguistic ressources and experiments », Proceedings of the third Arabic natural language processing workshop, pp.55-61, 2017.

References

A. Abdelali, F. Guzman, H. Sajjad, and S. Vogel, « The AMARA Corpus : Building Parallel Language Resources for the Educational Domain », Language Resources and Evaluation Conference (LREC), vol.14, pp.1044-1054, 2014.

B. Abdullah, I. Illina, and D. Fohr, « Dynamic Extension of ASR Lexicon Using Wikipedia Data », 2018 IEEE Spoken Language Technology Workshop (SLT), IEEE, pp.36-42, 2018.

M. Abe, « A Study on Speaker Individuality Control », 1992.

Y. Akita, Y. Tong, and T. Kawahara, « Language model adaptation for academic lectures using character recognition result of presentation slides », International Conference on Acoustics, Speech and Signal Processing, pp.5431-5435, 2015.

G. Alharbi and T. Hain, « Using Topic Segmentation Models for the Automatic Organisation of MOOCs resources », International Conference on Educational Data Mining (EDM), pp.524-527, 2015.

A. Allauzen and J.-L. Gauvain, « Adaptation automatique du modèle de langage d'un système de transcription de journaux parlés : Modélisation probabiliste du langage naturel », TAL. Traitement automatique des langues, vol.44, no.1, pp.11-31, 2003.

A. Allauzen and J.-L. Gauvain, « Diachronic vocabulary adaptation for broadcast news transcription », Ninth European Conference on Speech Communication and Technology, 2005.

T. Alumäe and M. Kurimo, « Domain adaptation of maximum entropy language models », pp.301-306, 2010.

W. Aransa, H. Schwenk, and L. Barrault, « Improving continuous space language models using auxiliary features », Proceedings of the 12th International Workshop on Spoken Language Translation, pp.151-158, 2015.

S. Arnold, R. Schneider, P. Cudré-Mauroux, F. A. Gers et al., « SECTOR : A Neural Model for Coherent Topic Segmentation and Classification », Transactions of the Association for Computational Linguistics, vol.7, pp.169-184, 2019.

O. Aubert and J. Jaeger, « Annotating Video with Open Educational Resources in a Flipped Classroom Scenario », OCWC Global Conference, 2014.

O. Aubert, Y. Prié, and C. Canellas, « Leveraging Video Annotations in Video-based e-Learning », Proceedings of the 6th International Conference on Computer Supported Education, pp.479-485, 2014.

E. Auer, A. Russel, H. Sloetjes, P. Wittenburg, O. Schreer et al., « ELAN as flexible annotation framework for sound and image processing detectors », Seventh conference on International Language Resources and Evaluation (LREC), European Language Resources Association (ELRA), pp.890-893, 2010.

C. Auzanne, J. S. Garofolo, J. G. Fiscus, and W. M. Fisher, « Automatic language model adaptation for spoken document retrieval », Content-Based Multimedia Information Access, vol.1, pp.132-141, 2000.

M. Bacchiani, M. Riley, B. Roark, and R. Sproat, « MAP adaptation of stochastic grammars », Computer Speech & Language, vol.20, no.1, pp.41-68, 2006.

P. Badjatiya, L. J. Kurisinkel, M. Gupta, and V. Varma, « Attention-based neural text segmentation », European Conference on Information Retrieval, Springer, pp.180-193, 2018.

K. Bain, S. H. Basson, and M. Wald, « Speech recognition in university classrooms : liberated learning project », Proceedings of the fifth international ACM conference on Assistive technologies, pp.192-196, 2002.

A. Balagopalan et al., « Automatic keyphrase extraction and segmentation of video lectures », 2012 IEEE International Conference on Technology Enhanced Education (ICTEE), pp.1-10, 2012.

V. Balasubramanian, S. G. Doraisamy, and N. K. Kanakarajan, « A multimodal approach for extracting content descriptive metadata from lecture videos », Journal of Intelligent Information Systems, vol.46, pp.121-145, 2016.

C. Barras, E. Geoffrois, Z. Wu, and M. Liberman, « Transcriber : a free tool for segmenting, labeling and transcribing speech », First international conference on language resources and evaluation (LREC), pp.1373-1376, 1998.

C. Barras, E. Geoffrois, Z. Wu, and M. Liberman, « Transcriber : development and use of a tool for assisting speech corpora production », Speech Communication, vol.33, pp.5-22, 2001.

L. Baum, « An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process », vol.3, pp.1-8, 1972.

F. Béchet, « LIA-PHON : Un système complet de phonétisation de textes », TAL. Traitement automatique des langues, vol.42, no.1, pp.47-67, 2001.

D. Beeferman, A. Berger, and J. Lafferty, « Text segmentation using exponential models », Second Conference on Empirical Methods in Natural Language Processing, 1997.

M. Behnke, A. V. Miceli Barone, R. Sennrich, V. Sosoni et al., « Improving machine translation of educational content via crowdsourcing », International Language Resources and Evaluation (LREC), European Language Resource Association, 2018.

P. Bell, H. Yamamoto, P. Swietojanski, Y. Wu, F. McInnes et al., « A lecture transcription system combining neural network acoustic and language models », INTERSPEECH, pp.3087-3091, 2013.

J. R. Bellegarda, « Exploiting latent semantic information in statistical language modeling », Proceedings of the IEEE, vol.88, no.8, pp.1279-1296, 2000.

J. R. Bellegarda, « Statistical language model adaptation : review and perspectives », Speech Communication, vol.42, no.1, pp.93-108, 2004.

M. A. Ben Jannet, « Évaluation adaptative des systèmes de transcription en contextes applicatifs », 2015.

M. A. Ben Jannet, O. Galibert, M. Adda-Decker, and S. Rosset, « How to evaluate ASR output for named entity recognition ? », Sixteenth Annual Conference of the International Speech Communication Association, 2015.

M. A. Ben Jannet, M. Adda-Decker, O. Galibert, J. Kahn et al., « ETER : a new metric for the evaluation of hierarchical named entity recognition », Ninth International Conference on Language Resources and Evaluation, 2014.

Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin, « A neural probabilistic language model », Journal of Machine Learning Research, vol.3, pp.1137-1155, 2003.

V. Bettenfeld, C. Choquet, and C. Piau-Toffolon, « Lecture instrumentation based on synchronous speech transcription », International Conference on Advanced Learning Technologies (ICALT), pp.11-15, 2018.

V. Bettenfeld, R. Crétin-Pirolli, and C. Choquet, « PASTEL : un environnement outillé exploitant la reconnaissance de la parole dans les situations de cours », 2018.

V. Bettenfeld, S. Mdhaffar, C. Choquet, and C. Piau-Toffolon, « Instrumentation of Classrooms Using Synchronous Speech Transcription », European Conference on Technology Enhanced Learning, pp.648-651, 2018.

V. Bettenfeld, R. Crétin-Pirolli, C. Piau-Toffolon, and C. Choquet, « Elaboration d'une méthodologie d'instrumentation pédagogique en contexte universitaire », 9ème Conférence sur les Environnements Informatiques pour l'Apprentissage Humain, 2019.

V. Bettenfeld, S. Mdhaffar, C. Piau-Toffolon, and C. Choquet, « Instrumentation of learning situation using automated speech transcription : A prototyping approach », 11th International Conf on Computer, 2019.

B. Bigi, R. De Mori, and T. Spriet, « Reconnaissance thématique à partir de textes dictés et Adaptation dynamique de modèles de langages thématiques », Journées d'Etudes sur la Parole (JEP), 2000.

D. M. Blei, A. Y. Ng, and M. I. Jordan, « Latent Dirichlet allocation », Journal of Machine Learning Research, vol.3, pp.993-1022, 2003.

A. Bouchekif, G. Damnati, Y. Estève, D. Charlet, and N. Camelin, « Diachronic semantic cohesion for topic segmentation of TV broadcast news », Sixteenth Annual Conference of the International Speech Communication Association, 2015.

A. Bouchekif, « Structuration automatique de documents audio », 2016.

F. Bougares, « Attelage de systèmes de transcription automatique de la parole », 2012.

H. Bourlard and C. J. Wellekens, « Links between Markov models and multilayer perceptrons », Advances in neural information processing systems, pp.502-510, 1989.

H. Bourlard, M. Ferras, N. Pappas, A. Popescu-Belis, S. Renals et al., « Processing and linking audio events in large multimedia archives : The EU inEvent project », First Workshop on Speech, Language and Audio in Multimedia, 2013.

G. Brown and G. Yule, Discourse analysis, 1983.

P. F. Brown, P. V. deSouza, R. L. Mercer et al., « Class-based n-gram models of natural language », Computational Linguistics, vol.18, no.4, pp.467-479, 1992.

C. Canellas, O. Aubert, and Y. Prié, « Prise de note collaborative en vue d'une tâche : une étude exploratoire avec COCoNotes Live », 2015.

J. A. Silvestre-Cerdà et al., « TransLectures », IberSPEECH 2012 - VII Jornadas en Tecnología del Habla and III Iberian SLTech Workshop, pp.345-351, 2012.

P. Cerva, J. Silovsky, J. Zdansky, J. Nouza, and J. Malek, « Real-time lecture transcription using ASR for Czech hearing impaired or deaf students », Thirteenth Annual Conference of the International Speech Communication Association, 2012.

C. Chelba, T. J. Hazen, and M. Saraclar, « Retrieval and browsing of spoken content », IEEE Signal Processing Magazine, vol.25, pp.39-49, 2008.

C. Chelba and F. Jelinek, « Structured language modeling », Computer Speech & Language, vol.14, pp.283-332, 2000.

C. Chelba et al., Structure and performance of a dependency language model, Technical report, SRI International, Menlo Park, CA, Speech Technology and Research Lab, 1997.

L. Chen and T. Huang, « An improved MAP method for language model adaptation », Sixth European Conference on Speech Communication and Technology, 1999.

S. F. Chen, « Shrinking exponential language models », Proceedings of Human Language Technologies : The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp.468-476, 2009.

S. F. Chen and J. Goodman, « An empirical study of smoothing techniques for language modeling », Computer Speech & Language, vol.13, no.4, pp.359-394, 1999.

X. Chen, T. Tan, X. Liu, P. Lanchantin, M. Wan et al., « Recurrent neural network language model adaptation for multi-genre broadcast speech recognition », Sixteenth Annual Conference of the International Speech Communication Association, 2015.

X. Chen, X. Liu, Y. Qian, M. J. F. Gales, and P. C. Woodland, « CUED-RNNLM - An open-source toolkit for efficient training and evaluation of recurrent neural network language models », 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp.6000-6004, 2016.

Y. Chen, Y. Huang, S. Kong, and L. Lee, « Automatic key term extraction from spoken course lectures using branching entropy and prosodic/semantic features », IEEE Spoken Language Technology Workshop, IEEE, pp.265-270, 2010.

E. Cho, C. Fügen, T. Herrmann, K. Kilgour, M. Mediani et al., « A real-world system for simultaneous translation of German lectures », INTERSPEECH, pp.3473-3477, 2013.

K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares et al., « Learning phrase representations using RNN encoder-decoder for statistical machine translation », Empirical Methods in Natural Language Processing, 2014.

F. Y. Y. Choi, « Advances in domain independent linear text segmentation », Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference, pp.26-33, 2000.

J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, « Empirical evaluation of gated recurrent neural networks on sequence modeling », Workshop on Deep Learning, 2014.

G. E. Dahl, D. Yu, L. Deng, and A. Acero, « Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition », IEEE Transactions on Audio, Speech, and Language Processing, vol.20, no.1, pp.30-42, 2011.

G. E. Dahl, D. Yu, L. Deng, and A. Acero, « Large vocabulary continuous speech recognition with context-dependent DBN-HMMs », ICASSP, pp.4688-4691, 2011.

S. Davis and P. Mermelstein, « Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences », IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.28, no.4, pp.357-366, 1980.

S. Deena, M. Hasan, M. Doulaty, O. Saz, and T. Hain, « Combining feature and model-based adaptation of RNNLMs for multi-genre broadcast speech recognition », Proceedings of the Annual Conference of the International Speech Communication Association, pp.2343-2347, 2016.

S. Deena, M. Hasan, M. Doulaty, O. Saz, and T. Hain, « Recurrent Neural Network Language Model Adaptation for Multi-Genre Broadcast Speech Recognition and Alignment », Speech and Language Processing, vol.27, pp.572-582, 2019.

C. Dorai, V. Oria, and V. Neelavalli, « Structuralizing educational videos based on presentation content », Proceedings 2003 International Conference on Image Processing, p.1029, 2003.

Z. Elloumi, « Prédiction de performances des systèmes de Reconnaissance Automatique de la Parole », 2019.

Y. Estève, « Intégration de sources de connaissances pour la modélisation stochastique du langage appliquée à la parole continue dans un contexte de dialogue oral homme-machine », 2002.

J. Fauconnier, L. Sorin, M. Kamel, M. Mojahid, and N. Aussenac-Gilles, « Détection automatique de la structure organisationnelle de documents à partir de marqueurs visuels et lexicaux », Actes de la Conférence sur le Traitement Automatique des Langues Naturelles, 2014.

M. Federico, « Efficient language model adaptation through MDI estimation », Sixth European Conference on Speech Communication and Technology, 1999.

M. Federico and N. Bertoldi, « Broadcast news LM adaptation over time », Computer Speech & Language, vol.18, no.4, pp.417-435, 2004.

D. Fetterly, M. Manasse, M. Najork, and J. L. Wiener, « A large-scale study of the evolution of Web pages », Software : Practice and Experience, vol.34, pp.213-237, 2004.

G. D. Forney, « The Viterbi algorithm », Proceedings of the IEEE, vol.61, no.3, pp.268-278, 1973.

C. Fügen, « A system for simultaneous translation of lectures and speeches », 2008.

C. Fügen, M. Wölfel, J. W. McDonough, S. Ikbal et al., « Advances in lecture recognition : The ISL RT-06S evaluation system », Ninth International Conference on Spoken Language Processing, 2006.

C. Fügen, S. Ikbal, F. Kraft, K. Kumatani, K. Laskowski et al., « The ISL RT-06S speech-to-text system », International Workshop on Machine Learning for Multimodal Interaction, pp.407-418, 2006.

A. Fujii, K. Itou, and T. Ishikawa, « Lodem : A system for on-demand video lectures », Speech Communication, vol.48, pp.516-531, 2006.

Y. Fujii, K. Yamamoto, N. Kitaoka, and S. Nakagawa, « Class lecture summarization taking into account consecutiveness of important sentences », Ninth Annual Conference of the International Speech Communication Association, 2008.

S. Furui, K. Maekawa, and H. Isahara, « A Japanese national project on spontaneous speech corpus and processing technology », ASR2000 - Automatic Speech Recognition : Challenges for the new Millenium, ISCA Tutorial and Research Workshop, 2000.

S. Furui, K. Iwano, C. Hori, T. Shinozaki, Y. Saito et al., « Ubiquitous speech processing », 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing Proceedings, vol.1, IEEE, pp.13-16, 2001.

O. Galibert and J. Kahn, « The first official REPERE evaluation », First Workshop on Speech, Language and Audio in Multimedia, 2013.

M. Galley, K. McKeown, E. Fosler-Lussier, and H. Jing, « Discourse segmentation of multi-party conversation », Proceedings of the 41st Annual Meeting on Association for Computational Linguistics, vol.1, pp.562-569, 2003.

S. Galliano, G. Gravier, and L. Chaubard, « The ESTER 2 evaluation campaign for the rich transcription of French radio broadcasts », Tenth Annual Conference of the International Speech Communication Association, 2009.

S. Galliano, E. Geoffrois, D. Mostefa, K. Choukri et al., « The ESTER phase II evaluation campaign for the rich transcription of French broadcast news », Ninth European Conference on Speech Communication and Technology, 2005.

A. Gelan, « Language and Text-to-Speech technologies for highly accessible language & culture learning », 2010.

M. Georgescul, A. Clark, and S. Armstrong, « Word distributions for thematic segmentation in a support vector machine approach », Proceedings of the Tenth Conference on Computational Natural Language Learning, Association for Computational Linguistics, pp.101-108, 2006.

S. Ghannay, « Étude sur les représentations continues de mots appliquées à la détection automatique des erreurs de reconnaissance de la parole », 2017.

D. Gildea and T. Hofmann, « Topic-based language models using EM », Sixth European Conference on Speech Communication and Technology, 1999.

H. Gish, M. Siu, and R. Rohlicek, « Segregation of speakers for speech recognition and speaker identification », [Proceedings] ICASSP 91 : 1991 International Conference on Acoustics, Speech, and Signal Processing, pp.873-876, 1991.

J. Glass, T. J. Hazen, L. Hetherington, and C. Wang, « Analysis and processing of lecture audio data : Preliminary investigations », Proceedings of the Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval at HLT-NAACL, pp.9-12, 2004.

J. Glass, T. J. Hazen, S. Cyphers, I. Malioutov, D. Huynh et al., « Recent progress in the MIT spoken lecture processing project », Eighth Annual Conference of the International Speech Communication Association, 2007.

J. R. Glass, T. J. Hazen, D. S. Cyphers, K. Schutte et al., « The MIT spoken lecture processing project », Proceedings of HLT/EMNLP on Interactive Demonstrations, pp.28-29, 2005.

S. Goldwater, D. Jurafsky, and C. D. Manning, « Which words are hard to recognize ? Prosodic, lexical, and disfluency factors that increase ASR error rates », Proceedings of ACL-08 : HLT, pp.380-388, 2008.

S. Goldwater, D. Jurafsky, and C. D. Manning, « Which words are hard to recognize ? Prosodic, lexical, and disfluency factors that increase speech recognition error rates », Speech Communication, vol.52, pp.181-200, 2010.

I. J. Good, « The population frequencies of species and the estimation of population parameters », Biometrika, vol.40, no.3-4, pp.237-264, 1953.

J. T. Goodman, « A bit of progress in language modeling », Computer Speech & Language, vol.15, no.4, pp.403-434, 2001.

G. Gravier, J. F. Bonastre, E. Geoffrois, S. Galliano, K. McTait et al., « ESTER, une campagne d'évaluation des systèmes d'indexation automatique d'émissions radiophoniques en français », Proc. Journées d'Etude sur la Parole (JEP), 2004.

F. Grézl, M. Karafiát, S. Kontár, and J. Černocký, « Probabilistic and bottle-neck features for LVCSR of meetings », 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP'07, p.757, 2007.

C. Guinaudeau, G. Gravier, and P. Sébillot, « Improving ASR-based topic segmentation of TV programs with confidence measures and semantic relations », Eleventh Annual Conference of the International Speech Communication Association, 2010.

C. Guinaudeau and J. Hirschberg, « Accounting for Prosodic Information to Improve ASR-Based Topic Tracking for TV Broadcast News », Twelfth Annual Conference of the International Speech Communication Association, 2011.

X. He, L. Deng, and A. Acero, « Why word error rate is not a good metric for speech recognizer training for the speech translation task ? », 2011 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.5632-5635, 2011.

M. A. Hearst, « TextTiling : Segmenting text into multi-paragraph subtopic passages », Computational Linguistics, vol.23, no.1, pp.33-64, 1997.

M. Hentschel, M. Delcroix, A. Ogawa, T. Iwata, and T. Nakatani, « A Unified Framework for Feature-based Domain Adaptation of Neural Network Language Models », ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.7250-7254, 2019.

M. Hentschel, M. Delcroix, A. Ogawa, T. Iwata, and T. Nakatani, « Feature Based Domain Adaptation for Neural Network Language Models with Factorised Hidden Layers », vol.102, pp.598-608, 2019.

H. Hermansky and L. A. Cox Jr, « Perceptual linear predictive (PLP) analysis-resynthesis technique », Second European Conference on Speech Communication and Technology, 1991.

H. Hermansky and S. Sharma, « Temporal patterns (TRAPS) in ASR of noisy speech », 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No. 99CH36258), vol.1, IEEE, pp.289-292, 1999.

H. Hermansky, N. Morgan, A. Bayya, and P. Kohn, « RASTA-PLP speech analysis technique », Proceedings ICASSP-92 : 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol.1, IEEE, pp.121-124, 1992.

N. Hernandez, « Description et détection automatique de structures de texte », 2004.

N. Hernandez and B. Grau, « Détection automatique de structures fines de texte », Actes de la 12e Conférence sur le Traitement Automatique des Langues Naturelles, 2005.

G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed et al., « Deep neural networks for acoustic modeling in speech recognition », IEEE Signal Processing Magazine, vol.29, 2012.

I. Ho, H. Kiyohara, A. Sugimoto, and K. Yana, « Enhancing global and synchronous distance learning and teaching by using instant transcript and translation », 2005 International Conference on Cyberworlds, p.5, 2005.

S. Hochreiter and J. Schmidhuber, « Long short-term memory », Neural Computation, vol.9, no.8, pp.1735-1780, 1997.

T. Hofmann, « Probabilistic latent semantic analysis », Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence, pp.289-296, 1999.

T. Hofmann, « Probabilistic latent semantic indexing », ACM SIGIR Forum, 2017.

B. Hsu, « Generalized linear interpolation of language models », 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), pp.136-140, 2007.

B. Hsu and J. Glass, « Style & topic language model adaptation using HMM-LDA », Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pp.373-381, 2006.

J. Huang, M. Westphal, S. Chen, O. Siohan, D. Povey et al., « IBM Rich Transcription Spring 2006 speech-to-text system for lecture meetings », International Workshop on Machine Learning for Multimodal Interaction, pp.432-443, 2006.

W.-Y. Hwang, R. Shadiev, T. C. T. Kuo, and N.-S. Chen, « Effects of speech-to-text recognition application on learning performance in synchronous cyber classrooms », Journal of Educational Technology & Society, vol.15, p.367, 2012.

A. Iglesias, L. Moreno, and J. Jiménez, « Supporting Teachers to Automatically Build Accessible Pedagogical Resources : The APEINTA Project », Technology Enhanced Learning. Quality of Teaching and Educational Reform, pp.620-624, 2010.

A. Iglesias, J. Jiménez, P. Revuelta, and L. Moreno, « Avoiding communication barriers in the classroom : the APEINTA project », Interactive Learning Environments, vol.24, no.4, pp.829-843, 2016.

D. Jansen, A. Alcala, and F. Guzman, « Amara : A sustainable, global solution for accessibility », International Conference on Universal Access in Human-Computer Interaction, pp.401-411, 2014.

F. Jelinek, R. L. Mercer, L. R. Bahl, and J. K. Baker, « Perplexity - a measure of the difficulty of speech recognition tasks », The Journal of the Acoustical Society of America, vol.62, p.63, 1977.

F. Jelinek, « Continuous speech recognition by statistical methods », Proceedings of the IEEE, vol.64, no.4, pp.532-556, 1976.

S. Joty, G. Carenini, G. Murray, and R. T. Ng, « Supervised topic segmentation of email conversations », Fifth International AAAI Conference on Weblogs and Social Media, 2011.

S. F. S. Juan, « Exploiting resources from closely-related languages for automatic speech recognition in low-resource languages from Malaysia », 2015.

H. Jung, H. V. Shin, and J. Kim, « DynamicSlide : Exploring the Design Space of Reference-based Interaction Techniques for Slide-based Lecture Videos », Proceedings of the 2018 Workshop on Multimedia for Accessible Human Computer Interface, pp.33-41, 2018.

J. Kahn, O. Galibert, L. Quintard, M. Carré, A. Giraudel et al., « A presentation of the REPERE challenge », 2012 10th International Workshop on Content-Based Multimedia Indexing (CBMI), pp.1-6, 2012.

M. Kan, « Linear Segmentation and Segment Significance », Proc. of WVLC-6, 1998.

T. Kawahara, Y. Nemoto, and Y. Akita, « Automatic lecture transcription by exploiting presentation slide information for language model adaptation », 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4929-4932, 2008.

A. Kellner, « Initial language models for spoken dialogue systems », Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP'98 (Cat. No. 98CH36181), vol.1, IEEE, pp.185-188, 1998.

H. H. Kim and Y. H. Kim, « Generic speech summarization of transcribed lecture videos : Using tags and their semantic relations », Journal of the Association for Information Science and Technology, vol.67, pp.366-379, 2016.

Y. Kim, Y. Jernite, D. Sontag, and A. M. Rush, « Character-aware neural language models », Thirtieth AAAI Conference on Artificial Intelligence, 2016.

B. Kingsbury, T. N. Sainath, and H. Soltau, « Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization », pp.10-13, 2012.

D. Klakow, « Log-linear interpolation of language models », Fifth International Conference on Spoken Language Processing, 1998.

R. Kneser and H. Ney, « Improved backing-off for m-gram language modeling », 1995 International Conference on Acoustics, Speech, and Signal Processing, 1995.

T. Ko, V. Peddinti, D. Povey, and S. Khudanpur, « Audio augmentation for speech recognition », Sixteenth Annual Conference of the International Speech Communication Association, 2015.

V. Kordoni, A. van den Bosch, K. L. Kermanidis et al., « Enhancing access to online education : Quality machine translation of MOOC content », Language Resources and Evaluation Conference (LREC), European Language Resources Association, 2016.

O. Koshorek, A. Cohen, N. Mor, M. Rotman, and J. Berant, « Text segmentation as a supervised learning task », Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics : Human Language Technologies, pp.469-473, 2018.

R. Kuhn and R. De Mori, « A cache-based natural language model for speech recognition », IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.12, no.6, pp.570-583, 1990.

S. Kullback, Information theory and statistics, 1997.

G. Kurata, B. Ramabhadran, G. Saon, and A. Sethy, « Language modeling with highway LSTM », 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp.244-251, 2017.

L. Lamel, G. Adda, E. Bilinski, and J.-L. Gauvain, « Transcribing lectures and seminars », Ninth European Conference on Speech Communication and Technology, 2005.

S. Lamprier, T. Amghar, B. Levrat, and F. Saubion, « On evaluation methodologies for text segmentation algorithms », 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007), vol.2, IEEE, pp.19-26, 2007.

R. Lau, R. Rosenfeld, and S. Roukos, « Trigger-based language models : A maximum entropy approach », 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.45-48, 1993.

G. Lecorvé, « Adaptation thématique non supervisée d'un système de reconnaissance automatique de la parole », 2010.

G. Lecorvé, G. Gravier, and P. Sébillot, « An unsupervised web-based topic language model adaptation method », Proc. of the International Conference on Acoustics, Speech, and Signal Processing, pp.5081-5084, 2008.

B. Lecouteux, « Reconnaissance automatique de la parole guidée par des transcriptions a priori », 2008.

Y. LeCun, B. E. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. E. Hubbard, and L. D. Jackel, « Handwritten digit recognition with a back-propagation network », Advances in neural information processing systems, pp.396-404, 1990.

J. Li, L. Deng, R. Haeb-Umbach, and Y. Gong, Robust automatic speech recognition : a bridge to practical applications, 2015.

M. Lin, « Automated Lecture Video Segmentation : Facilitate Content Browsing and Retrieval », 2006.

M. Lin, J. F. Nunamaker, M. Chau, and H. Chen, « Segmentation of lecture videos based on text : a method combining multiple linguistic features », 37th Annual Hawaii International Conference on System Sciences, 2004.

M. Lin, M. Chau, J. Cao, and J. F. Nunamaker Jr, « Automated video segmentation for lecture videos : A linguistics-based approach », International Journal of Technology and Human Interaction, pp.27-45, 2005.

Z. Ling, « An Acoustic Model for English Speech Recognition Based on Deep Learning », 2019 11th International Conference on Measuring Technology and Mechatronics Automation (ICMTMA), pp.610-614, 2019.

X. Liu, M. J. F. Gales, and P. C. Woodland, « Context dependent language model adaptation », Ninth Annual Conference of the International Speech Communication Association, 2008.

Y. Liu, X. Luan, Y. Xie, D. Dai, and L. Wu, « Narrative structure analysis of lecture video with hierarchical hidden Markov model for e-learning », International Conference on Technologies for E-Learning and Digital Entertainment, Springer, pp.429-437, 2006.

H. Lu, S.-S. Shen, S.-R. Shiang, H. Lee et al., « Alignment of spoken utterances with slide content for easier learning with recorded lectures using structured support vector machine (SVM) », Fifteenth Annual Conference of the International Speech Communication Association, 2014.

E. Luppi, P. Raffaella, R. Carla, T. Daniela, T. Ivan et al., « Net4voice : new technologies for voice-converting in barrier-free learning environments, eLearning papers 13, p.4, 2009.

P. Maergner, A. Waibel, and I. Lane, « Unsupervised vocabulary selection for real-time speech recognition of lectures, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.4417-4420, 2012.

J. Makhoul, K. Francis, S. Richard, and W. Ralph, « Performance measures for information extraction, Proceedings of DARPA broadcast news workshop, pp.249-252, 1999.

I. M. Malioutov, « Minimum cut model for spoken lecture segmentation, 2006.

L. Mangu, E. Brill, and S. Andreas, « Finding consensus in speech recognition : word error minimization and other applications of confusion networks, Computer Speech & Language, vol.14, pp.373-400, 2000.

P. Boula de Mareüil, C. d'Alessandro, F. Yvon, V. Aubergé et al., « A French Phonetic Lexicon with Variants for Speech and Language Processing, 2000.

J. E. Markel and A. H. Gray, Linear Prediction of Speech, 1982.

S. Marquard, « Improving searchability of automatically transcribed lectures through dynamic language modelling, 2012.

A. Martínez-Villaronga, M. A. del Agua, J. Andrés-Ferrer, and A. Juan, « Language model adaptation for video lectures transcription, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.8450-8454, 2013.

C. Martins, T. António, and J. Neto, « Dynamic vocabulary adaptation for a daily and real-time broadcast news transcription system, IEEE Spoken Language Technology Workshop. IEEE, pp.146-149, 2006.

A. Masmoudi, S. Mdhaffar, R. Sellami, and L. Hadrich Belguith, « Automatic diacritics restoration for Tunisian dialect, ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 18.3, p.28, 2019.

R. Masumura, H. Seongjun, and I. Akinori, « Training a language model using webdata for large vocabulary Japanese spontaneous speech recognition, Twelfth Annual Conference of the International Speech Communication Association, 2011.

S. Mdhaffar, A. Laurent, and E. Yannick, « Etude de performance des réseaux neuronaux récurrents dans le cadre de la campagne d'évaluation Multi-Genre Broadcast challenge 3 (MGB3), XXXIIe Journées d'Etudes sur la Parole, 2018.

S. Mdhaffar, A. Laurent, and E. Yannick, « Le corpus PASTEL pour le traitement automatique de cours magistraux, 25e conférence sur le Traitement Automatique des Langues Naturelles, 2018.

S. Mdhaffar, F. Bougares, Y. Estève, and L. Hadrich-Belguith, « Sentiment analysis of Tunisian dialects : Linguistic ressources and experiments, Proceedings of the third Arabic natural language processing workshop, pp.55-61, 2017.

S. Mdhaffar, E. Yannick, H. Nicolas, L. Antoine, and Q. Solen, « Apport de l'adaptation automatique des modèles de langage pour la reconnaissance de la parole : évaluation qualitative extrinsèque dans un contexte de traitement de cours magistraux, 2019.

S. Mdhaffar, E. Yannick, H. Nicolas, L. Antoine, D. Richard et al., « Qualitative evaluation of ASR adaptation in a lecture context : Application to the PASTEL corpus, Proc. Interspeech, pp.569-573, 2019.

S. Mdhaffar, E. Yannick, L. Antoine, N. Hernandez, D. Richard et al., « Educational Corpus of Oral Courses : Annotation, Analysis and Case Study, 2020.

S. Meignier and T. Merlin, « LIUM SpkDiarization : an open source toolkit for diarization, CMU SPUD Workshop, 2010.

T. Mikolov, « Statistical language models based on neural networks, 2012.

T. Mikolov and G. Zweig, « Context dependent recurrent neural network language model, 2012 IEEE Spoken Language Technology Workshop (SLT), pp.234-239, 2012.

T. Mikolov, M. Karafiát, L. Burget, J. Černocký, and S. Khudanpur, « Recurrent neural network based language model, Eleventh Annual Conference of the International Speech Communication Association, 2010.

T. Mikolov, S. Kombrink, L. Burget, J. Černocký, and S. Khudanpur, « Extensions of recurrent neural network language model, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.5528-5531, 2011.

G. A. Miller, WordNet : An electronic lexical database, 1998.

J. Miranda, J. P. Neto, and A. W. Black, « Improving ASR by integrating lecture audio and slides, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.8131-8135, 2013.

J. D. Valor Miró, « Evaluation of innovative computer-assisted transcription and translation strategies for video lecture repositories, 2017.

H. Misra, F. Yvon, J. M. Jose, and O. Cappé, « Text segmentation via topic modeling : an analytical study, Proceedings of the 18th ACM conference on Information and knowledge management, pp.1553-1556, 2009.

R. C. Moore and W. Lewis, « Intelligent selection of language model training data, Proceedings of the ACL 2010 conference short papers, Association for Computational Linguistics, pp.220-224, 2010.

N. Moreau, M. Djamel, S. Rainer, S. Burger, and C. Khalid, « Data Collection for the CHIL CLEAR Evaluation Campaign, LREC, vol.8, pp.28-30, 2007.

N. Morgan and B. Herve, « Continuous speech recognition using multilayer perceptrons with hidden Markov models, International conference on acoustics, speech, and signal processing, pp.413-416, 1990.

H. Mougard, M. Riou, C. de la Higuera, S. Quiniou, and O. Aubert, « The Paper or the Video : Why Choose ?, International World Wide Web Conference, 2015.

M. Müller, T. S. Nguyen, J. Niehues, E. Cho, B. Krüger et al., « Lecture Translator : Speech translation framework for simultaneous lecture translation, Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics : Demonstrations, pp.82-86, 2016.

C. Munteanu, R. Baecker, G. Penn, E. Toms, and D. James, « The effect of speech recognition accuracy rates on the usefulness and usability of webcast archives, Proceedings of the SIGCHI conference on Human Factors in computing systems, pp.493-502, 2006.

B. Muramatsu, A. McKinney, P. D. Long, and J. Zornig, « SpokenMedia project : Media-linked transcripts and rich media notebooks for learning and teaching, International Workshop on Technology for Education, IEEE, pp.6-9, 2009.

A. Nasr, F. Béchet, J.-F. Rey, B. Favre, J. Le Roux et al., « MACAON : An NLP tool suite for processing word lattices, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics : Human Language Technologies : Systems Demonstrations, pp.86-91, 2011.

K. Nickel, T. Gehrig, H. K. Ekenel, J. McDonough et al., « An audio-visual particle filter for speaker tracking on the CLEAR'06 evaluation dataset, International Evaluation Workshop on Classification of Events, Activities and Relationships, pp.69-80, 2006.

A. Ntoulas, Crawling and searching the Hidden Web, 2006.

S. Oger, « Modèles de langage ad hoc pour la reconnaissance automatique de la parole, 2011.

S. Oger, V. Popescu, and L. Georges, « Using the world wide web for learning new words in continuous speech recognition tasks : Two case studies, Proc. Speech and Computer, pp.76-81, 2009.

S. Oger, G. Linarès, F. Béchet, and P. Nocéra, « On-demand new word learning using world wide web, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, pp.4305-4308, 2008.

D. S. Pallett, « A look at NIST's benchmark ASR tests : past, present, and future, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding, pp.483-488, 2003.

D. D. Palmer and O. Mari, « Improving out-of-vocabulary name resolution, Computer Speech & Language, vol.19, pp.107-128, 2005.

A. Park, T. J. Hazen, and J. R. Glass, « Automatic processing of audio lectures for information retrieval : Vocabulary selection and language modeling, IEEE International Conference on Acoustics, Speech, and Signal Processing, p.497, 2005.

J. Park, X. Liu, M. J. F. Gales, and P. C. Woodland, « Improved neural network based language modelling and adaptation, Eleventh Annual Conference of the International Speech Communication Association, 2010.

V. Peddinti, D. Povey, and S. Khudanpur, « A time delay neural network architecture for efficient modeling of long temporal contexts, Sixteenth Annual Conference of the International Speech Communication Association, 2015.

L. Pevzner and M. A. Hearst, « A critique and improvement of an evaluation metric for text segmentation, Computational Linguistics, vol.28, issue.1, pp.19-36, 2002.

D. Povey, V. Peddinti, D. Galvez, P. Ghahremani, V. Manohar et al., « Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI, Interspeech, pp.2751-2755, 2016.

M. A. Przybocki, J. G. Fiscus, J. S. Garofolo, and D. S. Pallett, « 1998 HUB-4 Information Extraction Evaluation, 1999.

A. Raux, B. Langner, A. W. Black, and E. Maxine, « Let's go : Improving spoken dialog systems for the elderly and non-natives, Eighth European Conference on Speech Communication and Technology, 2003.

J. C. Reynar, « An automatic method of finding topic boundaries, Proceedings of the 32nd annual meeting on Association for Computational Linguistics, pp.331-333, 1994.

J. C. Reynar, « Topic segmentation : Algorithms and applications, 1998.

K. Riedhammer, G. Martin, and N. Elmar, « The FAU video lecture browser system, 2012 IEEE Spoken Language Technology Workshop (SLT), pp.392-397, 2012.

M. Riedl and C. Biemann, « Text segmentation with topic models, Journal for Language Technology and Computational Linguistics, vol.27, pp.47-69, 2012.

M. Riedl and C. Biemann, « TopicTiling : a text segmentation algorithm based on LDA, Proceedings of ACL 2012 Student Research Workshop. Association for Computational Linguistics, pp.37-42, 2012.

R. Rosenfeld, « A maximum entropy approach to adaptive statistical language modeling, Computer Speech and Language, vol.10, pp.187-228, 1996.

A. Rousseau, « XenC : An open-source tool for data selection in natural language processing, The Prague Bulletin of Mathematical Linguistics, vol.100, pp.73-82, 2013.

A. Rousseau, B. Gilles, D. Paul, E. Yannick, G. Vishwa et al., « LIUM and CRIM ASR system combination for the REPERE evaluation campaign, International Conference on Text, pp.441-448, 2014.

D. E. Rumelhart, G. E. Hinton, and R. J. Williams, « Learning representations by back-propagating errors, Cognitive modeling 5, vol.3, p.1, 1988.

K. Sadamitsu, T. Mishina, and Y. Mikio, « Topic-based language models using Dirichlet Mixtures, Systems and Computers in Japan 38, vol.12, pp.76-85, 2007.

G. A. Sanders, A. N. Le, and J. S. Garofolo, « Effects of word error rate in the DARPA Communicator data during, Seventh International Conference on Spoken Language Processing, 2000.

G. Saon, S. Hagen, N. David, and P. Michael, « Speaker adaptation of neural network acoustic models using i-vectors, pp.55-59, 2013.

R. Sarikaya, A. Gravano, and Y. Gao, « Rapid language model development using external resources for new spoken dialog domains, Proceedings (ICASSP'05), IEEE International Conference on Acoustics, Speech, and Signal Processing, IEEE, p.573, 2005.

O. A. Schulte, T. Wunden, and A. Brunner, « Replay : an integrated and open solution to produce, handle, and distribute audio-visual (lecture) recordings, Proceedings of the 36th annual ACM SIGUCCS fall conference : moving mountains, blazing trails, pp.195-198, 2008.

H. Schwenk, « Continuous space language models, Computer Speech & Language, vol.21, pp.492-518, 2007.

H. Schwenk and G. Jean-luc, « Neural network language models for conversational speech recognition, Eighth International Conference on Spoken Language Processing, 2004.

G. Senay, « Approches semi-automatiques pour la recherche d'information dans les documents audio, 2011.

G. Senay, L. Benjamin, and L. Georges, « Prédiction de l'indexabilité d'une transcription (Prediction of transcription indexability), Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, vol.1, pp.697-705, 2012.

R. Shadiev, H. Wu-yuin, N. Chen, and Y. Huang, « Review of speech-to-text recognition technology for enhancing learning, Journal of Educational Technology & Society, vol.17, p.65, 2014.

I. Sheikh, I. Illina, and F. Dominique, « How diachronic text corpora affect context based retrieval of OOV proper names for audio news, International Conference on Language Resources and Evaluation (LREC), 2016.

N. Singh-Miller and M. Collins, « Trigger-based language modeling using a loss-sensitive perceptron algorithm, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'07), vol.4, IEEE, p.25, 2007.

L. Sitbon and P. Bellot, « Topic segmentation using weighted lexical links (wll), Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pp.737-738, 2007.

M. Siu, G. Yu, and H. Gish, « An unsupervised, sequential learning algorithm for the segmentation of speech waveforms with multiple speakers, International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol.2, IEEE, pp.189-192, 1992.

D. Soutner and L. Müller, « Application of LSTM neural networks in language modelling, International Conference on Text, Speech and Dialogue, pp.105-112, 2013.

B. Souvignier, A. Kellner, B. Rueber, H. Schramm, and F. Seide, « The thoughtful elephant : Strategies for spoken dialog systems, IEEE Transactions on Speech and Audio Processing, vol.8, issue.1, pp.51-62, 2000.

R. K. Srivastava, K. Greff, and J. Schmidhuber, « Highway networks, 2015.

R. Stiefelhagen, K. Bernardin, H. K. Ekenel, and M. Voit, « Tracking identities and attention in smart environments : contributions and progress in the CHIL project, 8th IEEE International Conference on Automatic Face & Gesture Recognition, pp.1-8, 2008.

N. Stokes, J. Carthy, and A. F. Smeaton, « SeLeCT : a lexical cohesion based news story segmentation system, AI Communications, vol.17, issue.1, pp.3-12, 2004.

M. Sundermeyer, R. Schlüter, and H. Ney, « LSTM neural networks for language modeling, Thirteenth Annual Conference of the International Speech Communication Association, 2012.

I. Szöke, J. Černocký, M. Fapšo, and J. Žižka, « Speech@FIT lecture browser, IEEE Spoken Language Technology Workshop, IEEE, pp.169-170, 2010.

I. Trancoso, R. Nunes, and N. Luís, « Classroom lecture recognition, International Workshop on Computational Processing of the Portuguese Language, Springer, pp.190-199, 2006.

I. Trancoso, M. Rui, M. Helena, A. Isabel, M. Céu et al., « The Lectra corpus classroom lecture transcriptions in European Portuguese, Economic Theory 1.17, pp.15-16, 2008.

E. Trentin and G. Marco, « A survey of hybrid ANN/HMM models for automatic speech recognition, Neurocomputing 37.1-4, pp.91-126, 2001.

G. Tür, D. Hakkani-Tür, A. Stolcke, and E. Shriberg, « Integrating prosodic and lexical cues for automatic topic segmentation, Computational Linguistics, vol.27, issue.1, pp.31-57, 2001.

M. Utiyama and I. Hitoshi, « A statistical model for domain-independent text segmentation, Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, pp.499-506, 2001.

J. Valor-miró, R. Daniel, S. Nadine, M. Pérez-gonzález-de, G. et al., « Evaluating intelligent interfaces for post-editing automatic transcriptions of online video lectures, Open Learning : The Journal of Open, Distance and e-Learning, vol.29, pp.72-85, 2014.

K. Veselý, S. Watanabe, K. Žmolíková, M. Karafiát, L. Burget et al., « Sequence summarizing neural network for speaker adaptation, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5315-5319, 2016.

A. Viterbi, « Error bounds for convolutional codes and an asymptotically optimum decoding algorithm, IEEE Transactions on Information Theory, vol.13, issue.2, pp.260-269, 1967.

L. Wang, S. Li, Y. Lv, and H. Wang, « Learning to rank semantic coherence for topic segmentation, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp.1340-1344, 2017.

X. Wang, L. Xie, M. Lu, B. Ma, E. S. Chng et al., « Broadcast news story segmentation using conditional random fields and multimodal features, IEICE Transactions on Information and Systems, vol.95, pp.1206-1215, 2012.

Y. Wang, A. Acero, and C. Ciprian, « Is word error rate a good indicator for spoken language understanding accuracy, IEEE Workshop on Automatic Speech Recognition and Understanding, pp.577-582, 2003.

I. H. Witten and T. C. Bell, « The zero-frequency problem : Estimating the probabilities of novel events in adaptive text compression, IEEE Transactions on Information Theory, vol.37, pp.1085-1094, 1991.

P. Wittenburg, B. Hennie, R. Albert, A. Klassmann, and S. Han, « ELAN : a professional framework for multimodality research, 5th International Conference on Language Resources and Evaluation (LREC 2006), pp.1556-1559, 2006.

W. Xiong, D. Jasha, H. Xuedong, S. Frank, S. Mike et al., « Achieving human parity in conversational speech recognition, 2016.

N. Yamamoto, O. Jun, and A. Yasuo, « Topic segmentation and retrieval system for lecture videos based on spontaneous speech recognition, Eighth European Conference on Speech Communication and Technology, 2003.

H. Yamazaki, I. Koji, S. Koichi, F. Sadaoki, and Y. Haruo, « Dynamic language model adaptation using presentation slides for lecture speech recognition, Eighth Annual Conference of the International Speech Communication Association, 2007.

D. Yu, L. Deng, and G. Dahl, « Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition », Proc. NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2010.

D. Yu and M. L. Seltzer, « Improved bottleneck features using pretrained deep neural networks », Twelfth Annual Conference of the International Speech Communication Association, 2011.

C. Zhai and J. Lafferty, « A study of smoothing methods for language models applied to information retrieval, ACM Transactions on Information Systems (TOIS), vol.22, issue.2, pp.179-214, 2004.

C. Zhai and J. Lafferty, « A study of smoothing methods for language models applied to ad hoc information retrieval, ACM SIGIR Forum, vol.51, issue.2, ACM, pp.268-276, 2017.

J. J. Zhang and P. Fung, « Active learning of extractive reference summaries for lecture speech summarization, Proceedings of the 2nd Workshop on Building and Using Comparable Corpora : from Parallel to Non-parallel Corpora, pp.23-26, 2009.

I. Zitouni, « Modélisation du langage pour les systèmes de reconnaissance de la parole destinés aux grands vocabulaires : application à MAUD, 2000.