S. H. Al-Ani, Arabic phonology, 1970.
DOI : 10.1515/9783110878769

Z. Al Bawab et al., Analysis-by-synthesis features for speech recognition, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4185-4188, 2008.
DOI : 10.1109/ICASSP.2008.4518577

L. H. Ali and R. G. Daniloff, A contrastive cinefluorographic investigation of the articulation of emphatic-non emphatic cognate consonants, Studia Linguistica, vol.26, issue.2, 1972.

G. Ananthakrishnan, From Acoustics to Articulation: Study of the acoustic-articulatory relationship along with methods to normalize and adapt to variations in production across different speakers, 2011.

G. Ananthakrishnan and O. Engwall, Mapping between acoustic and articulatory gestures, Speech Communication, vol.53, issue.4, 2011.
DOI : 10.1016/j.specom.2011.01.009

URL : https://hal.archives-ouvertes.fr/hal-00727161

B. S. Atal et al., Inversion of articulatory-to-acoustic transformation in the vocal tract by a computer-sorting technique, The Journal of the Acoustical Society of America, vol.63, issue.5, 1978.
DOI : 10.1121/1.381848

P. Badin et al., Can you 'read' tongue movements? Evaluation of the contribution of tongue display to speech understanding, Speech Communication, vol.52, issue.6, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00175680

. Bailly, Speaking with smile or disgust: data and models, Interspeech, pp.111-116, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00333673

. Bailly, Audiovisual speech synthesis, International Journal of Speech Technology, vol.6, issue.4, pp.3-3, 2003.
URL : https://hal.archives-ouvertes.fr/hal-00169556

. Bailly, Parole et expression des émotions sur le visage d'humanoïdes virtuels. Traité de la réalité virtuel le, pp.187-208, 2009.

J. Barker and F. Berthommier, Evidence of correlation between acoustic and visual features of speech, ICPhS, 1999.

A. Ben Youssef et al., Can tongue be recovered from face? The answer of data-driven statistical models, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00508276

. Benoit, Effects of Phonetic Context on Audio-Visual Intelligibility of French, Journal of Speech Language and Hearing Research, vol.37, issue.5, pp.7-8, 1994.
DOI : 10.1044/jshr.3705.1195

URL : https://hal.archives-ouvertes.fr/hal-00828874

L. Bernstein and S. Eberhardt, Johns Hopkins lipreading corpus videodisk set, 1986.

J. Beskow et al., SYNFACE: A Talking Head Telephone for the Hearing-Impaired, Proceedings of 9th International Conference on Computers Helping People with Special Needs, pp.1178-1186, 2004.
DOI : 10.1007/978-3-540-27817-7_173

J. Beskow and M. Nordenberg, Data-driven synthesis of expressive visual speech using an MPEG-4 talking head, Proceedings of the 9th European Conference on Speech Communication and Technology, pp.793-796, 2005.

C. T. Best et al., Discrimination of non-native consonant contrasts varying in perceptual assimilation to the listener's native phonological system, The Journal of the Acoustical Society of America, vol.109, issue.2, 2001.
DOI : 10.1121/1.1332378

. Black, Articulatory features for expressive speech synthesis, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.4005-4008, 2012.
DOI : 10.1109/ICASSP.2012.6288796

. Boë, The geometric vocal tract variables controlled for vowel production: proposals for constraining acoustic-to-articulatory inversion, Journal of Phonetics, vol.2, issue.0, pp.2-7, 1992.

A. Bosseler and D. Massaro, Development and Evaluation of a Computer-Animated Tutor for Vocabulary and Language Learning in Children with Autism, Journal of Autism and Developmental Disorders, vol.33, issue.6, pp.6-11, 2003.
DOI : 10.1023/B:JADD.0000006002.82367.4f

J. Busset, Acoustic-to-articulatory inversion using cepstral coefficients, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00836808

Carstens Medizinelektronik, AG500 Data Format and Data Structure, 2004.

Carstens Medizinelektronik, JustView: AG500 measuring environment display, 2006.

C.-C. Chang and C.-J. Lin, LIBSVM: a library for support vector machines, 2001.

F. Charpentier, Determination of the vocal tract shape from the formants by analysis of the articulatory-to-acoustic nonlinearities, Speech Communication, vol.3, issue.4, pp.291-308, 1984.
DOI : 10.1016/0167-6393(84)90025-6

V. Cherkassky and Y. Ma, Practical selection of SVM parameters and noise estimation for SVM regression, Neural Networks, vol.17, issue.1, 2004.
DOI : 10.1016/S0893-6080(03)00169-2

E. Chuang and C. Bregler, Mood swings: expressive speech animation, ACM Transactions on Graphics, vol.24, issue.2, 2005.
DOI : 10.1145/1061347.1061355

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.78.3306

M. M. Cohen and D. W. Massaro, Synthesis of visible speech, Behavior Research Methods, Instruments, & Computers, 1990.

M. M. Cohen and D. W. Massaro, Modeling coarticulation in synthetic visual speech, Models and techniques in computer animation, 1993.

. Cohen, Training a talking head, Proceedings. Fourth IEEE International Conference on Multimodal Interfaces, p.499, 2002.
DOI : 10.1109/ICMI.2002.1167046

V. Colotte and A. Lafosse, Soja: French text-to-speech synthesis system, 2009.

P. Consortium, Gv-lex, 2009.

P. Consortium, Semaine project (http://www.semaine-project. eu), 2009.

S. Demange and S. Ouni, An episodic memory-based solution for the acoustic-to-articulatory inversion problem, The Journal of the Acoustical Society of America, vol.133, issue.5, 2013.
DOI : 10.1121/1.4798665

URL : https://hal.archives-ouvertes.fr/hal-00834556

. Demange, Continuous episodic memory based speech recognition using articulatory dynamics, Proceedings of Interspeech, pp.2305-2308, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00602414

L. Deng and D. Sun, Phonetic classification and recognition using HMM representation of overlapping articulatory features for all classes of English sounds, Proceedings of ICASSP '94, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.45-48, 1994.
DOI : 10.1109/ICASSP.1994.389359

N. F. Dixon and L. Spitz, The detection of audiovisual desynchrony, Perception, vol.9, pp.719-721, 1980.

J. Edge et al., Model-based synthesis of visual speech movements from 3D video, EURASIP Journal on Audio, Speech, and Music Processing, 2009.

A. Elgendy, Aspects of pharyngeal coarticulation, LOT, 2001.

. Embarki, Speech clarity and coarticulation in modern standard arabic and dialectal arabic, International Congress of Phonetic Sciences, pp.635-638, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00671225

M. Embarki et al., Instrumental Studies in Arabic Phonetics, volume 319 of Current Issues in Linguistic Theory, chapter Acoustic and electromagnetic articulographic study of pharyngealisation: Coarticulatory effects as an index of stylistic and regional variation in Arabic, pp.193-216, 2011.

O. Engwall, Evaluation of a system for concatenative articulatory visual speech synthesis, International Conference on Speech and Language Processing, 2002.

O. Engwall, Analysis of and feedback on phonetic features in pronunciation training with a virtual teacher, Computer Assisted Language Learning, vol.6, issue.1, pp.3-7, 2012.

K. Erler and L. Deng, Hidden Markov model representation of quantized articulatory features for speech recognition, Computer Speech & Language, vol.7, issue.3, pp.265-282, 1993.
DOI : 10.1006/csla.1993.1014

. Fadiga, Speech listening specifically modulates the excitability of tongue muscles: a TMS study, European Journal of Neuroscience, vol.96, issue.2, pp.3-9, 2002.

G. Fant, Acoustic Theory of Speech Production, 1960.
DOI : 10.1515/9783110873429

R. Fernandez and B. Ramabhadran, Automatic exploration of corpus-specific properties for expressive text-to-speech: A case study in emphasis, Proceedings of the 6th ISCA Workshop on Speech Synthesis, pp.34-39, 2007.

. Flege, Factors affecting strength of perceived foreign accent in a second language, The Journal of the Acoustical Society of America, vol.97, issue.5, 1995.
DOI : 10.1121/1.413041

J. Frankel and S. King, ASR - articulatory speech recognition, Proc. Eurospeech, pp.599-602, 2001.

. Galatas, Audio-visual speech recognition incorporating facial depth information captured by the kinect, Signal Processing Conference (EUSIPCO), 2012 Proceedings of the 20th European, pp.2714-2717, 2012.

S. Ghazeli, Back consonants and backing coarticulation in Arabic, 1977.

P. Ghosh and S. Narayanan, A generalized smoothness criterion for acoustic-to-articulatory inversion, The Journal of the Acoustical Society of America, vol.128, issue.4, pp.2-3, 2010.
DOI : 10.1121/1.3455847

K. Grauwinkel and S. Fagel, Visualization of internal articulator dynamics for use in speech therapy for children with sigmatismus interdentalis, Int. Conf. on Auditory-Visual Speech Processing, 2007.

K. P. Green and P. K. Kuhl, The role of visual information in the processing of place and manner features in speech perception, Perception & Psychophysics, pp.34-42, 1989.
DOI : 10.3758/BF03208030

K. P. Green and P. K. Kuhl, Integral processing of visual place and auditory voicing information during phonetic perception, Journal of Experimental Psychology: Human Perception and Performance, vol.17, issue.1, pp.278-288, 1991.
DOI : 10.1037/0096-1523.17.1.278

. Guiard-marigny, 3D models of the lips for realistic speech animation, Proceedings Computer Animation '96, pp.80-89, 1996.
DOI : 10.1109/CA.1996.540490

URL : https://hal.archives-ouvertes.fr/inria-00537531

. Hazan, The use of visual cues in the perception of non-native consonant contrasts, The Journal of the Acoustical Society of America, vol.119, issue.3, 2006.
DOI : 10.1121/1.2166611

S. Hiroya and M. Honda, Estimation of Articulatory Movements From Speech Acoustics Using an HMM-Based Speech Production Model, IEEE Transactions on Speech and Audio Processing, vol.12, issue.2, 2004.
DOI : 10.1109/TSA.2003.822636

A. Hunt and A. Black, Unit selection in a concatenative speech synthesis system using a large speech database, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, 1996.
DOI : 10.1109/ICASSP.1996.541110

F. Itakura, Minimum prediction residual principle applied to speech recognition, IEEE Transactions on Acoustics, Speech and Signal Processing, vol.23, issue.1, pp.67-72, 1975.

. Iverson, A perceptual interference account of acquisition difficulties for non-native phonemes, Cognition, vol.87, issue.1, pp.47-57, 2003.
DOI : 10.1016/S0010-0277(02)00198-1

R. Jakobson, Mufaxxama: the emphatic phonemes in Arabic, in Selected Writings, pp.510-522, 1962.

. Jesse, The processing of information from multiple sources in simultaneous interpreting, Interpreting International Journal of Research and Practice in Interpreting, vol.5, issue.2, pp.9-14, 2000.
DOI : 10.1075/intp.5.2.04jes

. Jiang, On the importance of audiovisual coherence for the perceived quality of synthesized visual speech, EURASIP Journal on Applied Signal Processing, 2002.

Jiang et al., Realistic face animation from sparse stereo meshes, AVSP, 2005.

. Katsamanis, Face Active Appearance Modeling and Speech Acoustic Information to Recover Articulation, IEEE Transactions on Audio, Speech, and Language Processing, vol.17, issue.3, pp.4-5, 2009.
DOI : 10.1109/TASL.2008.2008740

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.414.7006

S. A. King and R. E. Parent, Creating speech-synchronized animation, IEEE Transactions on Visualization and Computer Graphics, vol.11, issue.3, 2005.
DOI : 10.1109/tvcg.2005.43

. Kröger, Two and threedimensional visual articulatory models for pronunciation training and for treatment of speech disorders, Interspeech, 9th Annual Conference of the International Speech Communication Association, 2008.

R. Laboissière and A. Galván, Inferring the commands of an articulatory model from acoustical specifications of stop/vowel sequences, Proceedings ICPhS, pp.358-361, 1995.

Larar et al., Vector quantization of the articulatory space, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.36, issue.12, pp.1812-1818, 1988.
DOI : 10.1109/29.9026

B. Le Goff et al., Real-time analysis-synthesis and intelligibility of talking faces, 2nd International Conference on Speech Synthesis, pp.53-56, 1994.

J. Levitt and W. Katz, The Effects of EMA-Based Augmented Visual Feedback on the English Speakers' Acquisition of the Japanese Flap: A Perceptual Study, 2010.

Ling et al., HMM-based text-to-articulatory-movement prediction and analysis of critical articulators, Interspeech, pp.2194-2197, 2010.

K. Liu and J. Ostermann, Optimization of an Image-Based Talking Head System, EURASIP Journal on Audio, Speech, and Music Processing, 2009.

S. Maeda, Un modele articulatoire de la langue avec des composantes linéaires, pp.152-162, 1979.

S. Maeda, Compensatory Articulation During Speech: Evidence from the Analysis and Synthesis of Vocal-Tract Shapes Using an Articulatory Model, Speech production and speech modelling, pp.131-149, 1990.
DOI : 10.1007/978-94-009-2037-8_6

D. Massaro, Perceiving Talking Faces: From Speech Perception to a Behavioral Principle, 1998.

D. Massaro and J. Light, Read my tongue movements: bimodal learning to perceive and produce non-native speech, Proceedings of the 8th European Conference on Speech Communication and Technology (EUROSPEECH '03), pp.2249-2252, 2003.

. Massaro, Pronunciation training: the role of eye and ear, Proceedings of Interspeech, pp.2623-2626, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00327687

W. Mattheyses et al., On the Importance of Audiovisual Coherence for the Perceived Quality of Synthesized Visual Speech, EURASIP Journal on Audio, Speech, and Music Processing, 2009.

J. Miranda and S. Ouni, Mixing faces and voices: a study of the influence of faces and voices on audiovisual intelligibility, AVSP 2013 - International Conference on Auditory-Visual Speech Processing, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00835855

M. Mori, The Uncanny Valley, Energy, vol.7, issue.4, pp.33-35, 1970.

. Munhall, Visual Prosody and Speech Intelligibility: Head Movement Improves Auditory Speech Perception, Psychological Science, vol.11, issue.2, pp.5-6, 2004.

. Musti, Introducing visual target cost within an acoustic-visual unit-selection speech synthesizer, Proc. 10th International Conference on Auditory-Visual Speech Processing (AVSP), pp.49-55, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00602403

U. Musti et al., HMM-based automatic visual speech segmentation using facial data, Interspeech 2010, pp.1401-1404, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00526776

. Neiberg, The acoustic to articulatory mapping: non-linear or non-unique?, Proc. Interspeech, pp.1485-1488, 2008.

H. Ney, A dynamic programming algorithm for nonlinear smoothing, Signal Processing, vol.5, issue.2, 1983.

N. Nguyen, A MATLAB toolbox for the analysis of articulatory data in the production of speech, Behavior Research Methods, Instruments, & Computers, vol.92, issue.3, pp.4-6, 2000.
DOI : 10.3758/BF03200817

URL : https://hal.archives-ouvertes.fr/hal-01392907

S. Ouni, Can we retrieve vocal tract dynamics that produced speech? toward a speaker articulatory strategy model, Interspeech 2005-Eurospeech, pp.1037-1040, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00008691

S. Ouni, Tongue gestures awareness and pronunciation training, 12th Annual Conference of the International Speech Communication Association-Interspeech, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00602418

S. Ouni, Tongue control and its implication in pronunciation training, Computer Assisted Language Learning, vol.21, issue.6, pp.1-1, 2013.

URL : https://hal.archives-ouvertes.fr/hal-00834554

. Ouni, Visual Contribution to Speech Perception: Measuring the Intelligibility of Animated Talking Heads, EURASIP Journal on Audio, Speech, and Music Processing, vol.41, issue.3, pp.7-10, 2007.

URL : https://hal.archives-ouvertes.fr/hal-00184425

. Ouni, Training Baldi to be multilingual: A case study for an Arabic Badr, Speech Communication, vol.45, issue.2, pp.5-6, 2005.
DOI : 10.1016/j.specom.2004.11.008

URL : https://hal.archives-ouvertes.fr/hal-00008688

S. Ouni et al., Acoustic-visual synthesis technique using bimodal unit-selection, EURASIP Journal on Audio, Speech, and Music Processing, pp.1-6, 2013.
DOI : 10.1186/1687-4722-2013-16

URL : https://hal.archives-ouvertes.fr/hal-00835854

S. Ouni and Y. Laprie, Modeling the articulatory space using a hypercube codebook for acoustic-to-articulatory inversion, The Journal of the Acoustical Society of America, vol.118, issue.1, 2005.
DOI : 10.1121/1.1921448

URL : https://hal.archives-ouvertes.fr/hal-00008682

S. Ouni and Y. Laprie, Studying pharyngealisation using an articulograph, International Workshop on Pharyngeals and Pharyngealisation, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00431829

S. Ouni et al., VisArtico: a visualization tool for articulatory data, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00730733

G. Papcun et al., Inferring articulation and recognizing gestures from acoustics with a neural network trained on x-ray microbeam data, The Journal of the Acoustical Society of America, vol.92, issue.2, pp.688-700, 1992.
DOI : 10.1121/1.403994

F. I. Parke, A model for human faces that allows speech synchronized animation, Computers & Graphics, vol.1, issue.1, pp.3-4, 1975.
DOI : 10.1016/0097-8493(75)90024-2

C. Pelachaud et al., Modelling an Italian talking head, AVSP 2001-International Conference on Auditory-Visual Speech Processing, 2001.

C. Pelachaud and I. Poggi, Subtleties of facial expressions in embodied agents, The Journal of Visualization and Computer Animation, vol.13, issue.5, 2002.
DOI : 10.1002/vis.299

B. Potard, Acoustic-to-articulatory inversion with constraints, 2008.
URL : https://hal.archives-ouvertes.fr/tel-00580811

. Potard, Incorporation of phonetic constraints in acoustic-to-articulatory inversion, The Journal of the Acoustical Society of America, vol.123, issue.4, pp.2310-2323, 2008.
DOI : 10.1121/1.2885747

URL : https://hal.archives-ouvertes.fr/inria-00112226

. Preminger, Selective Visual Masking in Speechreading, Journal of Speech Language and Hearing Research, vol.41, issue.3, pp.564-575, 1998.
DOI : 10.1044/jslhr.4103.564

C. Qin and M. Carreira-Perpiñán, An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping, Proc. Interspeech, pp.74-77, 2007.

. Richardson, Hidden-articulator Markov models for speech recognition, Speech Communication, vol.41, issue.2-3, pp.5-6, 2003.
DOI : 10.1016/S0167-6393(03)00031-1

K. Richmond, Estimating Articulatory Parameters from the Speech Signal, PhD thesis, Centre for Speech Technology Research, University of Edinburgh, 2002.

K. Richmond, A trajectory mixture density neural network for the acoustic-articulatory inversion mapping, Proc. Interspeech, pp.577-580, 2006.

K. Richmond, Preliminary inversion mapping results with a new EMA corpus, Interspeech, Brighton, 2009.

F. Rudzicz, Correcting errors in speech recognition with articulatory dynamics, Proc. 48th Annual Meeting of the Association for Computational Linguistics, ACL '10, pp.60-68, 2010.

H. Sakoe and S. Chiba, Dynamic programming algorithm optimization for spoken word recognition, IEEE Transactions on Acoustics, Speech and Signal Processing, vol.26, issue.1, pp.43-49, 1978.

B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, 2002.

J. Schroeter et al., Evaluation of improved articulatory codebooks and codebook access distance measures, Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp.393-396, 1990.

J. Schroeter and M. M. Sondhi, Techniques for estimating vocal-tract shapes from the speech signal, IEEE Transactions on Speech and Audio Processing, vol.2, issue.1, pp.133-150, 1994.
DOI : 10.1109/89.260356

. Soquet, Acoustic-articulatory inversion based on a neural controller of a vocal tract model: further results, Artificial Neural Networks, pp.371-376, 1991.

I. Steiner et al., Investigating articulatory differences between upright and supine posture using 3D EMA, 9th International Seminar on Speech Production, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00602427

I. Steiner et al., Towards an articulatory tongue model using 3D EMA, 9th International Seminar on Speech Production-ISSP'11, pp.147-154, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00602423

I. Steiner et al., Speech animation using electromagnetic articulography as motion capture data, AVSP 2013 - International Conference on Auditory-Visual Speech Processing, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00835856

. Steiner, Symbolic vs. acoustics-based style control for expressive unit selection, Proceedings of Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010.

K. N. Stevens, The quantal nature of speech: evidence from articulatory-acoustic data, in Human Communication: A Unified View, pp.51-66, 1972.

K. N. Stevens, On the quantal nature of speech, Journal of Phonetics, vol.17, pp.3-45, 1989.

V. Strom et al., Expressive prosody for unit-selection speech synthesis, Proceedings of Interspeech, 2006.

W. H. Sumby and I. Pollack, Visual Contribution to Speech Intelligibility in Noise, The Journal of the Acoustical Society of America, vol.26, issue.2, pp.212-215, 1954.
DOI : 10.1121/1.1907309

A. Summerfield, Use of Visual Information for Phonetic Perception, Phonetica, vol.36, issue.4-5, 1979.
DOI : 10.1159/000259969

. Suzuki, Determination of articulatory positions from speech acoustics by applying dynamic articulatory constraints, Proc. International Conference on Spoken Language Processing (ICSLP), pp.2251-2254, 1998.

P. Taylor, Text-to-speech synthesis, 2009.
DOI : 10.1017/CBO9780511816338

B. Theobald, Audiovisual speech synthesis, International Congress on Phonetic Sciences, pp.285-290, 2007.

M. K. Tiede, MVIEW: Multi-channel visualization application for displaying dynamic sensor movements, 2010.

. Toda, Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model, Speech Communication, vol.50, issue.3, pp.0-2, 2008.
DOI : 10.1016/j.specom.2007.09.001

A. Toutios and K. Margaritis, Contribution to statistical acoustic-to-EMA mapping, 16th European Signal Processing Conference (EUSIPCO), 2008.

A. Toutios et al., Weight Optimization for Bimodal Unit-Selection Talking Head Synthesis, 12th Annual Conference of the International Speech Communication Association - Interspeech 2011, Florence, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00602407

. Toutios, Setup for Acoustic-Visual Speech Synthesis by Concatenating Bimodal Units, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00526766

. Toutios, Towards a true acoustic-visual speech synthesis, 9th International Conference on Auditory-Visual Speech Processing-AVSP2010, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00526782

. Toutios, Predicting tongue positions from acoustics and facial features, Interspeech 2011, pp.2661-2664, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00602412

. Toutios, Protocol for a model-based evaluation of a dynamic acoustic-to-articulatory inversion method using electromagnetic articulography, The eighth International Seminar on Speech Production (ISSP'08), pp.317-320, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00336380

A. Toutios et al., Estimating the control parameters of an articulatory model from electromagnetic articulograph data, The Journal of the Acoustical Society of America, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00578733

I. W. Tsang et al., Generalized core vector machines, IEEE Transactions on Neural Networks, vol.17, issue.5, 2006.

E. Tulving, Episodic and semantic memory, in Organization of Memory, 1972.

. Watkins, Seeing and hearing speech excites the motor system involved in speech production, Neuropsychologia, vol.41, issue.8, pp.1-9, 2003.
DOI : 10.1016/S0028-3932(02)00316-0

P. Wik and A. Hjalmarsson, Embodied conversational agents in computer assisted language learning, Speech Communication, vol.51, issue.10, 2009.
DOI : 10.1016/j.specom.2009.05.006

URL : https://hal.archives-ouvertes.fr/hal-00558521

A. Wrench and W. Hardcastle, A multichannel articulatory database and its application for automatic speech recognition, Proc. 5th International Seminar on Speech Production, 2000.

A. Wrench and K. Richmond, Continuous speech recognition using articulatory data, Proc. International Conference on Spoken Language Processing (ICSLP), 2000.

B. Wrobel-Dautcourt et al., A low-cost stereovision based system for acquisition of visible articulatory data, 2005.
URL : https://hal.archives-ouvertes.fr/inria-00000432

. Yehia, Quantitative association of vocal-tract and facial behavior, Speech Communication, vol.26, issue.1-2, pp.2-3, 1998.
DOI : 10.1016/S0167-6393(98)00048-X

. Yehia, Linking facial animation, head motion and speech acoustics, Journal of Phonetics, vol.30, issue.3, pp.0-5, 2002.
DOI : 10.1006/jpho.2002.0165

L. Zhang and S. Renals, Acoustic-Articulatory Modeling With the Trajectory HMM, IEEE Signal Processing Letters, vol.15, 2008.
DOI : 10.1109/LSP.2008.917004