Maximum mutual information estimation of hidden Markov model parameters for speech recognition, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.49-52, 1986. ,
DOI : 10.1109/ICASSP.1986.1169179
The PASCAL CHiME speech separation and recognition challenge, Computer Speech & Language, vol.27, issue.3, pp.621-633, 2013. ,
DOI : 10.1016/j.csl.2012.10.004
URL : https://hal.archives-ouvertes.fr/hal-00646370
Tied mixture continuous parameter modeling for speech recognition. Acoustics, Speech and Signal Processing, IEEE Transactions on, vol.38, issue.12, pp.2033-2045, 1990. ,
Statistical language model adaptation: review and perspectives, Speech Communication, vol.42, issue.1, pp.93-108, 2004. ,
DOI : 10.1016/j.specom.2003.08.002
Input-output HMMs for sequence processing, IEEE Transactions on Neural Networks, vol.7, issue.5, pp.1231-1249, 1996. ,
DOI : 10.1109/72.536317
Variable length and context-dependent HMM letter form models for Arabic handwritten word recognition, Document Recognition and Retrieval XIX, pp.829708-829708, 2012. ,
DOI : 10.1117/12.912093
A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models, 1998. ,
Praat, a system for doing phonetics by computer, Glot International, vol.5, issue.9/10, pp.341-345, 2001. ,
Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription, 2012. ,
Voice puppetry, Proceedings of the 26th annual conference on Computer graphics and interactive techniques , SIGGRAPH '99, pp.21-28, 1999. ,
DOI : 10.1145/311535.311537
Rigid head motion in expressive speech animation: Analysis and synthesis. Audio, Speech, and Language Processing, IEEE Transactions on, vol.15, issue.3, pp.1075-1086, 2007. ,
Natural head motion synthesis driven by acoustic prosodic features, Computer Animation and Virtual Worlds, vol.16, issue.3-4, 2005. ,
DOI : 10.1002/cav.80
Matrix updates for perceptron training of continuous density hidden Markov models, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, p.20, 2009. ,
DOI : 10.1145/1553374.1553394
How to Train Your Avatar: A Data Driven Approach to Gesture Generation, The 11th International Conference on Intelligent Virtual Agents, 2011. ,
DOI : 10.1007/978-3-642-23974-8_14
The CHiME corpus: a resource and a challenge for computational hearing in multisource environments, Proc. Interspeech '10, Makuhari, 2010. ,
Visual prosody analysis for realistic motion synthesis of 3D head models, Proc. of ICAV3D'01 - International Conference on Augmented, Virtual Environments and 3D Imaging, pp.343-346, 2001. ,
A Study of Variable-Parameter Gaussian Mixture Hidden Markov Modeling for Noisy Speech Recognition, ICASSP '03 ,
DOI : 10.1109/TASL.2006.889791
A study of variable-parameter gaussian mixture hidden markov modeling for noisy speech recognition. Audio, Speech, and Language Processing, IEEE Transactions on, vol.15, issue.4, pp.1366-1376, 2007. ,
Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B, vol.39, issue.1, pp.1-38, 1977. ,
A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal, Signal Processing, vol.27, issue.1, pp.65-78, 1992. ,
DOI : 10.1016/0165-1684(92)90112-A
Speech recognition using hidden markov models with polynomial regression functions as nonstationary states. Speech and Audio Processing, IEEE Transactions on, vol.2, issue.4, pp.507-520, 1994. ,
Conditional random field for tracking user behavior based on his eye's movements, Citeseer. Bibliography, vol.139, p.19, 2005. ,
Conditional random fields for online handwriting recognition, Adv in Neur Inf Proc Sys, pp.1097-1104, 2006. ,
URL : https://hal.archives-ouvertes.fr/inria-00104207
Large margin training for hidden Markov models with partially observed states, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pp.265-272, 2009. ,
DOI : 10.1145/1553374.1553408
URL : https://hal.archives-ouvertes.fr/hal-01294610
Facial Action Coding System: A Technique for the Measurement of Facial Movement, 1978. ,
Chalearn multi-modal gesture recognition 2013: grand challenge and workshop summary, ICMI, pp.365-368 ,
A 3-D Audio-Visual Corpus of Affective Communication, IEEE Transactions on Multimedia, vol.12, issue.6, pp.591-598, 2010. ,
DOI : 10.1109/TMM.2010.2052239
Automatic speech recognition using Hidden Conditional Neural Fields, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5036-5039, 2011. ,
DOI : 10.1109/ICASSP.2011.5947488
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.657.1405
Multiple-regression hidden Markov model, Acoustics, Speech, and Signal Processing Proceedings. (ICASSP '01). 2001 IEEE International Conference on, pp.513-516, 2001. ,
Speaker-independent isolated word recognition using dynamic features of speech spectrum, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.34, issue.1, pp.52-59, 1986. ,
DOI : 10.1109/TASSP.1986.1164788
Speech recognition in noisy environments: A survey, Speech Communication, vol.16, issue.3, pp.261-291, 1995. ,
DOI : 10.1016/0167-6393(94)00059-J
Carnegie Mellon University motion capture database ,
Hybrid speech recognition with Deep Bidirectional LSTM, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, pp.273-278, 2013. ,
DOI : 10.1109/ASRU.2013.6707742
Hidden conditional random fields for phone classification, pp.1117-1120, 2005. ,
Markov field on finite graphs and lattices, 1971. ,
Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, IEEE Signal Processing Magazine, vol.29, issue.6, pp.82-97, 2012. ,
DOI : 10.1109/MSP.2012.2205597
Reducing the Dimensionality of Data with Neural Networks, Science, vol.313, issue.5786, pp.504-507, 2006. ,
DOI : 10.1126/science.1127647
Improving neural networks by preventing co-adaptation of feature detectors, 2012. ,
Long Short-Term Memory, Neural Computation, vol.9, issue.8, pp.1735-1780, 1997. ,
DOI : 10.1162/neco.1997.9.8.1735
Yamagishi J.: Speech driven head motion synthesis based on a trajectory model, Proc. SIGGRAPH, 2007. ,
Shared-distribution hidden markov models for speech recognition. Speech and Audio Processing, IEEE Transactions on, vol.1, issue.4, pp.414-420, 1993. ,
Tutorial on training recurrent neural networks, p.48, 2002. ,
Discriminative training of HMMs for automatic speech recognition: A survey, Computer Speech & Language, vol.24, issue.4, pp.589-608, 2010. ,
DOI : 10.1016/j.csl.2009.08.002
Discriminative learning for minimum error classification [pattern recognition]. Signal Processing, IEEE Transactions on, vol.40, issue.12, pp.3043-3054, 1992. ,
A comparative study of two state-of-the-art sequence processing techniques for hand gesture recognition, Computer Vision and Image Understanding, vol.113, issue.4, pp.532-543, 2009. ,
DOI : 10.1016/j.cviu.2008.12.001
Conditional random fields: Probabilistic models for segmenting and labeling sequence data, Proceedings of the Eighteenth International Conference on Machine Learning, ICML '01, pp.282-289, 2001. ,
Live Speech Driven Head-and-Eye Motion Generators, IEEE Transactions on Visualization and Computer Graphics, vol.18, issue.11, pp.1902-1914, 2012. ,
DOI : 10.1109/TVCG.2012.74
A tutorial on energy-based learning, Predicting Structured Data, 2006. ,
Speaker-independent phone recognition using hidden Markov models. Acoustics, Speech and Signal Processing, IEEE Transactions on, vol.37, issue.11, pp.1641-1648, 1989. ,
Gesture controllers, 2010. ,
DOI : 10.1145/1778765.1778861
Real-time prosody-driven synthesis of body language, ACM Trans. Graph, vol.28, issue.5, pp.172:1-172:10, 2009. ,
Learning dynamic audio-visual mapping with input-output hidden Markov models. Multimedia, IEEE Transactions on, vol.8, issue.3, pp.542-549, 2006. ,
High accuracy phone recognition using context clustering and quasi-triphonic models, Computer Speech & Language, vol.8, issue.2, pp.129-151, 1994. ,
DOI : 10.1006/csla.1994.1006
Training Algorithms for Hidden Conditional Random Fields, 2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, 2006. ,
DOI : 10.1109/ICASSP.2006.1660010
Generating human-like behaviors using joint, speech-driven models for conversational agents. Audio, Speech, and Language Processing, IEEE Transactions on, vol.20, issue.8, pp.2329-2340, 2012. ,
Margin-space integration of MPE loss via differencing of MMI functionals for generalized error-weighted discriminative training, INTERSPEECH, pp.224-227, 2009. ,
Time Series Modeling with Hidden Variables and Gradient-based Algorithms, 2011. ,
Dynamic Factor Graphs for Time Series Modeling, Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II, ECML PKDD '09, pp.128-143, 2009. ,
DOI : 10.1007/978-3-642-04174-7_9
Documentation mocap database HDM05, 2007. ,
Loopy belief propagation for approximate inference: An empirical study, Proceedings of Uncertainty in AI, pp.467-475, 1999. ,
Discriminative training methods and their applications to handwriting recognition, 2005. ,
MPEG-4 Facial Animation: The Standard, Implementation and Applications, 2003. ,
DOI : 10.1002/0470854626
The Lincoln tied-mixture HMM continuous speech recognizer, Acoustics, Speech, and Signal Processing (ICASSP-91), 1991 International Conference on, pp.329-332, 1991. ,
Minimum phone error and i-smoothing for improved discriminative training, Acoustics, Speech, and Signal Processing (ICASSP) IEEE International Conference on, pp.105-108, 2002. ,
Hidden-state conditional random fields, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007. ,
DOI : 10.1109/tpami.2007.1124
Fundamentals of speech recognition, 1993. ,
A tutorial on hidden markov models and selected applications in speech recognition, Proceedings of the IEEE, pp.257-286, 1989. ,
Hidden Conditional Random Fields for Meeting Segmentation, Multimedia and Expo, 2007 IEEE International Conference on, pp.639-642, 2007. ,
DOI : 10.1109/ICME.2007.4284731
Analysis of head gesture and prosody patterns for prosody-driven head-gesture animation. Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol.30, issue.8, pp.1330-1345, 2008. ,
Dynamic Time Warping Algorithm Review, 2008. ,
Large margin training of acoustic models for speech recognition, 2006. ,
Large margin hidden markov models for automatic speech recognition, pp.1249-1256, 2007. ,
Improving neural networks with dropout (doctoral dissertation, University of Toronto), 2013. ,
Hidden Conditional Random Fields for phone recognition, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, pp.107-112, 2009. ,
DOI : 10.1109/ASRU.2009.5373329
An introduction to conditional random fields for relational learning, Introduction to Statistical Relational Learning, 2007. ,
Speech parameter generation algorithms for HMM-based speech synthesis, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), pp.1315-1318, 2000. ,
DOI : 10.1109/ICASSP.2000.861820
Dimensionality reduction: A comparative review, 2009. ,
The second ‘CHiME’ speech separation and recognition challenge: Datasets, tasks and baselines, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.126-130, 2013. ,
DOI : 10.1109/ICASSP.2013.6637622
Joint Optimization of Hidden Conditional Random Fields and Non Linear Feature Extraction, 2011 International Conference on Document Analysis and Recognition, pp.513-517, 2011. ,
DOI : 10.1109/ICDAR.2011.109
URL : https://hal.archives-ouvertes.fr/hal-00706021
Regularization of neural networks using dropconnect, ICML (3), volume 28 of JMLR Proceedings, pp.1058-1066, 2013. ,
Gaussian Process Dynamical Models for Human Motion, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.30, issue.2, 2007. ,
DOI : 10.1109/TPAMI.2007.1167
A comparison of word graph and n-best list based confidence measures, Proc. EUROSPEECH, pp.315-318, 1999. ,
Parametric hidden markov models for gesture recognition. Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol.21, issue.9, pp.884-900, 1999. ,
Acoustically-driven Talking Face Animations Using Dynamic Bayesian Networks, 2008. ,
DOI : 10.1109/icme.2006.262743
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.331.9125
State clustering in hidden Markov model-based continuous speech recognition, Computer Speech & Language, vol.8, issue.4, pp.369-383, 1994. ,
DOI : 10.1006/csla.1994.1019
The HTK Book, version 3.4, 2006. ,
Large-Margin Discriminative Training of Hidden Markov Models for Speech Recognition, International Conference on Semantic Computing (ICSC 2007), pp.429-438, 2007. ,
DOI : 10.1109/ICSC.2007.11
A novel framework and training algorithm for variable-parameter HMMs, IEEE Trans. on Audio, Speech, and Language Processing, vol.17, issue.7, pp.1348-1360, 2009. ,
Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences, Computer Speech & Language, vol.21, issue.1, pp.153-173, 2007. ,
DOI : 10.1016/j.csl.2006.01.002