Finally, local adaptation (e.g. adaptation of the gain factors [Benaroya-06, Vincent-04a]) and the global adaptation addressed in this thesis (see the discussion in Section 6.2.3). Indeed, if the sliding window is comparable in size to the recording, the adaptation is effectively global. Conversely, if the window size is on the order of one frame

Bibliography

S. F. Boll, Suppression of acoustic noise in speech using spectral subtraction, IEEE Trans. Acoustics, Speech and Signal Processing, vol.27, issue.2, pp.112-120, 1979.

G. J. Brown and M. P. Cooke, Computational auditory scene analysis, Computer Speech & Language, vol.8, issue.4, pp.297-336, 1994.
DOI : 10.1006/csla.1994.1016

S. Gannot, Speech enhancement using a mixture-maximum model, European Conf. on Speech Communication and Technology (EuroSpeech'99), pp.2591-2594, 1999.

J. Cardoso, Blind signal separation: statistical principles, Proc. IEEE, pp.2009-2025, 1998.
DOI : 10.1109/5.720250

M. Casey and A. Westner, Separation of mixed audio sources by independent subspace analysis, International Computer Music Conference, 2000.

K. Chen, W. Liau, H. Wang, and L. Lee, Fast speaker adaptation using eigenspace-based maximum likelihood linear regression, Intl. Conf. on Spoken Language Proc. (ICSLP'00), pp.742-745, 2000.

C. Chen, J. Bilmes, and K. Kirchhoff, Low-resource noise-robust feature postprocessing on Aurora 2.0, Intl. Conf. on Spoken Language Proc. (ICSLP'02), pp.2445-2448, 2002.

M. P. Cooke, P. D. Green, L. Josifovski, and A. Vizinho, Robust automatic speech recognition with missing and unreliable acoustic data, Speech Communication, vol.34, issue.3, pp.267-285, 2001.
DOI : 10.1016/S0167-6393(00)00034-0

M. Cooke, G. J. Brown, M. D. Crawford, and P. Green, Computational auditory scene analysis: listening to several things at once, Endeavour, vol.17, issue.4, pp.186-190, 1993.
DOI : 10.1016/0160-9327(93)90061-7

R. A. Curtis and R. J. Niederjohn, An investigation of several frequency-domain processing methods for enhancing the intelligibility of speech in wideband random noise, ICASSP '78. IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.602-605, 1978.
DOI : 10.1109/ICASSP.1978.1170571

J. R. Deller, J. J. Hansen, and J. G. Proakis, Discrete-Time Processing of Speech Signals, 1999.
DOI : 10.1109/9780470544402

A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, vol.39, pp.1-38, 1977.

D. P. Ellis and R. J. Weiss, Model-based monaural source separation using a vector-quantized phase-vocoder representation, IEEE Intl. Conf. on Acoustics, Speech and Signal Proc. (ICASSP'06), pp.957-960, 2006.

D. P. Ellis, Prediction-driven computational auditory scene analysis, 1996.

Y. Ephraim and D. Malah, Speech enhancement using a minimum mean square error log-spectral amplitude estimator, IEEE Trans. on Acoust

M. J. F. Gales, D. Pye, and P. Woodland, Variance compensation within the MLLR framework for robust speech recognition and speaker adaptation, Intl. Conf. on Spoken Language Proc. (ICSLP'96), pp.1832-1835, 1996.

J. Gauvain and C. Lee, Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains, IEEE Transactions on Speech and Audio Processing, vol.2, issue.2, pp.291-298, 1994.
DOI : 10.1109/89.279278

Z. Ghahramani and M. I. Jordan, Supervised learning from incomplete data via an EM approach, Neural Info. Processing Systems (NIPS'93), pp.120-127, 1993.

Z. Ghahramani and M. Jordan, Factorial hidden Markov models, Machine Learning, vol.29, issue.2/3, pp.245-273, 1997.
DOI : 10.1023/A:1007425814087

R. Gribonval, L. Benaroya, E. Vincent, and C. Févotte, Proposals for performance measurement in source separation, Intl. Conf. on Indep. Component Analysis and Blind Source Separation (ICA'03), pp.763-768, 2003.
URL : https://hal.archives-ouvertes.fr/inria-00570123

B. Helén and T. Virtanen, Separation of drums from polyphonic music using nonnegative matrix factorization and support vector machine, European Signal Processing Conference (EUSIPCO'05), 2005.

J. Hershey and M. Casey, Audio-visual sound separation via hidden Markov models, Advances in Neural Information Processing Systems (NIPS'01), 2001.

P. O. Hoyer, Non-negative sparse coding, Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, 2002.
DOI : 10.1109/NNSP.2002.1030067

URL : http://arxiv.org/abs/cs/0202009

G. Hu and D. L. Wang, Monaural speech separation, Neural Info. Processing Systems, 2003.

G. Jang and T. Lee, A maximum likelihood approach to single-channel source separation, Journal of Machine Learning Research, vol.4, pp.1365-1392, 2003.

M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul, An Introduction to Variational Methods for Graphical Models, Learning in Graphical Models, vol.37, issue.2, pp.183-233, 1999.
DOI : 10.1007/978-94-011-5014-9_5

S. M. Kay, Fundamentals of Statistical Signal Processing, Estimation Theory, 1993.

Y. E. Kim and B. Whitman, Singer identification in popular music recordings using voice coding features, Intl. Sympos. on Music Information Retrieval (ISMIR'02), pp.164-169, 2002.

M. Kim and S. Choi, Monaural Music Source Separation: Nonnegativity, Sparseness, and Shift-Invariance, Intl. Conf. on Indep. Component Analysis and Blind Source Separation (ICA'06), 2006.
DOI : 10.1007/11679363_77

V. Krishnamurthy and J. B. Moore, On-line estimation of hidden Markov model parameters based on the Kullback-Leibler information measure, IEEE Transactions on Signal Processing, vol.41, issue.8
DOI : 10.1109/78.229888

T. Kristjansson, H. Attias, and J. Hershey, Single microphone source separation using high resolution signal reconstruction, IEEE Intl. Conf. on Acoustics, Speech and Signal Proc. (ICASSP'04), pp.817-820, 2004.

Y. Li and D. L. Wang, Singing voice separation from monaural recordings, ISMIR'06, 2006.

A. Martin, G. Doddington, T. Kamm, M. Ordowski, and M. Przybocki, The DET curve in assessment of detection task performance, European Conf. on Speech Communication and Technology (EuroSpeech'97), pp.1895-1898, 1997.

G. McLachlan and T. Krishnan, The EM Algorithm and Extensions, 1997.

P. J. Moreno, B. Raj, and R. M. Stern, A vector Taylor series approach for environment-independent speech recognition, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, 1996.
DOI : 10.1109/ICASSP.1996.543225

K. P. Murphy, Dynamic Bayesian Networks: Representation, Inference and Learning, 2002.

A. Nádas, D. Nahamoo, and M. A. Picheny, Speech recognition using noise-adaptive prototype, IEEE Trans. on Speech and Audio Proc, pp.1495-1505, 1989.

F. D. Neeser and J. L. Massey, Proper complex random processes with applications to information theory, IEEE Transactions on Information Theory, vol.39, issue.4, pp.1293-1302, 1993.
DOI : 10.1109/18.243446

T. L. Nwe, A. Shenoy, and Y. Wang, Singing voice detection in popular music, Proceedings of the 12th annual ACM international conference on Multimedia, MULTIMEDIA '04, pp.324-327, 2004.
DOI : 10.1145/1027527.1027602

A. Ozerov, R. Gribonval, P. Philippe, and F. Bimbot, Séparation voix / musique à partir d'enregistrements mono quelques remarques sur le choix et l'adaptation des modèles, GRETSI'05 Symposium on Signal and Image Processing

A. Ozerov, P. Philippe, R. Gribonval, and F. Bimbot, One microphone singing voice separation using source-adapted models, IEEE Worksh. on Apps. of Signal Processing to Audio and Acoustics (WASPAA'05), pp.90-93, 2005.

W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes in C: The Art of Scientific Computing, Cambridge University Press, 1992.

L. R. Rabiner, A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition, Proceedings of the IEEE, pp.257-286, 1989.
DOI : 10.1016/B978-0-08-051584-7.50027-9

A. Reddy and B. Raj, Soft mask estimation for single channel speaker separation, ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing, 2004.

M. Reyes-Gomez, D. Ellis, and N. Jojic, Multiband audio modeling for single-channel acoustic source separation, IEEE Intl. Conf. on Acoustics, Speech and Signal Proc. (ICASSP'04), pp.641-644, 2004.

D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, Speaker Verification Using Adapted Gaussian Mixture Models, Digital Signal Processing, vol.10, issue.1-3, pp.19-41, 2000.
DOI : 10.1006/dspr.1999.0361

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

S. Rickard and O. Yilmaz, On the approximate W-disjoint orthogonality of speech

B. Rivet, L. Girin, C. Jutten, and J. Schwartz, Using audiovisual speech processing to improve the robustness of the separation of convolutive speech mixtures, IEEE 6th Workshop on Multimedia Signal Processing, 2004.
DOI : 10.1109/MMSP.2004.1436412

R. C. Rose, E. M. Hofstetter, and D. A. Reynolds, Integrated models of signal and background with application to speaker identification in noise, IEEE Transactions on Speech and Audio Processing, vol.2, issue.2, pp.245-257, 1994.
DOI : 10.1109/89.279273

S. T. Roweis, One microphone source separation, Advances in Neural Information Processing Systems, pp.793-799, 2001.

M. Ryynänen and A. Klapuri, Polyphonic music transcription using note event modeling, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005.
DOI : 10.1109/ASPAA.2005.1540233

M. N. Schmidt and M. Mørup, Nonnegative matrix factor 2-D deconvolution for blind single channel source separation, Intl. Conf. on Indep. Component Analysis and Blind Source Separation (ICA'06), 2006.

K. Shinoda and C. Lee, Structural MAP speaker adaptation using hierarchical priors, IEEE Workshop on Speech Recognition and Understanding, pp.381-388, 1997.

W. H. Tsai, D. Rogers, and H. M. Wang, Blind Clustering of Popular Music Recordings Based on Singer Voice Characteristics, Computer Music Journal, vol.39, issue.3, pp.68-78, 2004.
DOI : 10.1109/TSA.2002.800560

W. H. Tsai and H. M. Wang, Automatic detection and tracking of target singer in multi-singer music recordings, IEEE Intl. Conf. on Acoustics, Speech and Signal Proc. (ICASSP'04), pp.221-224, 2004.

J. Valin, J. Rouat, and F. Michaud, Microphone array post-filter for separation of simultaneous non-stationary sources, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004.
DOI : 10.1109/ICASSP.2004.1325962

R. Vergin, D. O'Shaughnessy, and A. Farhat, Generalized mel frequency cepstral coefficients for large-vocabulary speaker-independent continuous-speech recognition, IEEE Transactions on Speech and Audio Processing, vol.7, issue.5
DOI : 10.1109/89.784104

E. Vincent, Séparation de signaux audio : principes statistiques de l'analyse en composantes indépendantes et applications au signal monophonique, DEA ATIAM, IRCAM, 2001.

E. Vincent, C. Févotte, and R. Gribonval, A tentative typology of audio source separation tasks, Intl. Conf. on Indep. Component Analysis and Blind Source Separation (ICA'03), 2003.

E. Vincent, Modèles d'instruments pour la séparation de sources et la transcription d'enregistrements musicaux, 2004.

E. Vincent and X. Rodet, Underdetermined source separation with structured source priors, Intl. Conf. on Indep. Component Analysis and Blind Source Separation (ICA'04), pp.327-334, 2004.

E. Vincent, M. G. Jafari, S. A. Abdallah, M. D. Plumbley, and M. E. Davies, Blind audio source separation, 2005.
URL : https://hal.archives-ouvertes.fr/inria-00544230

E. Vincent, C. Févotte, and R. Gribonval, Performance measurement in blind audio source separation, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.4, pp.1462-1469, 2005.
DOI : 10.1109/TSA.2005.858005

E. Vincent and R. Gribonval, Construction d'estimateurs oracles pour la séparation de sources, GRETSI'05 Symposium on Signal and Image Processing, 2005.

B. Wang and M. D. Plumbley, Musical audio stream separation by non-negative matrix factorization, UK Digital Music Research Network (DMRN) Summer Conf, 2005.

R. J. Weiss and D. P. Ellis, Estimating single-channel source separation masks: Relevance vector machine classifiers vs. pitch-based masking, (SAPA'06) ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing, 2006.