E. M. Airoldi, D. M. Blei, S. E. Fienberg, and E. P. Xing, Mixed membership stochastic blockmodels, Journal of Machine Learning Research, vol.9, pp.1981-2014, 2008.

H. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control, vol.19, issue.6, pp.716-723, 1974.
DOI : 10.1109/TAC.1974.1100705

T. Alexandrov, J. Decker, B. Mertens, A. M. Deelder, R. A. Tollenaar et al., Biomarker discovery in MALDI-TOF serum protein profiles using discrete wavelet transformation, Bioinformatics, vol.25, issue.5, pp.643-649, 2009.
DOI : 10.1093/bioinformatics/btn662

D. M. Allen, The relationship between variable selection and data augmentation and a method for prediction, Technometrics, vol.16, pp.125-127, 1974.

J. Baek, G. McLachlan, and L. Flack, Mixtures of Factor Analyzers with Common Factor Loadings: Applications to the Clustering and Visualisation of High-Dimensional Data, IEEE Transactions on Pattern Analysis and Machine Intelligence, pp.1-13, 2009.

M. Barker and W. Rayens, Partial least squares for discrimination, Journal of Chemometrics, vol.10, issue.3, pp.166-173, 2003.
DOI : 10.1002/cem.785

S. Bashir and E. Carter, High breakdown mixture discriminant analysis, Journal of Multivariate Analysis, vol.93, issue.1, pp.102-111, 2005.
DOI : 10.1016/j.jmva.2003.12.003

URL : http://doi.org/10.1016/j.jmva.2003.12.003

A. Bellas, C. Bouveyron, M. Cottrell, and J. Lacaille, Robust clustering of high-dimensional data, Proceedings of the 20th European Symposium on Artificial Neural Networks, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00707055

R. Bellman, Dynamic Programming, 1957.

H. Bensmail and G. Celeux, Regularized Gaussian Discriminant Analysis through Eigenvalue Decomposition, Journal of the American Statistical Association, vol.91, issue.436, pp.1743-1748, 1996.
DOI : 10.1002/0471725293

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.143.1873

C. Bernard-Michel, S. Douté, M. Fauvel, L. Gardes, and S. Girard, Retrieval of Mars surface physical properties from OMEGA hyperspectral images using regularized sliced inverse regression, Journal of Geophysical Research, vol.20, issue.2, p.77, 2009.
DOI : 10.1029/2008JE003171

URL : https://hal.archives-ouvertes.fr/inria-00276116

C. Biernacki, F. Beninel, and V. Bretagnolle, A Generalized Discriminant Rule When Training Population and Test Population Differ on Their Descriptive Parameters, Biometrics, vol.4, issue.Part II, pp.387-397, 2002.
DOI : 10.1111/j.0006-341X.2002.00387.x

URL : https://hal.archives-ouvertes.fr/hal-00191396

C. Biernacki, G. Celeux, and G. Govaert, Assessing a mixture model for clustering with the integrated completed likelihood, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.22, issue.7, pp.719-725, 2000.
DOI : 10.1109/34.865189

A. Blum and T. Mitchell, Combining labeled and unlabeled data with co-training, Proceedings of the eleventh annual conference on Computational learning theory, COLT '98, pp.92-100, 1998.
DOI : 10.1145/279943.279962

URL : http://axon.cs.byu.edu/~martinez/classes/678/Papers/Mitchell_cotraining.pdf

H. Bock, Probabilistic models in cluster analysis, Computational Statistics & Data Analysis, vol.23, issue.1, pp.5-28, 1996.
DOI : 10.1016/0167-9473(96)88919-5

R. Boulet, B. Jouve, F. Rossi, and N. Villa, Batch kernel SOM and related Laplacian methods for social network analysis, Neurocomputing, vol.71, issue.7-9, pp.1257-1273, 2008.
DOI : 10.1016/j.neucom.2007.12.026

URL : https://hal.archives-ouvertes.fr/hal-00202339

C. Bouveyron and H. Chipman, Visualization and classification of graph-structured data: the case of the Enron dataset, 2007 International Joint Conference on Neural Networks, pp.1506-1511, 2007.
DOI : 10.1109/IJCNN.2007.4371181

C. Bouveyron, H. Chipman, and E. Côme, Supervised classification and visualization of social networks based on a probabilistic latent space model, 7th International Workshop on Mining and Learning with Graphs, pp.55-57, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00407831

C. Bouveyron, S. Girard, and M. Olteanu, Supervised classification of categorical data with uncertain labels for DNA barcoding, 17th European Symposium on Artificial Neural Networks, pp.29-34, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00407834

G. Celeux and J. Diebolt, The SEM algorithm: a probabilistic teacher algorithm from the EM algorithm for the mixture problem, Computational Statistics Quarterly, vol.2, issue.1, pp.73-92, 1985.

G. Celeux and G. Govaert, A classification EM algorithm for clustering and two stochastic versions, Computational Statistics & Data Analysis, vol.14, issue.3, pp.315-332, 1992.
DOI : 10.1016/0167-9473(92)90042-E

URL : https://hal.archives-ouvertes.fr/inria-00075196

G. Celeux, M. Hurn, and C. Robert, Computational and Inferential Difficulties with Mixture Posterior Distributions, Journal of the American Statistical Association, vol.60, issue.451, pp.957-970, 2000.
DOI : 10.1080/01621459.1995.10476589

URL : https://hal.archives-ouvertes.fr/inria-00073049

J. Couto, Kernel k-means for categorical data, in Advances in Intelligent Data Analysis VI, Lecture Notes in Computer Science, vol.3646, pp.739-739

B. Dasarathy, Nosing around the neighbourhood: a new system structure and classification rule for recognition in partially exposed environments, IEEE Trans. Pattern Analysis and Machine Intelligence, vol.2, pp.67-71, 1980.

A. Delaigle and P. Hall, Defining probability density for a distribution of random functions, The Annals of Statistics, pp.1171-1193, 2010.

A. Dempster, N. Laird, and D. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, vol.39, issue.1, pp.1-38, 1977.

M. M. Dundar and D. A. Landgrebe, Toward an optimal supervised classifier for the analysis of hyperspectral data, IEEE Transactions on Geoscience and Remote Sensing, vol.42, issue.1, pp.271-277

B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, Least angle regression, Annals of Statistics, vol.32, issue.2, pp.407-499, 2004.

M. Fan, H. Qiao, and B. Zhang, Intrinsic dimension estimation of manifolds by incising balls, Pattern Recognition, vol.42, issue.5, pp.780-787, 2009.
DOI : 10.1016/j.patcog.2008.09.016

R. A. Fisher, The use of multiple measurements in taxonomic problems, Annals of Eugenics, vol.59, issue.2, pp.179-188, 1936.
DOI : 10.1111/j.1469-1809.1936.tb02137.x

B. Flury, Common Principal Components in K Groups, Journal of the American Statistical Association, vol.79, issue.388, pp.892-897, 1984.
DOI : 10.2307/2288721

D. H. Foley and J. W. Sammon, An Optimal Set of Discriminant Vectors, IEEE Transactions on Computers, vol.24, issue.3, pp.281-289, 1975.
DOI : 10.1109/T-C.1975.224208

C. Fraley and A. Raftery, MCLUST: Software for Model-Based Cluster Analysis, Journal of Classification, vol.16, issue.2, pp.297-306, 1999.
DOI : 10.1007/s003579900058

C. Fraley and A. Raftery, Model-Based Clustering, Discriminant Analysis, and Density Estimation, Journal of the American Statistical Association, vol.97, issue.458, pp.611-631, 2002.
DOI : 10.1198/016214502760047131

J. H. Friedman, Regularized Discriminant Analysis, Journal of the American Statistical Association, vol.33, issue.405, pp.165-175, 1989.
DOI : 10.1080/01621459.1989.10478752

L. A. García-Escudero, A. Gordaliza, and C. Matrán, Trimming Tools in Exploratory Data Analysis, Journal of Computational and Graphical Statistics, vol.12, issue.2, pp.434-449, 2003.
DOI : 10.1198/1061860031806

L. A. García-Escudero, A. Gordaliza, C. Matrán, and A. Mayo-Iscar, A general trimming approach to robust cluster analysis, The Annals of Statistics, pp.1324-1345, 2008.

I. Guyon, U. von Luxburg, and R. Williamson, Clustering: Science or art?, NIPS 2009 Workshop on Clustering Theory, p.26, 2009.

I. Guyon, N. Matic, and V. Vapnik, Discovering informative patterns and data cleaning, in Advances in Knowledge Discovery and Data Mining, pp.181-203, 1996.

M. Handcock, A. Raftery, and J. Tantrum, Model-based clustering for social networks, Journal of the Royal Statistical Society: Series A (Statistics in Society), vol.6, issue.2, pp.1-22, 2007.
DOI : 10.1111/j.1467-9574.2005.00283.x

T. Hastie, A. Buja, and R. Tibshirani, Penalized Discriminant Analysis, The Annals of Statistics, vol.23, issue.1, pp.73-102, 1995.
DOI : 10.1214/aos/1176324456

T. Hastie and R. Tibshirani, Discriminant analysis by Gaussian mixtures, Journal of the Royal Statistical Society, vol.58, issue.1, pp.155-176, 1996.

D. Hawkins and G. McLachlan, High-Breakdown Linear Discriminant Analysis, Journal of the American Statistical Association, vol.16, issue.437, pp.136-143, 1997.
DOI : 10.1080/01621459.1997.10473610

P. Hoff, A. Raftery, and M. Handcock, Latent Space Approaches to Social Network Analysis, Journal of the American Statistical Association, vol.97, issue.460, pp.1090-1098, 2002.
DOI : 10.1198/016214502388618906

J. Jacques and C. Biernacki, Extension of model-based classification for binary data when training and test populations differ, Journal of Applied Statistics, vol.1, issue.1, pp.749-766
DOI : 10.2307/4088413

URL : https://hal.archives-ouvertes.fr/hal-00316080

G. M. James and C. A. Sugar, Clustering for Sparsely Sampled Functional Data, Journal of the American Statistical Association, vol.98, issue.462, pp.397-408, 2003.
DOI : 10.1198/016214503000189

I. T. Jolliffe, Principal Component Analysis, p.12, 2002.
DOI : 10.1007/978-1-4757-1904-8

B. Krishnapuram, D. Williams, Y. Xue, A. Hartemink, L. Carin et al., On semi-supervised classification, NIPS, 2004.

N. Lawrence and B. Schölkopf, Estimating a kernel Fisher discriminant in the presence of label noise, Proc. of 18th International Conference on Machine Learning, pp.306-313, 2001.

E. Levina and P. Bickel, Maximum Likelihood Estimation of Intrinsic Dimension, 17th Annual Conference on Neural Information Processing Systems, pp.17-20, 2005.

Y. Li, L. Wessels, D. de Ridder, and M. Reinders, Classification in the presence of class noise using a probabilistic Kernel Fisher method, Pattern Recognition, vol.40, issue.12, pp.3349-3357, 2007.
DOI : 10.1016/j.patcog.2007.05.006

B. G. Lindsay, Mixture models: Theory, geometry and applications, NSF-CBMS Regional Conference Series in Probability and Statistics, 1995.

S. A. Macskassy and F. Provost, Classification in networked data: A toolkit and a univariate case study, Journal of Machine Learning Research, vol.8, pp.935-983, 2007.

M. Markou and S. Singh, Novelty detection: a review, part 1: statistical approaches, Signal Processing, vol.83, issue.12, pp.2481-2497, 2003.
DOI : 10.1016/j.sigpro.2003.07.018

M. Markou and S. Singh, Novelty detection: a review, part 2: neural network based approaches, Signal Processing, vol.83, issue.12, pp.2499-2521, 2003.
DOI : 10.1016/j.sigpro.2003.07.019

C. Maugis, G. Celeux, and M. Martin-Magniette, Variable Selection for Clustering with Gaussian Mixture Models, Biometrics, vol.100, issue.3, pp.701-709, 2009.
DOI : 10.1111/j.1541-0420.2008.01160.x

URL : https://hal.archives-ouvertes.fr/inria-00153057

C. Maugis, G. Celeux, and M. Martin-Magniette, Variable selection in model-based clustering: A general variable role modeling, Computational Statistics & Data Analysis, vol.53, issue.11, pp.3872-3882, 2009.
DOI : 10.1016/j.csda.2009.04.013

URL : https://hal.archives-ouvertes.fr/inria-00342108

G. McLachlan, Iterative Reclassification Procedure for Constructing an Asymptotically Optimal Rule of Allocation in Discriminant Analysis, Journal of the American Statistical Association, vol.31, issue.350, pp.365-369, 1975.
DOI : 10.1080/01621459.1975.10479874

G. McLachlan, Discriminant Analysis and Statistical Pattern Recognition, pp.11-52, 1992.
DOI : 10.1002/0471725293

G. McLachlan and T. Krishnan, The EM algorithm and extensions, pp.11-14, 1997.

G. McLachlan and D. Peel, Finite Mixture Models, 2000.
DOI : 10.1002/0471721182

G. McLachlan, D. Peel, and R. Bean, Modelling high-dimensional data by mixtures of factor analyzers, Computational Statistics & Data Analysis, vol.41, issue.3-4, pp.379-388, 2003.
DOI : 10.1016/S0167-9473(02)00183-4

P. McNicholas and B. Murphy, Parsimonious Gaussian mixture models, Statistics and Computing, vol.61, issue.3, pp.285-296, 2008.
DOI : 10.1007/s11222-008-9056-0

S. Mika, G. Rätsch, J. Weston, B. Schölkopf, and K.-R. Müller, Fisher discriminant analysis with kernels, Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop (Cat. No.98TH8468), p.70, 1999.
DOI : 10.1109/NNSP.1999.788121

J. Mingers, An empirical comparison of pruning methods for decision tree induction, Machine Learning, vol.4, issue.2, pp.227-243, 1989.
DOI : 10.1023/A:1022604100933

T. Minka, Automatic choice of dimensionality for PCA, 13th Annual Conference on Neural Information Processing Systems, pp.17-20, 2000.

A. Montanari and C. Viroli, Heteroscedastic Factor Mixture Analysis, Statistical Modelling: An International Journal, pp.441-460, 2010.

J. L. Moreno, Who shall survive? A new approach to the problem of human interrelations, Nervous and Mental Disease Publishing, p.55, 1934.

M. E. Newman, Fast algorithm for detecting community structure in networks, Physical Review E, vol.69, issue.6, p.56, 2004.
DOI : 10.1103/PhysRevE.69.066133

T. O'Neill, Normal discrimination with unclassified observations, Journal of the American Statistical Association, vol.73, pp.821-826, 1978.

W. Pan and X. Shen, Penalized model-based clustering with application to variable selection, Journal of Machine Learning Research, vol.8, pp.1145-1164, 2007.

T. Pavlenko, On feature selection, curse-of-dimensionality and error probability in discriminant analysis, Journal of Statistical Planning and Inference, vol.115, issue.2, pp.565-584, 2003.
DOI : 10.1016/S0378-3758(02)00166-0

T. Pavlenko and D. von Rosen, Effect of dimensionality on discrimination, Statistics, vol.9, issue.3, pp.191-213, 2001.
DOI : 10.1016/0031-3203(90)90100-Y

Z. Qiao, L. Zhou, and J. Z. Huang, Sparse linear discriminant analysis with applications to high dimensional low sample size data, International Journal of Applied Mathematics, vol.39, issue.1, p.31, 2009.

A. Raftery and N. Dean, Variable Selection for Model-Based Clustering, Journal of the American Statistical Association, vol.101, issue.473, pp.168-178, 2006.
DOI : 10.1198/016214506000000113

J. O. Ramsay and B. W. Silverman, Functional Data Analysis. Springer Series in Statistics, Biometrical Journal, vol.40, issue.1, pp.60-61, 2005.
DOI : 10.1002/(SICI)1521-4036(199804)40:1<56::AID-BIMJ56>3.0.CO;2-#

F. Rossi and N. Villa-Vialaneix, Représentation d'un grand réseau à partir d'une classification hiérarchique de ses sommets, Journal de la Société Française de Statistique, pp.34-65, 2011.

P. J. Rousseeuw and A. Leroy, Robust Regression and Outlier Detection, p.45, 1987.
DOI : 10.1002/0471725382

S. Sampson, A novitiate in a period of change: An experimental and case study of relationships, p.58, 1968.

B. Schölkopf and A. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, p.70, 2001.

B. Schölkopf and A. Smola, Learning with Kernels, p.16, 2002.

J. Schott, Dimensionality reduction in quadratic discriminant analysis, Computational Statistics & Data Analysis, vol.16, issue.2, pp.161-174, 1993.
DOI : 10.1016/0167-9473(93)90111-6

G. Schwarz, Estimating the dimension of a model, The Annals of Statistics, pp.461-464, 1978.

D. Scott and J. Thompson, Probability density estimation in higher dimensions, Fifteenth Symposium in the Interface, pp.173-179, 1983.

B. Schölkopf, R. Williamson, A. Smola, J. Shawe-Taylor, and J. Platt, Support vector method for novelty detection, Advances in Neural Information Processing Systems, pp.582-588, 2000.

A. Smola and R. Kondor, Kernels and Regularization on Graphs, Proc. Conf. on Learning Theory and Kernel Machines, pp.144-158, 2003.
DOI : 10.1007/978-3-540-45167-9_12

A. Storkey and M. Sugiyama, Mixture regression for covariate shift, in Advances in Neural Information Processing Systems 19, pp.1337-1344, 2007.

M. Sugiyama, Active learning in approximately linear regression based on conditional expectation of generalization error, Journal of Machine Learning Research, vol.7, pp.141-166, 2006.

M. Sugiyama, T. Idé, S. Nakajima, and J. Sese, Semi-supervised local Fisher discriminant analysis for dimensionality reduction, Machine Learning, pp.35-61, 2009.

M. Sugiyama and K. Müller, Input-dependent estimation of generalization error under covariate shift, Statistics & Decisions, vol.23, issue.4, p.36, 2005.
DOI : 10.1524/stnd.2005.23.4.249

M. Sugiyama, M. Krauledat, and K.-R. Müller, Covariate shift adaptation by importance weighted cross validation, Journal of Machine Learning Research, vol.8, pp.985-1005, 2007.

D. Tax and R. Duin, Outlier detection using classifier instability, Advances in Pattern Recognition, pp.251-256, 1999.
DOI : 10.1007/BFb0033283

M. Tipping and C. Bishop, Mixtures of Probabilistic Principal Component Analyzers, Neural Computation, vol.2, issue.1, pp.443-482, 1999.
DOI : 10.1007/BF00162527

M. Tipping and C. Bishop, Probabilistic Principal Component Analysis, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.61, issue.3, pp.611-622, 1999.
DOI : 10.1111/1467-9868.00196

D. Tyler, Asymptotic Inference for Eigenvectors, The Annals of Statistics, vol.9, issue.4, pp.725-736, 1981.
DOI : 10.1214/aos/1176345514

D. M. Witten and R. Tibshirani, A Framework for Feature Selection in Clustering, Journal of the American Statistical Association, vol.105, issue.490, pp.713-726, 2010.
DOI : 10.1198/jasa.2010.tm09415

D. M. Witten, R. Tibshirani, and T. Hastie, A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis, Biostatistics, vol.10, issue.3, pp.515-534, 2009.
DOI : 10.1093/biostatistics/kxp008

S. Wold, Pattern recognition by means of disjoint principal components models, Pattern Recognition, vol.8, issue.3, pp.127-139, 1976.
DOI : 10.1016/0031-3203(76)90014-5

B. Xie, W. Pan, and X. Shen, Penalized mixtures of factor analyzers with application to clustering high-dimensional microarray data, Bioinformatics, vol.26, issue.4, pp.501-508
DOI : 10.1093/bioinformatics/btp707

H. Zou and T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.5, issue.2, pp.301-320, 2005.
DOI : 10.1073/pnas.201162998

H. Zou, T. Hastie, and R. Tibshirani, On the "degrees of freedom" of the lasso, The Annals of Statistics, vol.35, issue.5, pp.2173-2192, 2007.
DOI : 10.1214/009053607000000127