K. Alsabti, S. Ranka, and V. Singh, An efficient k-means clustering algorithm, Proceedings of the First Workshop on High Performance Data Mining, p.75, 1998.

C. Andrieu, N. de Freitas, A. Doucet, and M. I. Jordan, An introduction to MCMC for machine learning, Machine Learning, pp.5-43, 2003.

F. Anouar, F. Badran, and S. Thiria, Probabilistic self-organizing map and radial basis function networks, Neurocomputing, vol.20, issue.1-3, pp.83-96, 1998.
DOI : 10.1016/S0925-2312(98)00026-5

M. Arbib, The handbook of brain theory and neural networks, p.148, 1995.

F. R. Bach and M. I. Jordan, Kernel independent component analysis, Journal of Machine Learning Research, vol.3, pp.1-48, 2002.

F. R. Bach and M. I. Jordan, Learning spectral clustering, Advances in Neural Information Processing Systems, p.17, 2004.

M. Balasubramanian, E. L. Schwartz, J. B. Tenenbaum, V. de Silva, and J. C. Langford, The Isomap Algorithm and Topological Stability, Science, vol.295, issue.5552, pp.2957-2992, 2002.
DOI : 10.1126/science.295.5552.7a

P. Baldi and K. Hornik, Neural networks and principal component analysis: Learning from examples without local minima, Neural Networks, vol.2, issue.1, pp.53-58, 1989.
DOI : 10.1016/0893-6080(89)90014-2

S. Baluja, Probabilistic modeling for face orientation discrimination: learning from labeled and unlabeled data, Advances in Neural Information Processing Systems, p.148, 1998.

S. Bandyopadhyay, U. Maulik, and M. K. Pakhira, Clustering using simulated annealing with probabilistic redistribution, International Journal of Pattern Recognition and Artificial Intelligence, vol.15, issue.02, pp.269-285, 2001.
DOI : 10.1142/S0218001401000927

M. Beal and Z. Ghahramani, The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures, Bayesian Statistics, pp.48-61, 2003.

M. Belkin and P. Niyogi, Laplacian eigenmaps and spectral techniques for embedding and clustering, Advances in Neural Information Processing Systems, pp.585-591, 2002.

M. Belkin and P. Niyogi, Laplacian Eigenmaps for Dimensionality Reduction and Data Representation, Neural Computation, vol.15, issue.6, pp.1373-1396, 2003.
DOI : 10.1162/089976603321780317

R. Bellman, Adaptive control processes: a guided tour, p.3, 1961.
DOI : 10.1515/9781400874668

Y. Bengio, J. Paiement, P. Vincent, O. Delalleau, N. Le Roux et al., Out-of-sample extensions for LLE, Isomap, MDS, eigenmaps, and spectral clustering, Advances in Neural Information Processing Systems, p.107, 2004.

J. L. Bentley, Multidimensional binary search trees used for associative searching, Communications of the ACM, vol.18, issue.9, pp.509-517, 1975.
DOI : 10.1145/361002.361007

J. L. Bentley, Multidimensional divide-and-conquer, Communications of the ACM, vol.23, issue.4, pp.214-229, 1980.
DOI : 10.1145/358841.358850

M. Bernstein, V. de Silva, J. Langford, and J. Tenenbaum, Graph approximations to geodesics on embedded manifolds, p.34, 2000.

C. M. Bishop, Neural networks for pattern recognition, pp.17-52, 1995.

C. M. Bishop, M. Svensén, and C. K. I. Williams, Developments of the generative topographic mapping, Neurocomputing, vol.21, issue.1-3, pp.203-224, 1998.
DOI : 10.1016/S0925-2312(98)00043-5

C. M. Bishop, M. Svensén, and C. K. I. Williams, GTM: The Generative Topographic Mapping, Neural Computation, vol.10, issue.1, pp.215-234, 1998.
DOI : 10.1162/089976698300017953

C. M. Bishop and M. E. Tipping, A hierarchical latent variable model for data visualization, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.20, issue.3, pp.281-293, 1998.
DOI : 10.1109/34.667885

C. Blake and C. Merz, UCI repository of machine learning databases, pp.69-102, 1998.

D. Blei, T. L. Griffiths, M. I. Jordan, and J. B. Tenenbaum, Hierarchical topic models and the nested Chinese restaurant process, 2004.

A. Blum and T. Mitchell, Combining labeled and unlabeled data with co-training, Proceedings of the Annual Conference on Computational Learning Theory, pp.92-100, 1998.

D. Böhning, A review of reliable maximum likelihood algorithms for semiparametric mixture models, Journal of Statistical Planning and Inference, vol.47, pp.5-28, 1995.

H. Bourlard and Y. Kamp, Auto-association by multilayer perceptrons and singular value decomposition, Biological Cybernetics, vol.59, issue.4-5, pp.291-294, 1988.
DOI : 10.1007/BF00332918

M. Brand, Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction, Neural Computation, vol.11, issue.5, pp.1155-1182, 1999.

M. Brand, Charting a manifold, Advances in Neural Information Processing Systems, pp.961-968, 2003.

M. Brand, Minimax embeddings, Advances in Neural Information Processing Systems, p.37, 2004.

M. Brand and K. Huang, A unifying theorem for spectral embedding and clustering, Proceedings of the International Workshop on Artificial Intelligence and Statistics, p.15, 2003.

G. Brassard and P. Bratley, Fundamentals of algorithmics, p.34, 1996.

M. Á. Carreira-Perpiñán, A review of dimension reduction techniques, pp.40-127, 1997.

G. Celeux, F. Forbes, and N. Payrard, EM procedures using mean field-like approximations for Markov model-based image segmentation, Pattern Recognition, vol.36, issue.1, pp.131-144, 2003.
DOI : 10.1016/S0031-3203(02)00027-4

URL : https://hal.archives-ouvertes.fr/inria-00072526

Y. Cheng, Mean shift, mode seeking, and clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.17, issue.8, pp.790-799, 1995.
DOI : 10.1109/34.400568

F. Chung, Spectral graph theory. Number 92 in CBMS Regional Conference Series in Mathematics, p.13, 1997.

T. Cover and J. Thomas, Elements of Information Theory, pp.44-65, 1991.

T. Cox and M. Cox, Multidimensional scaling. Number 59 in Monographs on statistics and applied probability, Chapman & Hall, pp.30-31, 1994.

S. Dasgupta, Learning mixtures of Gaussians, 40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039), pp.634-644, 1999.
DOI : 10.1109/SFFCS.1999.814639

D. de Ridder, Adaptive methods of image processing, p.51, 2001.

D. de Ridder and R. P. Duin, Sammon's mapping using neural networks: A comparison, Pattern Recognition Letters, vol.18, issue.11-13, pp.1307-1316, 1997.
DOI : 10.1016/S0167-8655(97)00093-7

D. de Ridder and R. P. Duin, Locally linear embedding for classification, Pattern Recognition Group, p.38, 2002.

D. de Ridder and V. Franc, Robust subspace mixture models using t-distributions, Proceedings of the British Machine Vision Conference 2003, pp.319-328, 2003.
DOI : 10.5244/C.17.35

P. Delicado and M. Huerta, Principal Curves of Oriented Points: theoretical and computational improvements, Computational Statistics, vol.18, issue.2, pp.293-315, 2003.
DOI : 10.1007/s001800300145

A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society. Series B (Methodological), vol.39, issue.1, pp.1-38, 1977.

R. DerSimonian, Maximum likelihood estimation of a mixing distribution, Journal of the Royal Statistical Society. Series C (Applied Statistics), vol.35, pp.302-309, 1986.

I. S. Dhillon, S. Mallela, and R. Kumar, Enhanced word clustering for hierarchical text classification, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '02, pp.191-200, 2002.
DOI : 10.1145/775047.775076

D. L. Donoho and C. Grimes, Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data, Proceedings of the National Academy of Sciences of the USA, pp.5591-5596, 2003.
DOI : 10.1073/pnas.1031596100

E. Erwin, K. Obermayer, and K. J. Schulten, Self-organizing maps: ordering, convergence properties and energy functions, Biological Cybernetics, vol.67, issue.1, pp.47-55, 1992.
DOI : 10.1007/BF00201801

C. Faloutsos and K. Lin, FastMap: a fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets, Proceedings of the ACM SIGMOD international conference on Management of data, pp.163-174, 1995.

M. A. Figueiredo and A. K. Jain, Unsupervised learning of finite mixture models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.24, issue.3, pp.381-396, 2002.
DOI : 10.1109/34.990138

I. K. Fodor, A survey of dimension reduction techniques, p.40, 2002.
DOI : 10.2172/15002155

A. Fred and A. K. Jain, Data clustering using evidence accumulation, Object recognition supported by user interaction for service robots, pp.276-280, 2002.
DOI : 10.1109/ICPR.2002.1047450

Y. Freund and R. E. Schapire, A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, Journal of Computer and System Sciences, vol.55, issue.1, pp.119-139, 1997.
DOI : 10.1006/jcss.1997.1504

S. Geman and D. Geman, Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.6, issue.6, pp.712-741, 1984.

A. Gersho and R. M. Gray, Vector quantization and signal compression, Series in Engineering and Computer Science. Kluwer, vol.159, p.11, 1992.
DOI : 10.1007/978-1-4615-3626-0

Z. Ghahramani and M. J. Beal, Variational inference for Bayesian mixtures of factor analysers, Advances in Neural Information Processing Systems, pp.449-455, 2000.

Z. Ghahramani and G. E. Hinton, The EM algorithm for mixtures of factor analyzers, Canada, pp.50-134, 1996.

Z. Ghahramani and M. I. Jordan, Supervised learning from incomplete data via an EM approach, Advances in Neural Information Processing Systems, pp.120-127, 1994.

M. Girolami, The topographic organization and visualization of binary data using multivariate-Bernoulli latent variable models, IEEE Transactions on Neural Networks, vol.12, issue.6, pp.1367-1374, 2001.
DOI : 10.1109/72.963773

G. H. Golub and C. F. Van Loan, Matrix computations, p.22, 1996.

T. Graepel, M. Burger, and K. Obermayer, Self-organizing maps: Generalizations and new optimization techniques, Neurocomputing, vol.21, issue.1-3, pp.173-190, 1998.
DOI : 10.1016/S0925-2312(98)00035-6

P. Green, Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika, vol.82, issue.4, pp.711-732, 1995.
DOI : 10.1093/biomet/82.4.711

J. H. Ham, D. D. Lee, and L. K. Saul, Learning high dimensional correspondences from low dimensional manifolds, Proceedings of the ICML workshop on the continuum from labeled to unlabeled data in machine learning and data mining, pp.34-41, 2003.

P. Hansen, E. Ngai, B. K. Cheung, and N. Mladenović, Analysis of global k-means, an incremental heuristic for minimum sum-of-squares clustering, HEC Montreal, Group for Research in Decision Analysis, p.68, 2002.

T. Hastie, J. Friedman, and R. Tibshirani, The elements of statistical learning, pp.17-40, 2001.

T. Hastie and W. Stuetzle, Principal Curves, Journal of the American Statistical Association, vol.84, issue.406, pp.502-516, 1989.
DOI : 10.1080/01621459.1989.10478797

X. He and P. Niyogi, Locality preserving projections, Advances in Neural Information Processing Systems, p.37, 2004.

T. Heskes, Self-organizing maps, vector quantization, and mixture modeling, IEEE Transactions on Neural Networks, vol.12, issue.6, pp.1299-1305, 2001.
DOI : 10.1109/72.963766

G. E. Hinton, P. Dayan, and M. Revow, Modeling the manifolds of images of handwritten digits, IEEE Transactions on Neural Networks, vol.8, issue.1, pp.65-74, 1997.
DOI : 10.1109/72.554192

G. E. Hinton and S. T. Roweis, Stochastic neighbor embedding, Advances in Neural Information Processing Systems, pp.833-840, 2003.

R. A. Horn and C. R. Johnson, Matrix analysis, pp.16-115, 1985.

R. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton, Adaptive Mixtures of Local Experts, Neural Computation, vol.3, issue.1, pp.79-87, 1991.
DOI : 10.1162/neco.1991.3.1.79

A. K. Jain and R. C. Dubes, Algorithms for clustering data, p.17, 1988.

A. K. Jain, M. N. Murty, and P. J. Flynn, Data clustering: a review, ACM Computing Surveys, vol.31, issue.3, pp.264-323, 1999.
DOI : 10.1145/331499.331504

S. C. Johnson, Hierarchical clustering schemes, Psychometrika, vol.32, issue.3, pp.241-254, 1967.
DOI : 10.1007/BF02289588

I. T. Jolliffe, Principal component analysis, p.50, 1986.
DOI : 10.1007/978-1-4757-1904-8

A. Kaban and M. Girolami, A combined latent class and trait model for the analysis and visualization of discrete data, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.23, issue.8, pp.859-872, 2001.
DOI : 10.1109/34.946989

N. Kambhatla and T. K. Leen, Fast non-linear dimension reduction, Advances in Neural Information Processing Systems, pp.51-108, 1994.

T. Kanungo, D. M. Mount, N. Netanyahu, C. Piatko, R. Silverman et al., An efficient k-means clustering algorithm: analysis and implementation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.24, issue.7, pp.881-892, 2002.
DOI : 10.1109/TPAMI.2002.1017616

D. R. Karger and M. Ruhl, Finding nearest neighbors in growth-restricted metrics, Proceedings of the thirty-fourth annual ACM symposium on Theory of computing, STOC '02, pp.741-750, 2002.
DOI : 10.1145/509907.510013

B. Kégl, Intrinsic dimension estimation using packing numbers, Advances in Neural Information Processing Systems, pp.681-688, 2003.

B. Kégl, A. Krzyzak, T. Linder, and K. Zeger, Learning and design of principal curves, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.22, issue.3, pp.281-297, 2000.
DOI : 10.1109/34.841759

S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, Optimization by Simulated Annealing, Science, vol.220, issue.4598, pp.671-680, 1983.
DOI : 10.1126/science.220.4598.671

T. Kohonen, Self-Organizing Maps. Springer Series in Information Sciences, pp.27-88, 2001.

T. Kohonen, S. Kaski, and H. Lappalainen, Self-Organized Formation of Various Invariant-Feature Filters in the Adaptive-Subspace SOM, Neural Computation, vol.9, issue.6, pp.1321-1344, 1997.
DOI : 10.1162/neco.1997.9.6.1321

T. Kohonen and P. Somervuo, How to make large self-organizing maps for nonvectorial data, Neural Networks, vol.15, issue.8-9, pp.945-952, 2002.
DOI : 10.1016/S0893-6080(02)00069-2

T. Kostiainen and J. Lampinen, On the generative probability density model in the self-organizing map, Neurocomputing, vol.48, issue.1-4, pp.217-228, 2002.
DOI : 10.1016/S0925-2312(01)00649-X

M. A. Kramer, Nonlinear principal component analysis using autoassociative neural networks, AIChE Journal, vol.37, issue.2, pp.233-243, 1991.
DOI : 10.1002/aic.690370209

K. Krishna and M. Murty, Genetic K-means algorithm, IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), vol.29, issue.3, pp.433-439, 1999.
DOI : 10.1109/3477.764879

J. Laaksonen, K. Koskela, S. Laakso, and E. Oja, Self-Organising Maps as a Relevance Feedback Technique in Content-Based Image Retrieval, Pattern Analysis & Applications, vol.4, issue.2-3, pp.140-152, 2001.
DOI : 10.1007/PL00014575

J. M. Lee, Introduction to smooth manifolds, Graduate Texts in Mathematics, vol.218, p.38, 2003.

Y. Leung, J. Zhang, and Z. Xu, Clustering by scale-space filtering, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.22, issue.12, pp.1396-1410, 2000.
DOI : 10.1109/34.895974

J. Q. Li, Estimation of Mixture Models, p.56, 1999.

J. Q. Li and A. R. Barron, Mixture density estimation, Advances in Neural Information Processing Systems, pp.53-56, 2000.

A. Likas, N. Vlassis, and J. J. Verbeek, The global k-means clustering algorithm, Pattern Recognition, vol.36, issue.2, p.66, 2003.

B. G. Lindsay, The Geometry of Mixture Likelihoods: A General Theory, The Annals of Statistics, vol.11, issue.1, pp.86-94, 1983.
DOI : 10.1214/aos/1176346059

G. J. McLachlan and D. Peel, Finite Mixture Models, p.42, 2000.
DOI : 10.1002/0471721182

M. Meilă and J. Shi, A random walks view of spectral segmentation, Proceedings of the International Workshop on Artificial Intelligence and Statistics, p.13, 2001.

S. Mika, G. Rätsch, and K.-R. Müller, A mathematical programming approach to the kernel Fisher algorithm, Advances in Neural Information Processing Systems, pp.591-597, 2001.

G. W. Milligan and M. C. Cooper, An examination of procedures for determining the number of clusters in a data set, Psychometrika, vol.50, issue.2, pp.159-179, 1985.
DOI : 10.1007/BF02294245

T. P. Minka, Automatic choice of dimensionality for PCA, Advances in Neural Information Processing Systems, pp.598-604, 2001.

A. W. Moore, Very fast EM-based mixture model clustering using multiresolution kd-trees, Advances in Neural Information Processing Systems, pp.543-549, 1999.

A. W. Moore and D. Pelleg, Accelerating exact k-means algorithms with geometric reasoning, Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.277-281, 1999.

R. M. Neal and G. E. Hinton, A View of the EM Algorithm that Justifies Incremental, Sparse, and other Variants, Learning in Graphical Models, pp.355-368, 1998.
DOI : 10.1007/978-94-011-5014-9_12

S. Negri and L. Belanche, Heterogeneous Kohonen Networks, Connectionist Models of Neurons, Learning Processes and Artificial Intelligence : 6th International Work-Conference on Artificial and Natural Neural Networks, pp.243-252, 2001.
DOI : 10.1007/3-540-45720-8_28

A. Y. Ng, M. I. Jordan, and Y. Weiss, Spectral clustering: analysis and an algorithm, Advances in Neural Information Processing Systems, pp.13-15, 2002.

K. Nigam, A. McCallum, S. Thrun, and T. Mitchell, Text classification from labeled and unlabeled documents using EM, Machine Learning, pp.103-134, 2000.

E. Oja, Simplified neuron model as a principal component analyzer, Journal of Mathematical Biology, vol.15, issue.3, pp.267-273, 1982.
DOI : 10.1007/BF00275687

E. Oja, Data compression, feature extraction, and autoassociation in feedforward neural networks, in T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas (eds.), Proceedings of the International Conference on Artificial Neural Networks, pp.737-745, 1991.

E. Parzen, On the estimation of a probability density function and mode, The Annals of Mathematical Statistics, pp.1064-1076, 1962.

K. Pearson, On lines and planes of closest fit to systems of points in space, The London, Edinburgh and Dublin Philosophical Magazine and Journal of Science, vol.6, issue.2, pp.559-572, 1901.

D. Pelleg and A. W. Moore, X-means: extending k-means with efficient estimation of the number of clusters, Proceedings of the International Conference on Machine Learning, pp.727-734, 2000.

J. M. Peña, J. A. Lozano, and P. Larrañaga, An empirical comparison of four initialization methods for the K-Means algorithm, Pattern Recognition Letters, vol.20, issue.10, pp.1027-1040, 1999.
DOI : 10.1016/S0167-8655(99)00069-0

G. Peters, B. Zitova, and C. von der Malsburg, How to measure the pose robustness of object views, Image and Vision Computing, vol.20, issue.4, pp.249-256, 2002.
DOI : 10.1016/S0262-8856(02)00006-9

J. M. Porta, J. J. Verbeek, and B. J. Kröse, Active appearance-based robot localization using stereo vision, Autonomous Robots, p.19, 2004.
URL : https://hal.archives-ouvertes.fr/inria-00321476

C. E. Rasmussen, The infinite Gaussian mixture model, Advances in Neural Information Processing Systems, pp.554-560, 2000.

S. Richardson and P. J. Green, On Bayesian Analysis of Mixtures with an Unknown Number of Components (with discussion), Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.59, issue.4, pp.731-792, 1997.
DOI : 10.1111/1467-9868.00095

B. D. Ripley, Pattern Recognition and Neural Networks, pp.17-69, 1996.
DOI : 10.1017/CBO9780511812651

J. Rissanen, Stochastic complexity in statistical inquiry, World Scientific, p.47, 1989.
DOI : 10.1142/0822

H. Ritter, Parametrized Self-Organizing Maps, Proceedings of the International Conference on Artificial Neural Networks, pp.568-577, 1993.
DOI : 10.1007/978-1-4471-2063-6_159

R. T. Rockafellar, Lagrange Multipliers and Optimality, SIAM Review, vol.35, issue.2, pp.183-238, 1993.
DOI : 10.1137/1035044

K. Rose, Deterministic annealing for clustering, compression, classification, regression, and related optimization problems, Proceedings of the IEEE, vol.86, issue.11, pp.2210-2239, 1998.
DOI : 10.1109/5.726788

R. Rosipal and L. J. Trejo, Kernel partial least squares regression in reproducing kernel Hilbert space, Journal of Machine Learning Research, vol.2, pp.97-123, 2001.

S. T. Roweis, EM Algorithms for PCA and SPCA, Advances in Neural Information Processing Systems, pp.626-632, 1998.

S. T. Roweis and L. K. Saul, Nonlinear Dimensionality Reduction by Locally Linear Embedding, Science, vol.290, issue.5500, pp.2323-2326, 2000.
DOI : 10.1126/science.290.5500.2323

S. T. Roweis, L. K. Saul, and G. E. Hinton, Global coordination of local linear models, Advances in Neural Information Processing Systems, pp.889-896, 2002.

Y. Rubner, C. Tomasi, and L. J. Guibas, A metric for distributions with applications to image databases, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271), pp.59-66, 1998.
DOI : 10.1109/ICCV.1998.710701

J. W. Sammon, A Nonlinear Mapping for Data Structure Analysis, IEEE Transactions on Computers, vol.18, issue.5, pp.401-409, 1969.
DOI : 10.1109/T-C.1969.222678

L. K. Saul and F. Pereira, Aggregate and mixed-order Markov models for statistical language processing, Proceedings of the Second Conference on Empirical Methods in Natural Language Processing, pp.81-89, 1997.

L. K. Saul and S. T. Roweis, Think globally, fit locally: unsupervised learning of low dimensional manifolds, Journal of Machine Learning Research, vol.4, pp.119-155, 2003.

B. Schölkopf, A. Smola, and K.-R. Müller, Nonlinear Component Analysis as a Kernel Eigenvalue Problem, Neural Computation, vol.10, issue.5, pp.1299-1319, 1998.
DOI : 10.1162/089976698300017467

B. Schölkopf and A. J. Smola, Learning with kernels, p.23, 2002.

P. H. Schönemann, On the Formal Differentiation of Traces and Determinants, Multivariate Behavioral Research, vol.20, issue.2, pp.113-139, 1985.
DOI : 10.1207/s15327906mbr2002_1

G. Schwarz, Estimating the Dimension of a Model, The Annals of Statistics, vol.6, issue.2, pp.461-464, 1978.
DOI : 10.1214/aos/1176344136

G. L. Scott and H. C. Longuet-Higgins, Feature grouping by 'relocalisation' of eigenvectors of the proximity matrix, Proceedings of the British Machine Vision Conference 1990, pp.103-108, 1990.
DOI : 10.5244/C.4.20

M. Seeger, Bayesian model selection for support vector machines, Gaussian processes and other kernel classifiers, Advances in Neural Information Processing Systems, p.603, 2000.

J. Shi and J. Malik, Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.22, issue.8, pp.888-905, 2000.

A. J. Smola and B. Schölkopf, Sparse greedy matrix approximation for machine learning, Proceedings of the International Conference on Machine Learning, pp.911-918, 2000.

R. F. Sproull, Refinements to nearest-neighbor searching in k-dimensional trees, Algorithmica, vol.6, issue.1-6, pp.579-589, 1991.
DOI : 10.1007/BF01759061

M. Szummer and T. Jaakkola, Partially labeled classification with Markov random walks, Advances in Neural Information Processing Systems, pp.945-952, 2002.

Y. W. Teh and S. T. Roweis, Automatic alignment of local representations, Advances in Neural Information Processing Systems, pp.841-848, 2003.

J. Tenenbaum, V. de Silva, and J. Langford, A Global Geometric Framework for Nonlinear Dimensionality Reduction, Science, vol.290, issue.5500, pp.2319-2323, 2000.
DOI : 10.1126/science.290.5500.2319

R. Tibshirani, Principal curves revisited, Statistics and Computing, vol.2, issue.4, pp.183-190, 1992.
DOI : 10.1007/BF01889678

R. Tibshirani, T. Hastie, B. Narasimhan, and G. Chu, Diagnosis of multiple cancer types by shrunken centroids of gene expression, Proceedings of the National Academy of Sciences of the USA, pp.6567-6572, 2002.
DOI : 10.1073/pnas.082099299

M. E. Tipping and C. M. Bishop, Mixtures of Probabilistic Principal Component Analyzers, Neural Computation, vol.11, issue.2, pp.443-482, 1999.
DOI : 10.1162/089976699300016728

N. Ueda, R. Nakano, Z. Ghahramani, and G. E. Hinton, SMEM Algorithm for Mixture Models, Neural Computation, vol.12, issue.9, pp.2109-2128, 2000.
DOI : 10.1162/089976600300015088

A. Utsugi, Density Estimation by Mixture Models with Smoothing Priors, Neural Computation, vol.10, issue.8, pp.2115-2135, 1998.

V. Vapnik, The nature of statistical learning theory, Statistics for Engineering and Information Science Series, Springer-Verlag, pp.23-47, 1995.

J. Verbeek, An information theoretic approach to finding word groups for text classification, p.48, 2000.
URL : https://hal.archives-ouvertes.fr/inria-00321519

J. J. Verbeek, S. T. Roweis, and N. Vlassis, Non-linear CCA and PCA by alignment of local models, Advances in Neural Information Processing Systems, p.107, 2004.
URL : https://hal.archives-ouvertes.fr/inria-00321485

J. J. Verbeek, N. Vlassis, and B. J. Kröse, A Soft k-Segments Algorithm for Principal Curves, Proceedings of the International Conference on Artificial Neural Networks, pp.450-456, 2001.
DOI : 10.1007/3-540-44668-0_63

URL : https://hal.archives-ouvertes.fr/inria-00321506

J. J. Verbeek, N. Vlassis, and B. J. Kröse, A k-segments algorithm for finding principal curves, Pattern Recognition Letters, vol.23, issue.8, pp.1009-1017, 2002.
DOI : 10.1016/S0167-8655(02)00032-6

URL : https://hal.archives-ouvertes.fr/inria-00321497

J. J. Verbeek, N. Vlassis, and B. J. Kröse, Coordinating Principal Component Analyzers, Proceedings of the International Conference on Artificial Neural Networks, pp.914-919, 2002.
DOI : 10.1007/3-540-46084-5_148

URL : https://hal.archives-ouvertes.fr/inria-00321498

J. J. Verbeek, N. Vlassis, and B. J. Kröse, Non-linear feature extraction by the coordination of mixture models, Proceedings of the Annual Conference of the Advanced School for Computing and Imaging, p.121, 2003.
URL : https://hal.archives-ouvertes.fr/inria-00321490

J. J. Verbeek, N. Vlassis, and B. J. Kröse, Self-Organization by Optimizing Free-Energy, Proceedings of the European Symposium on Artificial Neural Networks, p.85, 2003.
URL : https://hal.archives-ouvertes.fr/inria-00321491

J. J. Verbeek, N. Vlassis, and B. J. Kröse, Efficient Greedy Learning of Gaussian Mixture Models, Neural Computation, vol.15, issue.2, pp.469-485, 2003.
DOI : 10.1162/089976603762553004

URL : https://hal.archives-ouvertes.fr/inria-00321487

J. J. Verbeek, N. Vlassis, and B. J. Kröse, Self-organizing mixture models, Neurocomputing, vol.63, p.85, 2004.
DOI : 10.1016/j.neucom.2004.04.008

URL : https://hal.archives-ouvertes.fr/inria-00321479

J. J. Verbeek, N. Vlassis, and J. R. Nunnink, A variational EM algorithm for large-scale mixture modeling, Proceedings of the Annual Conference of the Advanced School for Computing and Imaging, p.76, 2003.
URL : https://hal.archives-ouvertes.fr/inria-00321486

D. Verma and M. Meilă, A comparison of spectral clustering algorithms, pp.15-17, 2003.

P. J. Verveer and R. P. Duin, An evaluation of intrinsic dimensionality estimators, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.17, issue.1, pp.81-86, 1995.
DOI : 10.1109/34.368147

P. Viola and M. J. Jones, Robust Real-Time Face Detection, International Journal of Computer Vision, vol.57, issue.2, pp.137-154, 2004.
DOI : 10.1023/B:VISI.0000013087.49260.fb

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.9805

N. Vlassis and A. Likas, A greedy EM algorithm for Gaussian mixture learning, Neural Processing Letters, vol.15, issue.1, pp.77-87, 2002.
DOI : 10.1023/A:1013844811137

N. Vlassis, Y. Motomura, and B. J. Kröse, Supervised Dimension Reduction of Intrinsically Low-Dimensional Data, Neural Computation, vol.14, issue.1, pp.191-215, 2002.

M. P. Wand, Fast Computation of Multivariate Kernel Estimators, Journal of Computational and Graphical Statistics, vol.3, issue.4, pp.433-445, 1994.

J. Wang, J. Lee, and C. Zhang, Kernel Trick Embedded Gaussian Mixture Model, Proceedings of the International Conference on Algorithmic Learning Theory, pp.159-174, 2003.
DOI : 10.1007/978-3-540-39624-6_14

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.114.2451

J. H. Ward, Hierarchical Grouping to Optimize an Objective Function, Journal of the American Statistical Association, vol.58, issue.301, pp.236-244, 1963.
DOI : 10.1080/01621459.1963.10500845

A. R. Webb, Multidimensional scaling by iterative majorization using radial basis functions, Pattern Recognition, vol.28, issue.5, pp.753-759, 1995.
DOI : 10.1016/0031-3203(94)00135-9

A. R. Webb, Statistical pattern recognition, pp.10-48, 2002.
DOI : 10.1002/9781119952954

Y. Weiss, Segmentation using eigenvectors: a unifying view, Proceedings of the Seventh IEEE International Conference on Computer Vision, pp.975-982, 1999.
DOI : 10.1109/ICCV.1999.790354

J. Wieghardt, Learning the topology of views: from images to objects, p.119, 2001.

C. F. Wu, Some Algorithmic Aspects of the Theory of Optimal Designs, The Annals of Statistics, vol.6, issue.6, pp.1286-1301, 1978.
DOI : 10.1214/aos/1176344374

G. Young and A. S. Householder, Discussion of a set of points in terms of their mutual distances, Psychometrika, vol.3, issue.1, pp.19-22, 1938.
DOI : 10.1007/BF02287916

H. Zha, X. He, C. H. Ding, M. Gu, and H. D. Simon, Spectral relaxation for k-means clustering, Advances in Neural Information Processing Systems, pp.1057-1064, 2002.

Y. Zhao and G. Karypis, Evaluation of hierarchical clustering algorithms for document datasets, Proceedings of the eleventh international conference on Information and knowledge management, CIKM '02, pp.515-524, 2002.
DOI : 10.1145/584792.584877

X. Zhu, Z. Ghahramani, and J. Lafferty, Semi-supervised learning using Gaussian fields and harmonic functions, Proceedings of the International Conference on Machine Learning, pp.912-919, 2003.