Chapelle et al., Semi-Supervised Learning, 2006.
DOI : 10.7551/mitpress/9780262033589.001.0001

Chopra et al., Learning a Similarity Metric Discriminatively, with Application to Face Verification, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp.539-546, 2005.
DOI : 10.1109/CVPR.2005.202
URL : http://yann.lecun.com/exdb/publis/psgz/chopra-05.ps.gz

Clémençon et al., Ranking and Empirical Minimization of U-statistics, The Annals of Statistics, vol.36, issue.2, pp.844-874, 2008.
DOI : 10.1214/009052607000000910

Collobert et al., Torch7: A Matlab-like Environment for Machine Learning, BigLearn NIPS Workshop, 2011.

V. R. de Sa, Learning classification with unlabeled data, Advances in Neural Information Processing Systems, p.112, 1994.

de Sousa et al., Influence of graph construction on semi-supervised learning, Lecture Notes in Computer Science, vol.8190, issue.3, pp.160-175, 2013.

Dempster et al., Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B, vol.39, issue.1, pp.1-38, 1977.

A. Fischer and C. Igel, An Introduction to Restricted Boltzmann Machines, pp.14-36, 2012.
DOI : 10.1007/978-3-642-33275-3_2

R. A. Fisher, The Statistical Utilization of Multiple Measurements, Annals of Eugenics, vol.8, issue.4, pp.376-386, 1938.
DOI : 10.1111/j.1469-1809.1938.tb02181.x

Frome et al., Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification, 2007 IEEE 11th International Conference on Computer Vision, pp.1-8, 2007.
DOI : 10.1109/ICCV.2007.4408839

K. Fukunaga, Introduction to Statistical Pattern Recognition, 1990.

J. Fürnkranz, Separate-and-conquer rule learning, Artificial Intelligence Review, vol.13, issue.1, pp.3-54, 1999.
DOI : 10.1023/A:1006524209794

Geurts et al., Extremely randomized trees, Machine Learning, vol.63, issue.1, pp.3-42, 2006.
DOI : 10.1007/s10994-006-6226-1
URL : https://hal.archives-ouvertes.fr/hal-00341932

A. Guillory and J. A. Bilmes, Label selection on graphs, Advances in Neural Information Processing Systems 22, pp.691-699, 2009.

Guyon et al., Automatic capacity tuning of very large VC-dimension classifiers, Advances in Neural Information Processing Systems 5, [NIPS Conference], pp.147-155, 1993.

I. Guyon and A. Elisseeff, An introduction to variable and feature selection, Journal of Machine Learning Research, vol.3, pp.1157-1182, 2003.

E. Hoffer and N. Ailon, Deep Metric Learning Using Triplet Network, 2014.
DOI : 10.1145/1553374.1553469

J. Weston, F. Ratle, and R. Collobert, Deep learning via semi-supervised embedding, International Conference on Machine Learning, 2008.
DOI : 10.1007/978-3-642-35289-8_34
URL : http://www.cs.uiuc.edu/homes/hmobahi2/pubs/embedding12.pdf

Kedem et al., Non-linear metric learning, Advances in Neural Information Processing Systems 25, pp.2582-2590, 2012.

S. B. Kotsiantis, Supervised machine learning: A review of classification techniques, Informatica (Slovenia), vol.31, issue.3, pp.249-268, 2007.

B. Kulis, Metric Learning: A Survey, Foundations and Trends in Machine Learning, vol.5, issue.4, pp.287-364, 2012.
DOI : 10.1561/2200000019

Y. Le Cun, Learning Process in an Asymmetric Threshold Network, pp.233-240, 1986.
DOI : 10.1007/978-3-642-82657-3_24

Y. LeCun, Une procedure d'apprentissage pour reseau a seuil asymmetrique (A learning scheme for asymmetric threshold networks), pp.599-604, 1985.

M. Maier, How the result of graph clustering methods depends on the construction of the graph, ESAIM: Probability and Statistics, vol.17, pp.370-418, 2013.
DOI : 10.1214/009053607000000640

McPherson et al., Birds of a Feather: Homophily in Social Networks, Annual Review of Sociology, vol.27, issue.1, pp.415-444, 2001.
DOI : 10.1146/annurev.soc.27.1.415

S. K. Murthy, Automatic construction of decision trees from data: A multi-disciplinary survey, Data Mining and Knowledge Discovery, vol.2, issue.4, pp.345-389, 1998.
DOI : 10.1023/A:1009744630224
URL : https://hal.archives-ouvertes.fr/hal-00442435

Pedregosa et al., Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, vol.12, pp.2825-2830, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00650905

Perozzi et al., DeepWalk: Online Learning of Social Representations, Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '14, 2014.
DOI : 10.1145/2623330.2623732

D. Ramanan and S. Baker, Local Distance Functions: A Taxonomy, New Algorithms, and an Evaluation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.4, pp.794-806, 2011.
DOI : 10.1109/TPAMI.2010.127

Rifai et al., The manifold tangent classifier, Advances in Neural Information Processing Systems 24, 2011.

F. Rosenblatt, The perceptron: A probabilistic model for information storage and organization in the brain, Psychological Review, vol.65, issue.6, pp.386-408, 1958.
DOI : 10.1037/h0042519

F. Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms, Spartan Books, Washington, 1962.

Rumelhart et al., Learning Internal Representations by Error Propagation, in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, pp.318-362, 1986.

Schölkopf et al., Nonlinear Component Analysis as a Kernel Eigenvalue Problem, Neural Computation, vol.10, issue.5, pp.1299-1319, 1998.
DOI : 10.1007/BF02281970

B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, 2001.

K. Q. Weinberger and L. K. Saul, Distance metric learning for large margin nearest neighbor classification, Journal of Machine Learning Research (JMLR), vol.10, pp.207-244, 2009.

L. Yang, Distance metric learning: A comprehensive survey, 2006.

D. Yarowsky, Unsupervised word sense disambiguation rivaling supervised methods, Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, pp.189-196, 1995.
DOI : 10.3115/981658.981684
URL : http://l2r.cs.uiuc.edu/~danr/Teaching/CS598-05/Papers/Yarowsky-ACL95.pdf

G. P. Zhang, Neural networks for classification: a survey, IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), vol.30, issue.4, pp.451-462, 2000.
DOI : 10.1109/5326.897072

Zhou et al., Learning with local and global consistency, Advances in Neural Information Processing Systems 16, pp.321-328, 2004.

X. Zhu, Semi-supervised learning literature survey, 2005.

X. Zhu and Z. Ghahramani, Learning from labeled and unlabeled data with label propagation, 2002.

Zhu et al., Semi-supervised learning using Gaussian fields and harmonic functions, Machine Learning, Proceedings of the Twentieth International Conference, pp.912-919, 2003.

Kedem et al. propose approaches that learn a non-linear distance by learning a new representation of the data. The non-linearity can be introduced either in the fixed distance or in the learned mapping function. These metrics can be learned either over the entire domain from which the data are drawn or on local patches.

Although regarded as metric learning methods, the non-linear algorithms introduced above can be seen as algorithms that project the data into a new space in which a fixed distance satisfies a set of constraints. These algorithms can therefore be related to representation learning, which seeks to project the data into a representation space in which a distance reflects the task to be solved. Representation learning algorithms can be grouped into categories according to the new set of features they produce. The first group, feature selection algorithms ([Guyon and Elisseeff, 2003]), choose a subset of the initial features according to their individual or combined predictive power; tree-based algorithms ([Geurts et al., 2006]) are one example. Because the initially available features may be poorly suited to the target task, the other representation learning algorithms build a set of features by transforming the initial ones while reducing the number of features that describe the data. These approaches belong to dimensionality reduction, which includes Principal Component Analysis. A last group of representation learning algorithms produces a new set of features with no constraint on its dimension. With these methods, the data are projected into a new representation space by applying transformations to the initial set of features.

Bibliography

Belkin et al., Regularization and semi-supervised learning on large graphs, in COLT, pp.624-638, 2004.

Bellet et al., A Survey on Metric Learning for Feature Vectors and Structured Data, 2013.

Bengio et al., Representation Learning: A Review and New Perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, pp.1798-1828, 2013.
DOI : 10.1109/TPAMI.2013.50

A. Blum and T. Mitchell, Combining labeled and unlabeled data with co-training, Proceedings of the eleventh annual conference on Computational learning theory, COLT '98, pp.92-100, 1998.
DOI : 10.1145/279943.279962
URL : http://l2r.cs.uiuc.edu/~danr/Teaching/CS598-05/Papers/cotraining.pdf

Bromley et al., Signature verification using a "siamese" time delay neural network, NIPS Proc, 1994.

C. J. Burges, A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery, vol.2, issue.2, pp.121-167, 1998.
DOI : 10.1023/A:1009715923555