2. [...] the positions of the subparts are clustered into P groups by K-means. A parent is initialized for each cluster center. The cluster assignments give the initial assignments of subparts to parts. In fact, the assignment matrix [...]

[...] linear, linear Relevance Vector Machines, the LASSO method

A. Charnes, W. W. Cooper, and E. Rhodes, Measuring the efficiency of decision making units, European Journal of Operational Research, vol.2, issue.6, pp.429-444, 1978.
DOI : 10.1016/0377-2217(78)90138-8

A. Kneip, B. U. Park, and L. Simar, A note on the convergence of nonparametric DEA estimators for production efficiency scores, Econometric Theory, vol.14, issue.06, pp.783-793, 1998.
DOI : 10.1017/S0266466698146042

H. Abbar, Un estimateur spline du contour d'une répartition ponctuelle aléatoire, pp.1-19, 1990.

A. Agarwal and B. Triggs, 3D human pose from silhouettes by relevance vector regression, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004., pp.882-888, 2004.
DOI : 10.1109/CVPR.2004.1315258

URL : https://hal.archives-ouvertes.fr/inria-00548551

A. Agarwal and B. Triggs, Learning methods for recovering 3D human pose from monocular images, Research Report, vol.5333, 2004.

S. Agarwal and D. Roth, Learning a Sparse Representation for Object Detection, Proceedings of the 7th European Conference on Computer Vision, pp.113-128, 2002.
DOI : 10.1007/3-540-47979-1_8

A. Agresti, An Introduction to Categorical Data Analysis, 1996.
DOI : 10.1002/0470114754

H. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control, vol.19, issue.6, pp.716-723, 1974.
DOI : 10.1109/TAC.1974.1100705

Y. Altun and T. Hofmann, Large margin methods for label sequence learning, 8th European Conference on Speech Communication and Technology, 2003.

Y. Altun, A. Smola, and T. Hofmann, Exponential families for conditional random fields, 20th Conference on Uncertainty in Artificial Intelligence, 2004.

C. Ambroise and G. Govaert, EM algorithm for partially known labels, Data Analysis, Classification, and Related Methods, Proceedings of the 7th Conference of the International Federation of Classification Societies (IFCS-2000), pp.161-166, 2000.

C. Andrieu, N. de Freitas, A. Doucet, and M. I. Jordan, An introduction to MCMC for machine learning, Machine Learning, vol.50, issue.1/2, pp.5-43, 2003.
DOI : 10.1023/A:1020281327116

A. Barron, L. Birgé, and P. Massart, Risk bounds for model selection via penalization, Probability Theory and Related Fields, pp.301-413, 1999.

P. Baufays and J. Rasson, A new geometric discriminant rule, Computational Statistics Quarterly, vol.2, pp.15-30, 1985.

H. Bensmail and G. Celeux, Regularized Gaussian Discriminant Analysis through Eigenvalue Decomposition, Journal of the American Statistical Association, vol.91, issue.436, pp.1743-1791, 1996.
DOI : 10.1002/0471725293

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.143.1873

J. M. Bernardo and A. F. Smith, Bayesian Theory, 1994.
DOI : 10.1002/9780470316870

C. Biernacki, G. Celeux, and G. Govaert, Assessing a mixture model for clustering with the integrated completed likelihood, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.22, issue.7, pp.719-725, 2000.
DOI : 10.1109/34.865189

L. Birgé and P. Massart, Minimum contrast estimators on sieves, 1995.

D. Bosq, Linear processes in function spaces. theory and applications, Lecture Notes in Statistics, vol.149, 2000.

L. Bottou, Une Approche théorique de l'Apprentissage Connexionniste : Applications à la Reconnaissance de la Parole, 1991.

G. Bouchard and G. Celeux, Supervised classification with spherical Gaussian mixtures, Proceedings of CLADAG 2003, pp.75-78, 2003.
URL : https://hal.archives-ouvertes.fr/inria-00548239

G. Bouchard and B. Triggs, Hierarchical Part-Based Visual Object Categorization, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005.
DOI : 10.1109/CVPR.2005.174

URL : https://hal.archives-ouvertes.fr/inria-00548513

J. G. Bryan, The generalized discriminant function : mathematical foundations and computational routine, Harvard Educational Review, vol.21, pp.90-95, 1951.

O. Chapelle, V. Vapnik, O. Bousquet, and S. Mukherjee, Choosing multiple parameters for support vector machines, Machine Learning, pp.131-159, 2002.

P. Cheeseman and J. Stutz, Bayesian classification (AUTOCLASS) : Theory and results, Advances in Knowledge Discovery and Data Mining, pp.153-180, 1996.

M. Collins, Discriminative Reranking for Natural Language Parsing, Proc. 17th International Conf. on Machine Learning, pp.175-182, 2000.
DOI : 10.1145/1968.1972

M. Collins, Discriminative training methods for hidden Markov models, Proceedings of the ACL-02 conference on Empirical methods in natural language processing, EMNLP '02, 2002.
DOI : 10.3115/1118693.1118694

T. Cover and J. Thomas, Elements of Information Theory. Series in Telecommunications, 1991.

A. Cowling and P. Hall, On pseudodata methods for removing boundary effects in kernel density estimation, Journal of the Royal Statistical Society B, pp.551-563, 1996.

N. Cristianini and J. Shawe-Taylor, An introduction to support vector machines, 2000.

G. Csurka, C. Bray, C. Dance, and L. Fan, Visual categorization with bags of keypoints, Proceedings of the 8th European Conference on Computer Vision, pp.59-74, 2004.

G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray, Visual categorization with bags of keypoints, ECCV'04 workshop on Statistical Learning in Computer Vision, pp.59-74, 2004.

D. Deprins, L. Simar, and H. Tulkens, Measuring Labor-Efficiency in Post Offices, in M. Marchand, P. Pestieau, and H. Tulkens (eds.), The Performance of Public Enterprises : Concepts and Measurements, 1984.
DOI : 10.1007/978-0-387-25534-7_16

D. M. Titterington, A. F. M. Smith, and U. E. Makov, Statistical analysis of finite mixture distributions

A. Dempster, N. Laird, and D. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, B, vol.39, pp.1-38, 1977.

P. Domingos and M. J. Pazzani, Beyond independence : Conditions for the optimality of the simple bayesian classifier, International Conference on Machine Learning, pp.105-112, 1996.

G. Dorko and C. Schmid, Selection of scale-invariant parts for object class recognition, Proceedings Ninth IEEE International Conference on Computer Vision, pp.634-640, 2003.
DOI : 10.1109/ICCV.2003.1238407

URL : https://hal.archives-ouvertes.fr/inria-00548234

G. Dorko and C. Schmid, Object class recognition using discriminative local features, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004.
URL : https://hal.archives-ouvertes.fr/inria-00070510

A. Doucet, N. de Freitas, and N. Gordon, Sequential Monte Carlo methods in practice, 2001.
DOI : 10.1007/978-1-4757-3437-9

R. Duda and P. Hart, Pattern Classification and Scene Analysis, 1973.

B. Efron, The Efficiency of Logistic Regression Compared to Normal Discriminant Analysis, Journal of the American Statistical Association, vol.70, issue.352, pp.892-898, 1975.
DOI : 10.1080/01621459.1975.10480319

L. Fei-Fei, R. Fergus, and P. Perona, A Bayesian approach to unsupervised one-shot learning of object categories, Proceedings of the 9th International Conference on Computer Vision, pp.1134-1141, 2003.

R. Fergus, P. Perona, and A. Zisserman, Object class recognition by unsupervised scale-invariant learning, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings., pp.264-271, 2003.
DOI : 10.1109/CVPR.2003.1211479

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.114.7863

R. Fisher, The use of multiple measurements in taxonomic problems, Annals of Eugenics, vol.7, issue.2, pp.179-188, 1936.
DOI : 10.1111/j.1469-1809.1936.tb02137.x

C. Fraley and A. E. Raftery, Model-Based Clustering, Discriminant Analysis, and Density Estimation, Journal of the American Statistical Association, vol.97, issue.458, pp.611-631, 2002.
DOI : 10.1198/016214502760047131

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.27.5734

J. Friedman, Regularized Discriminant Analysis, Journal of the American Statistical Association, vol.84, issue.405, pp.165-175, 1989.
DOI : 10.1080/01621459.1989.10478752

N. Friedman, D. Geiger, and M. Goldszmidt, Bayesian network classifiers, Machine Learning, vol.29, issue.2/3, pp.131-163, 1997.
DOI : 10.1023/A:1007465528199

J. Fritsch, Modular neural networks for speech recognition, 1996.

J. Fritsch, M. Finke, and A. Waibel, Adaptively growing hierarchical mixtures of experts, Advances in Neural Information Processing Systems 9, 1997.

G. Bouchard, S. Girard, A. Iouditski, and A. Nazin, Linear programming problems for frontier estimation, 2003.
URL : https://hal.archives-ouvertes.fr/inria-00071869

W. R. Gilks, S. Richardson, and D. J. Spiegelhalter, Markov Chain Monte Carlo in Practice, 1996.

S. Girard and P. Jacob, Extreme values and kernel estimates of point processes boundaries, ESAIM: Probability and Statistics, vol.8, pp.150-168, 2004.
DOI : 10.1051/ps:2004008

S. Girard and P. Jacob, Extreme values and Haar series estimates of point processes boundaries, Scandinavian Journal of Statistics, vol.30, pp.369-384, 2003.

S. Girard and P. Jacob, Projection estimates of point processes boundaries, Journal of Statistical Planning and Inference, vol.116, issue.1, pp.1-15, 2003.
DOI : 10.1016/S0378-3758(02)00182-9

S. Girard and L. Menneteau, Central limit theorems for smoothed extreme value estimates of point processes boundaries, Journal of Statistical Planning and Inference, 2003.
DOI : 10.1016/j.jspi.2003.08.005

URL : https://hal.archives-ouvertes.fr/hal-00383141

T. Gonçalves and P. Quaresma, Using IR Techniques to Improve Automated Text Classification, Proc. of the 9th International Conference on Applications of Natural Language to Information Systems, pp.374-379, 2004.
DOI : 10.1007/978-3-540-27779-8_34

G. L. Goodman and D. W. McMichael, Objective functions for maximum likelihood classifier design, 1999 Information, Decision and Control. Data and Information Fusion Symposium, Signal Processing and Communications Symposium and Decision and Control Symposium. Proceedings
DOI : 10.1109/IDC.1999.754220

C. Goutte, E. Gaussier, N. Cancedda, and H. Déjean, Generative vs discriminative approaches to entity recognition from label deficient data, Proc. of the 7èmes Journées internationales Analyse statistique des Données Textuelles, 2004.

P. Green and B. Silverman, Nonparametric Regression and Generalized Linear Models. Monographs on Statistics and Probability, 1994.

R. Greiner and W. Zhou, Structural Extension to Logistic Regression: Discriminative Parameter Learning of Belief Net Classifiers, Proc. of the Eighteenth Annual National Conference on Artificial Intelligence, pp.167-173, 2002.
DOI : 10.1007/s10994-005-0469-0

A. Grigor'yan and M. Noguchi, The Heat Kernel on Hyperbolic Space, Bulletin of the London Mathematical Society, vol.30, issue.6, pp.643-650, 1998.
DOI : 10.1112/S0024609398004780

P. Hall, M. Nussbaum, and S. E. Stern, On the Estimation of a Support Curve of Indeterminate Sharpness, Journal of Multivariate Analysis, vol.62, issue.2, pp.204-232, 1997.
DOI : 10.1006/jmva.1997.1681

A. Hardy and J. Rasson, Une nouvelle approche des problèmes de classification automatique, pp.41-56, 1982.

J. Hartigan, Classification and Clustering, Journal of Marketing Research, vol.18, issue.4, 1975.
DOI : 10.2307/3151350

T. Hastie and R. Tibshirani, Generalized Additive Models, Statistical Science, vol.1, issue.3, pp.297-318, 1986.
DOI : 10.1214/ss/1177013604

T. Hastie and R. Tibshirani, Discriminant analysis by Gaussian mixtures, Journal of the Royal Statistical Society series B, vol.58, pp.158-176, 1996.

T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2001.

C. Hennig, Models and Methods for Clusterwise Linear Regression, Classification in the Information Age, pp.179-187, 1999.
DOI : 10.1007/978-3-642-60187-3_17

C. Hennig, Identifiability of Models for Clusterwise Linear Regression, Journal of Classification, vol.17, issue.2, pp.273-296, 2000.
DOI : 10.1007/s003570000022

J. Hiriart-Urruty and C. Lemaréchal, Convex analysis and minimization algorithms. Part 1 : Fundamentals, Grundlehren der Mathematischen Wissenschaften, vol.305, 1993.

J. A. Hoeting, D. D. Madigan, A. E. Raftery, and C. T. Volinsky, Bayesian model averaging : A tutorial (with discussion), Statistical Science, vol.14, pp.382-417, 1999.

P. J. Huber, The behavior of maximum likelihood estimates under nonstandard conditions, Proc. Fifth Berkeley Symp, pp.221-233, 1967.

M. A. Hurn, A. Justel, and C. P. Robert, Estimating Mixtures of Regressions, Journal of Computational and Graphical Statistics, vol.12, issue.1, 2000.
DOI : 10.1198/1061860031329

I. Gijbels, E. Mammen, B. U. Park, and L. Simar, On Estimation of Monotone and Concave Frontier Functions, Journal of the American Statistical Association, vol.94, issue.445, pp.220-228, 1999.
DOI : 10.1287/mnsc.44.1.49

J. Geffroy, Sur un problème d'estimation géométrique. Publications de l'Institut de Statistique de l, pp.191-200, 1964.

T. Jaakkola, M. Diekhans, and D. Haussler, A Discriminative Framework for Detecting Remote Protein Homologies, Journal of Computational Biology, vol.7, issue.1-2, pp.95-114, 2000.
DOI : 10.1089/10665270050081405

T. S. Jaakkola and D. Haussler, Exploiting generative models in discriminative classifiers, Proc. of Tenth Conference on Advances in Neural Information Processing Systems, 1999.

P. Jacob and P. Suquet, Estimating the edge of a Poisson process by orthogonal series, Journal of Statistical Planning and Inference, vol.46, issue.2, pp.215-234, 1995.
DOI : 10.1016/0378-3758(94)00103-3

R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton, Adaptive Mixtures of Local Experts, Neural Computation, vol.3, issue.1, pp.79-87, 1991.
DOI : 10.1162/neco.1989.1.2.281

G. Jarrad and D. McMichael, Shared mixture distributions and shared mixture classifiers, Proc. of the Information, Decision and Control Conference, pp.335-340, 1999.
DOI : 10.1109/idc.1999.754179

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.2220

T. Jebara, R. Kondor, and A. Howard, Probability product kernels, Journal of Machine Learning Research JMLR, pp.819-844, 2004.

T. Jebara and A. Pentland, The generalized cem algorithm, 1999.

W. Jiang and M. Tanner, Hierarchical mixtures-of-experts for exponential family regression models, approximation and maximum likelihood estimation, Ann. Statistics, vol.27, pp.987-1011, 1999.

T. Joachims, Text categorization with Support Vector Machines: Learning with many relevant features, Proceedings of ECML-98, 10th European Conference on Machine Learning, pp.137-142, 1998.
DOI : 10.1007/BFb0026683

M. I. Jordan and R. A. Jacobs, Hierarchical Mixtures of Experts and the EM Algorithm, Neural Computation, vol.6, issue.2, pp.181-214, 1994.
DOI : 10.1214/aos/1176346060

R. Kass and A. Raftery, Bayes Factors, Journal of the American Statistical Association, vol.90, issue.430, pp.773-795, 1995.
DOI : 10.1080/01621459.1995.10476572

K. Murphy, A. Torralba, and W. T. Freeman, Using the forest to see the trees : A graphical model relating features, objects, and scenes, Neural Info. Processing Systems, 2003.

N. M. Kiefer, Discrete Parameter Variation: Efficient Estimation of a Switching Regression Model, Econometrica, vol.46, issue.2, pp.427-434, 1978.
DOI : 10.2307/1913910

J. Kim, K. K. Kim, and C. Y. Suen, An HMM-MLP Hybrid Model for Cursive Script Recognition, Pattern Analysis & Applications, vol.3, issue.4, pp.314-324, 2000.
DOI : 10.1007/s100440070003

T. Kohonen, Learning vector quantization, Neural Networks, vol.1, issue.1, p.303, 1988.
DOI : 10.1007/978-3-642-56927-2_6

P. Kontkanen, P. Myllymäki, and H. Tirri, Classifier learning with supervised marginal likelihood, Proceedings of the 17th International Conference on Uncertainty in Artificial Intelligence, pp.277-284, 2001.

A. Korostelev and A. Tsybakov, Minimax theory of image reconstruction, Lecture Notes in Statistics, vol.82, 1993.
DOI : 10.1007/978-1-4612-2712-0

A. P. Korostelev, L. Simar, and A. B. Tsybakov, Efficient estimation of monotone boundaries, The Annals of Statistics, pp.476-489, 1995.

S. Kullback, Information Theory and Statistics, 1959.

L. Devroye, L. Györfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition, 1997.
DOI : 10.1007/978-1-4612-0711-5

L. Gardes, Estimating the support of a Poisson process via the Faber-Schauder basis and extreme values. Publications de l'Institut de Statistique de l, pp.43-72, 2002.

L. Tarassenko, P. Hayton, N. Cerneaz, and M. Brady, Novelty detection for the identification of masses in mammograms, 4th International Conference on Artificial Neural Networks, pp.442-447, 1995.
DOI : 10.1049/cp:19950597

S. Lacoste-Julien, An introduction to max-margin Markov networks, 2003.

J. Lafferty and G. Lebanon, Information diffusion kernels, Advances in Neural Information Processing, 2003.

J. Lafferty, A. McCallum, and F. Pereira, Conditional random fields : Probabilistic models for segmenting and labeling sequence data, Proceedings of the ICML, pp.282-289, 2001.

P. Langley and S. Sage, Tractable average-case analysis of naive bayesian classifiers, Sixteenth International Conference on Machine Learning, pp.220-228, 1999.

M. W. Layard, Large Sample Tests for the Equality of Two Covariance Matrices, The Annals of Mathematical Statistics, vol.43, issue.1, pp.123-141, 1972.
DOI : 10.1214/aoms/1177692708

Q. Le and S. Bengio, Hybrid generative-discriminative models for speech and speaker recognition, 2002.

B. Leibe, A. Leonardis, and B. Schiele, An Implicit Shape Model for Combined Object Categorization and Segmentation, ECCV'04 workshop on Statistical Learning in Computer Vision, pp.17-32, 2004.
DOI : 10.1007/11957959_26

D. G. Lowe, Local feature view clustering for 3D object recognition, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001, pp.682-688, 2001.
DOI : 10.1109/CVPR.2001.990541

D. G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, vol.60, issue.2, pp.91-110, 2004.
DOI : 10.1023/B:VISI.0000029664.99615.94

L. P. Devroye and G. L. Wise, Detection of abnormal behavior via non parametric estimation of the support, SIAM Journal of Applied Mathematics, vol.38, pp.448-480, 1980.

P. McCullagh and J. Nelder, Generalized Linear Models, Number 37 in Monographs on Statistics and Applied Probability, 1983.

G. J. McLachlan, Discriminant Analysis and Statistical Pattern Recognition, 1992.
DOI : 10.1002/0471725293

G. J. McLachlan and D. Peel, Finite Mixture Models, 2000.
DOI : 10.1002/0471721182

R. E. Melchers, Integration and Simulation Methods, chapter Second-Moment and Transformation Methods, pp.64-93, 2001.

K. Mikolajczyk and C. Schmid, A performance evaluation of local descriptors, Proceedings of the Conference on Computer Vision and Pattern Recognition, 2003.
URL : https://hal.archives-ouvertes.fr/inria-00548227

D. Mladenic, J. Brank, M. Grobelnik, and N. Milic-Frayling, Feature selection using linear classifier weights, Proceedings of the 27th annual international conference on Research and development in information retrieval, SIGIR '04, pp.234-241, 2004.
DOI : 10.1145/1008992.1009034

P. Moerland, A comparison of mixture models for density estimation, 9th International Conference on Artificial Neural Networks: ICANN '99, pp.25-30, 1999.
DOI : 10.1049/cp:19991079

P. J. Moreno, P. P. Ho, and N. Vasconcelos, A Kullback-Leibler divergence based kernel for SVM classification in multimedia applications, Advances in Neural Information Processing Systems 16, 2004.

N. Murata, S. Yoshizawa, and S. Amari, Network information criterion-determining the number of hidden units for an artificial neural network model, IEEE Transactions on Neural Networks, vol.5, issue.6, pp.865-872, 1994.
DOI : 10.1109/72.329683

K. P. Murphy and M. A. Paskin, Linear-time inference in hierarchical hmms, NIPS, pp.833-840, 2001.

R. Neal, Assessing relevance determination methods using delve, Neural Networks and Machine Learning, 1998.

A. Y. Ng and M. I. Jordan, On discriminative vs. generative classifiers : A comparison of logistic regression and naive bayes, Advances in Neural Information Processing Systems 14, pp.609-616, 2002.

T. J. O'Neill, The General Distribution of the Error Rate of a Classification Procedure with Application to Logistic Regression Discrimination, Journal of the American Statistical Association, vol.75, issue.369, pp.154-160, 1980.
DOI : 10.1080/01621459.1980.10477446

A. Opelt, M. Fussenegger, A. Pinz, and P. Auer, Weak Hypotheses and Boosting for Generic Object Detection and Recognition, Proceedings of the 8th European Conference on Computer Vision, pp.71-84, 2004.
DOI : 10.1007/978-3-540-24671-8_6

B. Park, L. Simar, and C. Wiener, THE FDH ESTIMATOR FOR PRODUCTIVITY EFFICIENCY SCORES Asymptotic Properties, Econometric Theory, vol.16, issue.6, pp.855-877, 2000.
DOI : 10.1017/S0266466600166034

Y. A. Qi, T. P. Minka, R. W. Picard, and Z. Ghahramani, Predictive automatic relevance determination by expectation propagation, Twenty-first international conference on Machine learning, ICML '04, 2004.
DOI : 10.1145/1015330.1015418

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.644.4929

R. E. Quandt, A New Approach to Estimating Switching Regressions, Journal of the American Statistical Association, vol.67, issue.338, pp.306-310, 1972.
DOI : 10.1080/01621459.1960.10482067

R. E. Quandt and J. B. Ramsey, Estimating Mixtures of Normal Distributions and Switching Regressions, Journal of the American Statistical Association, vol.73, issue.364, pp.730-752, 1978.
DOI : 10.1080/01621459.1978.10480085

A. E. Raftery, Bayesian model selection in social research (with discussion), Sociological Methodology, pp.111-196, 1995.

R. Raina, Y. Shen, A. Y. Ng, and A. McCallum, Classification with hybrid generative/discriminative models

C. Rao, Linear Statistical Inference and its applications, 2001.
DOI : 10.1002/9780470316436

B. D. Ripley, Pattern Recognition and Neural Networks, 1996.
DOI : 10.1017/CBO9780511812651

C. Robert, Intrinsic losses, Theory and Decision, vol.27, issue.1, pp.191-214, 1996.
DOI : 10.1007/BF00133173

C. P. Robert, Simulation of truncated normal variables, Statistics and Computing, vol.5, issue.2, pp.121-125, 1995.
DOI : 10.1007/BF00143942

URL : https://hal.archives-ouvertes.fr/hal-00431310

C. P. Robert, The Bayesian Choice : from Decision-Theoretic Motivations to Computational Implementation, 2001.
DOI : 10.1007/978-1-4757-4314-2

K. Roeder and L. Wasserman, Practical Bayesian Density Estimation Using Mixtures of Normals, Journal of the American Statistical Association, vol.92, issue.439, pp.894-902, 1997.
DOI : 10.1080/01621459.1997.10474044

Y. D. Rubinstein and T. Hastie, Discriminative vs. informative learning, Proc. of the Third International Conference on Knowledge and Data Mining, pp.49-53, 1997.

M. Saerens, Building cost functions minimizing to some summary statistics, IEEE Transactions on Neural Networks, vol.11, issue.6, pp.1263-1271, 2000.
DOI : 10.1109/72.883416

M. Sato and S. Ishii, On-line EM Algorithm for the Normalized Gaussian Network, Neural Computation, vol.12, issue.2, pp.407-432, 2000.
DOI : 10.1162/089976698300016963

B. Schölkopf and A. Smola, Learning with Kernels

G. Schwarz, Estimating the dimension of a model, pp.461-464, 1978.

F. Sha and F. Pereira, Shallow parsing with conditional random fields, Proceedings of HLT-NAACL

L. Sigal, S. Bhatia, S. Roth, M. J. Black, and M. Isard, Tracking loose-limbed people, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004., pp.421-428, 2004.
DOI : 10.1109/CVPR.2004.1315063

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.59.2040

T. Jaakkola, M. Meila, and T. Jebara, Maximum entropy discrimination, Advances in Neural Information Processing Systems 11, 1999.

B. Taskar, C. Guestrin, and D. Koller, Max-margin markov networks, Advances in Neural Information Processing Systems 16, 2004.

M. Tipping, The relevance vector machine, Advances in Neural Information Processing Systems, 2000.

M. Tipping and C. Bishop, Probabilistic Principal Component Analysis, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.61, issue.3, 1999.
DOI : 10.1111/1467-9868.00196

M. Titsias and A. Likas, Mixture of Experts Classification Using a Hierarchical Mixture Model, Neural Computation, vol.14, issue.9, pp.2221-2244, 2002.
DOI : 10.1214/aos/1176346060

K. Tsuda, M. Kawanabe, G. Rätsch, S. Sonnenburg, and K. Müller, A New Discriminative Kernel from Probabilistic Models, Neural Computation, vol.14, issue.10, pp.2397-2414, 2002.
DOI : 10.1023/A:1007618119488

S. Ullman, E. Sali, and M. Vidal-Naquet, A Fragment-Based Approach to Object Representation and Classification, 4th International Workshop on Visual Form, 2001.
DOI : 10.1007/3-540-45129-3_7

V. N. Vapnik, Statistical Learning Theory, 1998.

W. Härdle, B. U. Park, and A. B. Tsybakov, Estimation of Non-sharp Support Boundaries, Journal of Multivariate Analysis, vol.55, issue.2, pp.205-218, 1995.
DOI : 10.1006/jmva.1995.1075

P. Hall, W. Härdle, and L. Simar, Iterated bootstrap with application to frontier models, Journal of Productivity Analysis, vol.6, pp.63-76, 1995.

S. Waterhouse, Classification and regression using mixtures of experts, 1997.

M. Weber, M. Welling, and P. Perona, Unsupervised Learning of Models for Recognition, Proceedings of the 6th European Conference on Computer Vision, pp.18-32, 2000.
DOI : 10.1007/3-540-45054-8_2

H. Wettig, P. Grünwald, T. Roos, P. Myllymäki, and H. Tirri, When discriminative learning of bayesian network parameters is easy, Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp.491-498, 2003.

L. Xu, G. Hinton, and M. I. Jordan, An alternative model for mixtures of experts, Advances in Neural Information Processing Systems, pp.633-640, 1995.

L. Xu and M. I. Jordan, On Convergence Properties of the EM Algorithm for Gaussian Mixtures, Neural Computation, vol.8, issue.1, pp.129-151, 1996.
DOI : 10.1162/neco.1994.6.2.334

Y. Yang, An evaluation of statistical approaches to text categorization, Information Retrieval, vol.1, issue.1/2, pp.69-90, 1999.
DOI : 10.1023/A:1009982220290

J. Zhu and T. Hastie, Kernel Logistic Regression and the Import Vector Machine, NIPS, pp.1081-1088, 2001.
DOI : 10.1198/106186005X25619