F. Abramovich and Y. Benjamini, Adaptive thresholding of wavelet coefficients, Computational Statistics & Data Analysis, vol.22, issue.4, pp.351-361, 1996.
DOI : 10.1016/0167-9473(96)00003-5

F. Abramovich, Y. Benjamini, D. L. Donoho, and I. M. Johnstone, Adapting to unknown sparsity by controlling the false discovery rate, The Annals of Statistics, vol.34, issue.2, pp.584-653, 2006.
DOI : 10.1214/009053606000000074

F. Abramovich, T. Sapatinas, and B. W. Silverman, Wavelet thresholding via a Bayesian approach, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.60, issue.4, pp.725-749, 1998.
DOI : 10.1111/1467-9868.00151

A. Antoniadis and J. Bigot, Wavelet Estimators in Nonparametric Regression: A Comparative Simulation Study, Journal of Statistical Software, vol.6, issue.6, pp.1-83, 2001.
DOI : 10.18637/jss.v006.i06
URL : https://hal.archives-ouvertes.fr/hal-00823485

A. Antos, L. Devroye, and L. Györfi, Lower bounds for Bayes error estimation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.21, issue.7, pp.643-645, 1999.
DOI : 10.1109/34.777375

P. Assouad, Deux remarques sur l'estimation, C. R. Acad. Sci. Paris Sér. I Math, vol.296, issue.23, pp.1021-1024, 1983.

J. Audibert, Aggregated estimators and empirical complexity for least square regression, Probability and Statistics, pp.685-736, 2004.

J. Audibert, Classification under polynomial entropy and margin assumptions and randomized estimators, Laboratoire de Probabilités et Modèles Aléatoires, 2004.

J. Audibert and A. B. Tsybakov, Fast learning rates for plug-in classifiers under margin condition, Ann. Statist, vol.35, issue.2, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00005882

N. H. Augustin, S. T. Buckland, and K. P. Burnham, Model selection: An integral part of inference, Biometrics, vol.53, pp.603-618, 1997.

A. R. Barron, L. Birgé, and P. Massart, Risk bounds for model selection via penalization, Probability Theory and Related Fields, vol.113, issue.3, pp.301-413, 1999.
DOI : 10.1007/s004400050210

A. R. Barron and T. M. Cover, Minimum complexity density estimation, IEEE Transactions on Information Theory, vol.37, issue.4, pp.1034-1054, 1991.
DOI : 10.1109/18.86996

P. L. Bartlett, S. Boucheron, and G. Lugosi, Model Selection and Error Estimation, Machine Learning, pp.85-113, 2002.
DOI : 10.2139/ssrn.248567

P. L. Bartlett, Y. Freund, W. S. Lee, and R. E. Schapire, Boosting the margin: a new explanation for the effectiveness of voting methods, Ann. Statist, vol.26, pp.1651-1686, 1998.

P. L. Bartlett, M. I. Jordan, and J. D. McAuliffe, Convexity, Classification, and Risk Bounds, Journal of the American Statistical Association, vol.101, issue.473, pp.138-156, 2006.
DOI : 10.1198/016214505000000907

P. Bickel and K. Doksum, Mathematical Statistics: Basic Ideas and Selected Topics, 2001.

L. Birgé, Model selection via testing: an alternative to (penalized) maximum likelihood estimators, Annales de l'Institut Henri Poincare (B) Probability and Statistics, vol.42, issue.3, 2005.
DOI : 10.1016/j.anihpb.2005.04.004

L. Birgé and P. Massart, Gaussian model selection, Journal of the European Mathematical Society, vol.3, issue.3, pp.203-268, 2001.
DOI : 10.1007/s100970100031

G. Blanchard, O. Bousquet, and P. Massart, Statistical performance of support vector machines, The Annals of Statistics, vol.36, issue.2, 2004.
DOI : 10.1214/009053607000000839

G. Blanchard, G. Lugosi, and N. Vayatis, On the rate of convergence of regularized boosting classifiers, JMLR, vol.4, pp.861-894, 2003.

G. Blanchard, C. Schäfer, Y. Rozenholc, and K. Müller, Optimal dyadic decision trees, Machine Learning, 2006.
DOI : 10.1007/s10994-007-0717-6
URL : https://hal.archives-ouvertes.fr/hal-00264988

S. Boucheron, O. Bousquet, and G. Lugosi, Theory of Classification: a Survey of Some Recent Advances, ESAIM: Probability and Statistics, vol.9, pp.323-375, 2005.
DOI : 10.1051/ps:2005018
URL : https://hal.archives-ouvertes.fr/hal-00017923

S. Boucheron, G. Lugosi, and P. Massart, A sharp concentration inequality with applications, Random Structures and Algorithms, pp.277-292, 2000.

L. Breiman, J. Friedman, R. Olshen, and C. Stone, Classification and regression trees, 1984.

P. Bühlmann and B. Yu, Analyzing bagging, The Annals of Statistics, vol.30, issue.4, pp.927-961, 2002.
DOI : 10.1214/aos/1031689014

F. Bunea and A. Nobel, Online prediction algorithms for aggregation of arbitrary estimators of a conditional mean, 2005.

F. Bunea, A. B. Tsybakov, and M. H. Wegkamp, Aggregation for Gaussian regression, to appear in Ann. Statist, 2005.

F. Bunea, A. B. Tsybakov, and M. H. Wegkamp, Aggregation and sparsity via l1 penalized least squares, pp.379-391, 2006.
DOI : 10.1007/11776420_29
URL : https://hal.archives-ouvertes.fr/hal-00084553

T. Cai, On adaptivity of Blockshrink wavelet estimator over Besov spaces, pp.97-102, 1997.

T. Cai, Adaptive wavelet estimation: a block thresholding and oracle inequality approach, The Annals of Statistics, vol.27, issue.3, pp.898-924, 1999.
DOI : 10.1214/aos/1018031262

T. Cai and E. Chicken, Block thresholding for density estimation: local and global adaptivity, Journal of Multivariate Analysis, vol.95, pp.76-106, 2005.

T. Cai and B. W. Silverman, Incorporating information on neighboring coefficients into wavelet estimation, Sankhya, issue.63, pp.127-148, 2001.

O. Catoni, A mixture approach to universal model selection, preprint LMENS-97-30, 1997.

O. Catoni, "Universal" aggregation rules with exact bias bounds, 1999.

O. Catoni, Statistical Learning Theory and Stochastic Optimization, Ecole d'été de Probabilités de Saint-Flour, Lecture Notes in Mathematics, 2001.

L. Cavalier and A. Tsybakov, Penalized blockwise Stein's method, monotone oracles and sharp adaptive estimation, Math. Meth. Statist, vol.10, issue.3, pp.247-282, 2001.

C. Chesneau and G. Lecué, Adapting to unknown smoothness by aggregation of thresholded wavelet estimators, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00121088

E. Chicken, Nonparametric regression on random processes and design.

H. A. Chipman, E. Kolaczyk, and R. McCulloch, Adaptive Bayesian wavelet shrinkage, J. Am. Statist. Ass, vol.92, pp.1413-1421, 1997.

C. Cortes and V. Vapnik, Support-vector networks, Machine Learning, pp.273-297, 1995.
DOI : 10.1007/BF00994018

T. M. Cover and J. A. Thomas, Elements of Information Theory, 1991.

I. Daubechies, Ten Lectures on Wavelets, CBMS-NSF Reg. Conf. Series in Applied Math., SIAM, Philadelphia, 1992.

M. Delecroix, W. Härdle, and M. Hristache, Efficient estimation in conditional single-index regression, Journal of Multivariate Analysis, vol.86, issue.2, pp.213-226, 2003.
DOI : 10.1016/S0047-259X(02)00046-5

M. Delecroix, M. Hristache, and V. Patilea, On semiparametric M-estimation in single-index regression, Journal of Statistical Planning and Inference, vol.136, issue.3, pp.730-769, 2006.
DOI : 10.1016/j.jspi.2004.09.006
URL : https://hal.archives-ouvertes.fr/hal-00458962

B. Delyon and A. Juditsky, On Minimax Wavelet Estimators, Applied and Computational Harmonic Analysis, vol.3, issue.3, pp.215-228, 1996.
DOI : 10.1006/acha.1996.0017

L. Devroye, L. Györfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition, 1996.
DOI : 10.1007/978-1-4612-0711-5

L. Devroye and G. Lugosi, Combinatorial methods in density estimation, 2001.
DOI : 10.1007/978-1-4613-0125-7

D. L. Donoho and I. M. Johnstone, Ideal spatial adaptation by wavelet shrinkage, Biometrika, vol.81, issue.3, pp.425-455, 1994.
DOI : 10.1093/biomet/81.3.425

D. L. Donoho and I. M. Johnstone, Adapting to Unknown Smoothness via Wavelet Shrinkage, Journal of the American Statistical Association, vol.90, issue.432, pp.1200-1224, 1995.
DOI : 10.1080/01621459.1979.10481038

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

D. L. Donoho, I. M. Johnstone, G. Kerkyacharian, and D. Picard, Wavelet shrinkage: Asymptopia?, J. Royal Statist. Soc. Ser. B, vol.57, pp.301-369, 1995.

D. L. Donoho, I. M. Johnstone, G. Kerkyacharian, and D. Picard, Density estimation by wavelet thresholding, The Annals of Statistics, vol.24, issue.2, pp.508-539, 1996.
DOI : 10.1214/aos/1032894451

U. Einmahl and D. Mason, Some universal results on the behavior of increments of partial sums, Ann. Probab, vol.24, pp.2626-2635, 1996.

Y. Freund and R. Schapire, A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, Journal of Computer and System Sciences, vol.55, issue.1, pp.119-139, 1997.
DOI : 10.1006/jcss.1997.1504

J. Friedman, T. Hastie, and R. Tibshirani, Additive logistic regression: a statistical view of boosting (With discussion and a rejoinder by the authors), The Annals of Statistics, vol.28, issue.2, pp.337-374, 2000.
DOI : 10.1214/aos/1016218223

S. Gaïffas and G. Lecué, Optimal rates and adaptation in the single-index model using aggregation.

H. Gao, Wavelet shrinkage denoising using the nonnegative garrote, J. Comput. Graph. Statist, vol.7, pp.469-488, 1998.

G. Geenens and M. Delecroix, A survey about single-index models theory, 2005.

L. Györfi, M. Kohler, A. Krzyżak, and H. Walk, A distribution-free theory of nonparametric regression, 2002.
DOI : 10.1007/b97848

P. Hall, G. Kerkyacharian, and D. Picard, Block thresholding rules for curve estimation using kernel and wavelet methods, Ann. Statist, vol.26, pp.942-962, 1998.

P. Hall, G. Kerkyacharian, and D. Picard, On the minimax optimality of block thresholded wavelet estimators, Statist. Sinica, vol.9, issue.1, pp.33-49, 1999.

J. A. Hartigan, Bayesian regression using Akaike priors, 2002.

R. Herbei and M. H. Wegkamp, Classification with reject option, Canadian Journal of Statistics, vol.33, issue.4, 2005.
DOI : 10.1002/cjs.5550340410

D. R. Herrick, G. P. Nason, and B. W. Silverman, Some new methods for wavelet density estimation, Sankhya Series A, vol.63, pp.394-411, 2001.

J. L. Horowitz, Semiparametric methods in econometrics, Lecture Notes in Statistics, vol.131, 1998.
DOI : 10.1007/978-1-4612-0621-7

M. Hristache, A. Juditsky, and V. Spokoiny, Direct estimation of the index coefficient in a single-index model, Ann. Statist, vol.29, issue.3, pp.595-623, 2001.

I. A. Ibragimov and R. Z. Hasminskii, An estimate of density of a distribution, Studies in mathematical stat. IV, pp.61-85, 1980.

M. Jansen, Noise reduction by wavelet thresholding, Lecture Notes in Statistics, 2001.
DOI : 10.1007/978-1-4613-0145-5

I. Johnstone and B. W. Silverman, Empirical Bayes selection of wavelet thresholds, The Annals of Statistics, vol.33, issue.4, pp.1700-1752, 2005.
DOI : 10.1214/009053605000000345

A. Juditsky, Wavelet estimators: adapting to unknown smoothness, Math. Methods of Statistics, issue.1, pp.1-20, 1997.

A. Juditsky, A. Nazin, A. B. Tsybakov, and N. Vayatis, Recursive aggregation of estimators via the mirror descent algorithm with averaging, Problems of Information Transmission, pp.368-384, 2005.

A. Juditsky, A. Nazin, A. B. Tsybakov, and N. Vayatis, Generalization error bounds for aggregation by mirror descent, Advances in Neural Information Processing 18. Proceedings of NIPS-2005, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00083573

A. Juditsky and A. Nemirovski, Functional aggregation for nonparametric estimation, Ann. Statist, vol.28, issue.3, pp.681-712, 2000.

A. B. Juditsky, P. Rigollet, and A. B. Tsybakov, Learning by mirror averaging, The Annals of Statistics, vol.36, issue.5, 2006.
DOI : 10.1214/07-AOS546
URL : https://hal.archives-ouvertes.fr/hal-00341026

G. Kerkyacharian, D. Picard, and K. Tribouley, Lp adaptive density estimation, Bernoulli, vol.2, issue.3, pp.229-247, 1996.
DOI : 10.3150/bj/1178291720

V. Koltchinskii, Local Rademacher complexities and oracle inequalities in risk minimization, IMS Medallion Lecture, pp.1-50, 2004.
DOI : 10.1214/009053606000001019

V. Koltchinskii and D. Panchenko, Empirical margin distributions and bounding the generalization error of combined classifiers, Ann. Statist, vol.30, pp.1-50, 2002.

G. Lecué, Classification with fast rates for sparsity class of Bayes rules, To appear in Electronic Journal of Statistics, 2005.

G. Lecué, Lower bounds and aggregation in density estimation, Journal of Machine Learning Research, vol.7, pp.971-981, 2005.

G. Lecué, Optimal rates of aggregation in classification, 2005.

G. Lecué, Simultaneous adaptation to the margin and to complexity in classification, To appear in Ann. Statist, 2005.

G. Lecué, Optimal oracle inequality for aggregation of classifiers under low noise condition, Proceeding of the 19th Annual Conference on Learning Theory, pp.364-378, 2006.

G. Lecué, Suboptimality of Penalized Empirical Risk Minimization, 2006.

G. Lecué, Suboptimality of Penalized Empirical Risk Minimization in Classification, 2006.
DOI : 10.1007/978-3-540-72927-3_12

W. S. Lee, P. L. Bartlett, and R. C. Williamson, The importance of convexity in learning with squared loss, Proceedings of the ninth annual conference on Computational learning theory, COLT '96, pp.1974-1980, 1998.
DOI : 10.1145/238061.238082

G. Leung and A. Barron, Information Theory and Mixing Least-Squares Regressions, IEEE Transactions on Information Theory, vol.52, issue.8, pp.3396-3410, 2006.
DOI : 10.1109/TIT.2006.878172

Y. Lin, A note on margin-based loss functions in classification, Statistics & Probability Letters, vol.68, issue.1, 1999.
DOI : 10.1016/j.spl.2004.03.002

G. Lugosi and N. Vayatis, On the Bayes-risk consistency of regularized boosting methods, Ann. Statist, vol.32, issue.1, pp.30-55, 2004.
URL : https://hal.archives-ouvertes.fr/hal-00102140

G. Lugosi and M. Wegkamp, Complexity regularization via localized random penalties, Ann. Statist, vol.32, issue.4, pp.1679-1697, 2004.

E. Mammen and A. B. Tsybakov, Smooth discrimination analysis, Ann. Statist, vol.27, pp.1808-1829, 1999.

P. Massart, Some applications of concentration inequalities to statistics, Probability Theory, pp.245-303, 2000.

P. Massart, Concentration inequalities and model selection, Ecole d'été de Probabilités de Saint-Flour 2003, Lecture Notes in Mathematics, 2006.

P. Massart and E. Nédélec, Risk bounds for statistical learning, The Annals of Statistics, vol.34, issue.5, 2006.
DOI : 10.1214/009053606000000786

Y. Meyer, Ondelettes et Opérateurs, 1990.

S. Murthy, Automatic construction of decision trees from data: A multi-disciplinary survey, Data Mining and Knowledge Discovery, vol.2, issue.4, pp.345-389, 1998.
DOI : 10.1023/A:1009744630224
URL : https://hal.archives-ouvertes.fr/hal-00442435

G. P. Nason, Choice of the Threshold Parameter in Wavelet Function Estimation, 1995.
DOI : 10.1007/978-1-4612-2544-7_16

A. Nemirovski, Topics in Non-parametric Statistics, volume 1738 of Ecole d'été de Probabilités de Saint-Flour, Lecture Notes in Mathematics, 1998.

D. Picard and K. Tribouley, Adaptive confidence interval for pointwise curve estimation, Ann. Statist, vol.28, issue.1, pp.298-335, 2000.

J. R. Quinlan, C4.5: Programs for Machine Learning, 1993.

P. Rigollet, Inégalités d'oracle, agrégation et adaptation, 2006.

P. Rigollet and A. B. Tsybakov, Linear and convex aggregation of density estimators, Mathematical Methods of Statistics, vol.16, issue.3, 2006.
DOI : 10.3103/S1066530707030052
URL : https://hal.archives-ouvertes.fr/hal-00068216

R. T. Rockafellar, Convex Analysis, 1970.
DOI : 10.1515/9781400873173

B. Schölkopf and A. Smola, Learning with kernels, 2002.

C. Scott and R. Nowak, Minimax-optimal classification with dyadic decision trees, IEEE Transactions on Information Theory, vol.52, issue.4, pp.1335-1353, 2006.
DOI : 10.1109/TIT.2006.871056

H. Simon, General lower bounds on the number of examples needed for learning probabilistic concepts, Proceedings of the sixth Annual ACM conference on Computational Learning Theory, pp.402-412, 1993.

I. Steinwart, D. Hush, and C. Scovel, Function Classes That Approximate the Bayes Risk, Proceeding of the 19th Annual Conference on Learning Theory, pp.79-93, 2006.
DOI : 10.1007/11776420_9

I. Steinwart and C. Scovel, Fast Rates for Support Vector Machines, Proceeding of the 18th Annual Conference on Learning Theory, COLT 2005. Lecture Notes in Computer Science 3559 Springer, 2005.
DOI : 10.1007/11503415_19

I. Steinwart and C. Scovel, Fast rates for support vector machines using Gaussian kernels, The Annals of Statistics, vol.35, issue.2, 2007.
DOI : 10.1214/009053606000001226

C. J. Stone, Optimal Global Rates of Convergence for Nonparametric Regression, The Annals of Statistics, vol.10, issue.4, pp.1040-1053, 1982.
DOI : 10.1214/aos/1176345969

W. Stute and L. Zhu, Nonparametric checks for single-index models, The Annals of Statistics, vol.33, issue.3, pp.1048-1083, 2005.
DOI : 10.1214/009053605000000020

B. Tarigan and S. A. van de Geer, Adaptivity of Support Vector Machines with l1 Penalty, 2004.

R. Tibshirani, Regression shrinkage and selection via the lasso, J. Royal Statist. Soc. Series B, vol.58, pp.267-288, 1996.

A. B. Tsybakov, Optimal rates of aggregation, Computational Learning Theory and Kernel Machines, Lecture Notes in Artificial Intelligence, vol.2777, pp.303-313, 2003.
URL : https://hal.archives-ouvertes.fr/hal-00104867

A. B. Tsybakov, Introduction à l'estimation non-paramétrique, 2004.

A. B. Tsybakov, Optimal aggregation of classifiers in statistical learning, The Annals of Statistics, vol.32, issue.1, pp.135-166, 2004.
DOI : 10.1214/aos/1079120131
URL : https://hal.archives-ouvertes.fr/hal-00102142

V. Vapnik, Statistical Learning Theory, 1998.

V. N. Vapnik and A. Ya. Chervonenkis, Theory of pattern recognition, 1974.

V. G. Vovk, Aggregating strategies, Proceedings of the 3rd Annual Workshop on Computational Learning Theory, pp.371-386, 1990.
DOI : 10.1016/B978-1-55860-146-8.50032-1

M. Wegkamp, Model selection in nonparametric regression, The Annals of Statistics, vol.31, issue.1, pp.252-273, 2003.
DOI : 10.1214/aos/1046294464

N. Weyrich and G. T. Warhola, Wavelet shrinkage and generalized cross validation for image denoising, IEEE Transactions on Image Processing, vol.7, issue.1, pp.82-90, 1998.
DOI : 10.1109/83.650852

Y. Xia and W. Härdle, Semi-parametric estimation of partially linear single-index models, Journal of Multivariate Analysis, vol.97, issue.5, pp.1162-1184, 2006.
DOI : 10.1016/j.jmva.2005.11.005

Y. Yang, Minimax nonparametric classification. I. Rates of convergence, IEEE Transactions on Information Theory, vol.45, issue.7, pp.2271-2284, 1999.
DOI : 10.1109/18.796368

Y. Yang, Minimax nonparametric classification. II. Model selection for adaptation, IEEE Transactions on Information Theory, vol.45, issue.7, pp.2285-2292, 1999.
DOI : 10.1109/18.796369

Y. Yang, Mixing strategies for density estimation, The Annals of Statistics, vol.28, issue.1, pp.75-87, 2000.
DOI : 10.1214/aos/1016120365

Y. Yang, Adaptive regression by mixing, J. Am. Statist. Ass, vol.96, pp.574-588, 2001.

Y. Yang, Aggregating regression procedures to improve performance, Bernoulli, vol.10, 2004.

C. H. Zhang, General empirical Bayes wavelet methods and exactly adaptive minimax estimation, The Annals of Statistics, vol.33, issue.1, pp.54-100, 2005.
DOI : 10.1214/009053604000000995

T. Zhang, On the Convergence of MDL Density Estimation, COLT, 2004.
DOI : 10.1007/978-3-540-27819-1_22

T. Zhang, Statistical behavior and consistency of classification methods based on convex risk minimization, The Annals of Statistics, vol.32, issue.1, pp.56-85, 2004.
DOI : 10.1214/aos/1079120130