. Broadcast and . Finally, "Probability and Statistics news" [PS-news: http://groups.google.fr/group/maths-ps-news] is meant to help broadcast job announcements and conference events related to mathematical probability and statistics, as the Google group "Machine-Learning news" [MLnews: http://groups.google.fr/group/ml-news] does successfully for the machine learning community. The goal here is to provide a tool that facilitates communication within and between the two strong communities of Probability and of Statistics at a worldwide scale.

A. Boularias, J. Kober, and J. Peters, Relative entropy inverse reinforcement learning, Proceedings of the 14th international conference on Artificial Intelligence and Statistics, 2011.

J. D. Abernethy, P. L. Bartlett, A. Rakhlin, and A. Tewari, Optimal strategies and minimax lower bounds for online convex games, Servedio and Zhang, p.65, 2008.

J. D. Abernethy, E. Hazan, and A. Rakhlin, Competing in the dark: An efficient algorithm for bandit linear optimization, pp.263-274, 2008.

D. Achlioptas, Database-friendly random projections: Johnson-Lindenstrauss with binary coins, Journal of Computer and System Sciences, vol.66, issue.4, pp.671-687, 2003.
DOI : 10.1016/S0022-0000(03)00025-4

R. Aïd, V. Grellier, A. Renaud, and O. Teytaud, Application de l'apprentissage par renforcement à la gestion du risque, Conférences Francophone sur l'Apprentissage Automatique, 2003.

N. Ailon and B. Chazelle, Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform, Proceedings of the 38th annual ACM Symposium on Theory of computing, STOC '06, pp.557-563, 2006.

C. Allenberg, P. Auer, L. Györfi, and G. Ottucsák, Hannan Consistency in On-Line Learning in Case of Unbounded Losses Under Partial Monitoring, Proceedings of the 17th international conference on Algorithmic Learning Theory, pp.229-243, 2006.
DOI : 10.1007/11894841_20

P. Alquier, PAC-Bayesian bounds for randomized empirical risk minimizers, Mathematical Methods of Statistics, vol.17, issue.4, pp.279-304, 2008.
DOI : 10.3103/S1066530708040017
URL : https://hal.archives-ouvertes.fr/hal-00354922

P. Alquier and K. Lounici, PAC-Bayesian bounds for sparse regression estimation with exponential weights, Electronic Journal of Statistics, vol.5, issue.0, 2010.
DOI : 10.1214/11-EJS601
URL : https://hal.archives-ouvertes.fr/hal-00465801

S. Amari and H. Nagaoka, Methods of Information Geometry, volume 191 of Translations of Mathematical Monographs, 2000.

R. K. Ando and T. Zhang, Learning on graph with Laplacian regularization, pp.25-32, 2007.

C. Andrieu, N. de Freitas, A. Doucet, and M. I. Jordan, An introduction to MCMC for machine learning, Machine Learning, pp.5-43, 2003.

A. Antos, C. Szepesvári, and R. Munos, Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path, Machine Learning, pp.89-129, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00830201

J. Aubry and S. Jaffard, Random Wavelet Series, Communications in Mathematical Physics, vol.227, issue.3, pp.483-514, 2002.
DOI : 10.1007/s002200200630
URL : https://hal.archives-ouvertes.fr/hal-00012098

J. Audibert and O. Bousquet, Combining PAC-Bayesian and generic chaining bounds, Journal of Machine Learning Research, vol.8, issue.116, pp.863-889, 2007.

J. Audibert and S. Bubeck, Minimax policies for adversarial and stochastic bandits, In Dasgupta and Klivans, p.85, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00834882

J. Audibert and S. Bubeck, Regret bounds and minimax policies under partial monitoring, Journal of Machine Learning Research, vol.11, issue.37, pp.2785-2836, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00654356

J. Audibert and O. Catoni, Robust linear regression through PAC-Bayesian truncation, p.142, 2010.

J. Audibert and O. Catoni, Robust linear least squares regression, 2010.
DOI : 10.1214/11-AOS918SUPP
URL : https://hal.archives-ouvertes.fr/hal-00522534

J. Audibert, R. Munos, and C. Szepesvári, Exploration-exploitation tradeoff using variance estimates in multi-armed bandits, Theoretical Computer Science, vol.410, issue.19, pp.1876-1902, 2009.
DOI : 10.1016/j.tcs.2009.01.016
URL : https://hal.archives-ouvertes.fr/hal-00711069

P. Auer, Using confidence bounds for exploitation-exploration trade-offs, Journal of Machine Learning Research, vol.3, issue.22, pp.397-422, 2003.

P. Auer and R. Ortner, Logarithmic online regret bounds for undiscounted reinforcement learning, Proceedings of the 20th conference on advances in Neural Information Processing Systems, NIPS '06, pp.49-56, 2006.

P. Auer and R. Ortner, UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem, Periodica Mathematica Hungarica, vol.5, issue.1-2, pp.55-65, 2010.
DOI : 10.1007/s10998-010-3055-6

P. Auer, N. Cesa-bianchi, Y. Freund, and R. E. Schapire, Gambling in a rigged casino: The adversarial multi-armed bandit problem, Proceedings of IEEE 36th Annual Foundations of Computer Science, pp.322-331, 1995.
DOI : 10.1109/SFCS.1995.492488

P. Auer, N. Cesa-bianchi, and C. Gentile, Adaptive and Self-Confident On-Line Learning Algorithms, Journal of Computer and System Sciences, vol.64, issue.1, p.68, 2000.
DOI : 10.1006/jcss.2001.1795

P. Auer, N. Cesa-bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352

P. Auer, N. Cesa-bianchi, Y. Freund, and R. E. Schapire, The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2003.
DOI : 10.1137/S0097539701398375

B. Awerbuch and R. D. Kleinberg, Online linear optimization and adaptive routing, Journal of Computer and System Sciences, vol.74, issue.1, pp.97-114, 2008.
DOI : 10.1016/j.jcss.2007.04.016
URL : http://doi.org/10.1016/j.jcss.2007.04.016

K. Azuma, Weighted sums of certain dependent random variables, Tohoku Mathematical Journal, vol.19, issue.3, pp.357-367, 1967.
DOI : 10.2748/tmj/1178243286

L. C. Baird, Residual algorithms: Reinforcement learning with function approximation, Proceedings of the 12th International Conference on Machine Learning, ICML '95, pp.30-37, 1995.

R. G. Baraniuk, M. A. Davenport, R. Devore, and M. B. Wakin, A Simple Proof of the Restricted Isometry Property for Random Matrices, Constructive Approximation, vol.159, issue.2, pp.253-263, 2008.
DOI : 10.1007/s00365-007-9003-x

A. Barron, A. Cohen, W. Dahmen, and R. Devore, Approximation and learning by greedy algorithms, The Annals of Statistics, vol.36, issue.1, pp.64-94, 2008.
DOI : 10.1214/009053607000000631

P. L. Bartlett and A. Tewari, REGAL: a regularization based algorithm for reinforcement learning in weakly communicating MDPs, Proceedings of the 25th conference on Uncertainty in Artificial Intelligence, UAI '09, pp.35-42, 2009.

P. L. Bartlett, M. I. Jordan, and J. D. McAuliffe, Convexity, classification, and risk bounds, Journal of the American Statistical Association, p.183, 2003.

P. L. Bartlett, E. Hazan, and A. Rakhlin, Adaptive online gradient descent, pp.65-72, 2007.

P. L. Bartlett, V. Dani, T. P. Hayes, S. M. Kakade, A. Rakhlin et al., High-probability regret bounds for bandit online linear optimization, Servedio and Zhang, pp.335-342, 2008.

J. Baxter, A. Tridgell, and L. Weaver, Reinforcement learning and chess, pp.91-116, 2001.

M. Belkin, I. Matveeva, and P. Niyogi, Regularization and Semi-supervised Learning on Large Graphs, Proceedings of the 17th annual Conference On Learning Theory, pp.624-638, 2004.
DOI : 10.1007/978-3-540-27819-1_43

M. Belkin, P. Niyogi, and V. Sindhwani, On Manifold Regularization, Proceedings of the 8th international conference on Artificial Intelligence and Statistics, AI&Stats '05, pp.181-193, 2005.

S. Ben-David and U. von Luxburg, Relating clustering stability to properties of cluster boundaries, pp.379-390, 2008.

S. Ben-David, U. von Luxburg, and D. Pál, A Sober Look at Clustering Stability, pp.5-19, 2006.
DOI : 10.1007/11776420_4

G. Bennett, Probability Inequalities for the Sum of Independent Random Variables, Journal of the American Statistical Association, vol.18, issue.297, pp.33-45, 1962.
DOI : 10.1214/aoms/1177730437

B. Bercu and A. Touati, Exponential inequalities for self-normalized martingales with applications, The Annals of Applied Probability, vol.18, issue.5, pp.1848-1869, 2008.
DOI : 10.1214/07-AAP506
URL : https://hal.archives-ouvertes.fr/hal-00165219

D. S. Bernstein and S. Zilberstein, Reinforcement learning for weakly coupled MDPs and an application to planetary rover control, 2001.

S. Bernstein, On a modification of Chebyshev's inequality and of the error formula of Laplace. Original publication, Ann. Sci. Inst. Sav. Ukraine, Sect. Math, vol.1, issue.31, p.108, 1924.

D. A. Berry, R. W. Chen, A. Zame, D. C. Heath, and L. A. Shepp, Bandit problems with infinitely many arms, The Annals of Statistics, vol.25, issue.5, pp.2103-2116, 1997.
DOI : 10.1214/aos/1069362389

P. Berthet and Z. Shi, Small ball estimates for Brownian motion under a weighted sup-norm, Studia Sci. Math. Hung, pp.1-2, 2001.
DOI : 10.1556/SScMath.36.2000.1-2.17

D. P. Bertsekas and S. E. Shreve, Stochastic Optimal Control (The Discrete Time Case), p.208, 1978.

D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, 1996.

L. Birgé and P. Massart, Minimum Contrast Estimators on Sieves: Exponential Bounds and Rates of Convergence, Bernoulli, vol.4, issue.3, pp.329-375, 1998.
DOI : 10.2307/3318720

C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics), p.193, 2006.

G. Blanchard, G. Lugosi, and N. Vayatis, On the rate of convergence of regularized boosting classifiers, Journal of Machine Learning Research, vol.4, pp.861-894, 2003.

G. Blanchard, O. Bousquet, and P. Massart, Statistical performance of support vector machines, The Annals of Statistics, pp.489-531, 2008.

A. Blum and S. Chawla, Learning from labeled and unlabeled data using graph mincuts, Proceedings of the 18th International Conference on Machine Learning, ICML '01, pp.19-26, 2001.

A. Blum and Y. Mansour, From External to Internal Regret, pp.621-636, 2005.
DOI : 10.1007/11503415_42

A. Blum and Y. Mansour, From External to Internal Regret, Journal of Machine Learning Research, vol.8, pp.1307-1324, 2007.
DOI : 10.1007/11503415_42

A. Blum and T. Mitchell, Combining labeled and unlabeled data with co-training, Proceedings of the eleventh annual conference on Computational learning theory, COLT '98, pp.92-100, 1998.
DOI : 10.1145/279943.279962

S. Boucheron, O. Bousquet, and G. Lugosi, Theory of Classification: a Survey of Some Recent Advances, ESAIM: Probability and Statistics, vol.9, pp.323-375, 2005.
DOI : 10.1051/ps:2005018
URL : https://hal.archives-ouvertes.fr/hal-00017923

G. Bourdaud, Ondelettes et espaces de Besov, Revista Matemática Iberoamericana, vol.11, issue.3, pp.477-512, 1995.
DOI : 10.4171/RMI/181

J. A. Boyan, Least-squares temporal difference learning, Proceedings of the 16th International Conference on Machine Learning, pp.49-56, 1999.

S. J. Bradtke and A. G. Barto, Linear least-squares algorithms for temporal difference learning, Machine Learning Journal, vol.22, pp.33-57, 1996.

R. I. Brafman and M. Tennenholtz, R-max - a general polynomial time algorithm for near-optimal reinforcement learning, Journal of Machine Learning Research, vol.3, pp.213-231, 2003.

U. Brefeld, T. Gärtner, T. Scheffer, and S. Wrobel, Efficient co-regularised least squares regression, Proceedings of the 23rd international conference on Machine learning, ICML '06, pp.137-144, 2006.
DOI : 10.1145/1143844.1143862
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.68.7014

S. Bubeck, Bandits Games and Clustering Foundations, pp.95-96, 2010.
URL : https://hal.archives-ouvertes.fr/tel-00845565

S. Bubeck, R. Munos, G. Stoltz, and C. Szepesvári, Online optimization of X-armed bandits, p.65, 2008.

S. Bubeck, R. Munos, and G. Stoltz, Pure Exploration in Multi-armed Bandits Problems, pp.23-37, 2009.
DOI : 10.1090/S0002-9904-1952-09620-8

H. Bungartz and M. Griebel, Sparse grids, Acta Numerica, vol.13, p.150, 2004.

A. N. Burnetas and M. N. Katehakis, Optimal Adaptive Policies for Sequential Allocation Problems, Advances in Applied Mathematics, vol.17, issue.2, pp.122-142, 1996.
DOI : 10.1006/aama.1996.0007

E. J. Candès, The restricted isometry property and its implications for compressed sensing, Comptes Rendus Mathematique, vol.346, issue.9-10, pp.589-592, 2008.
DOI : 10.1016/j.crma.2008.03.014

E. J. Candès and J. K. Romberg, Sparsity and incoherence in compressive sampling, Inverse Problems, vol.23, issue.163, pp.969-985, 2007.

E. J. Candès and T. Tao, The Dantzig selector: statistical estimation when p is much larger than n, Annals of Statistics, vol.35, issue.6, pp.2313-2351, 2007.

E. J. Candès, J. K. Romberg, and T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Transactions on Information Theory, vol.52, issue.2, pp.489-509, 2006.
DOI : 10.1109/TIT.2005.862083

E. J. Candès, J. K. Romberg, and T. Tao, Stable signal recovery from incomplete and inaccurate measurements, Communications on Pure and Applied Mathematics, vol.7, issue.8, pp.1207-1223, 2006.
DOI : 10.1002/cpa.20124

E. J. Candès, X. Li, Y. Ma, and J. Wright, Robust principal component analysis?, CoRR, abs/0912, 2009.

S. Canu, X. Mary, and A. Rakotomamonjy, Functional learning through kernel. arXiv, oct, p.135, 2009.

A. Caponnetto and E. De Vito, Optimal Rates for the Regularized Least-Squares Algorithm, Foundations of Computational Mathematics, vol.7, issue.3, pp.331-368, 2007.
DOI : 10.1007/s10208-006-0196-8

O. Catoni, Statistical Learning Theory and Stochastic Optimization, p.140, 2004.
DOI : 10.1007/b99352
URL : https://hal.archives-ouvertes.fr/hal-00104952

A. Celisse, Model selection via cross-validation in density estimation, regression, and change-points detection, 2008.
URL : https://hal.archives-ouvertes.fr/tel-00346320

N. Cesa-bianchi and G. Lugosi, Potential-based algorithms in on-line prediction and game theory, Machine Learning, vol.51, issue.3, pp.239-261, 2003.
DOI : 10.1023/A:1022901500417

N. Cesa-bianchi and G. Lugosi, Prediction, Learning, and Games, p.68, 2006.
DOI : 10.1017/CBO9780511546921

N. Cesa-bianchi and G. Lugosi, Combinatorial bandits, Journal of Computer and System Sciences, vol.78, issue.5, p.64, 2009.
DOI : 10.1016/j.jcss.2012.01.001

N. Cesa-bianchi, Y. Freund, D. P. Helmbold, D. Haussler, R. E. Schapire et al., How to use expert advice, Proceedings of the 25th annual ACM Symposium on Theory Of Computing, STOC '93, pp.382-391, 1993.

N. Cesa-bianchi, Y. Freund, D. Haussler, D. P. Helmbold, R. E. Schapire et al., How to use expert advice, Journal of the ACM, vol.44, issue.3, pp.427-485, 1997.
DOI : 10.1145/258128.258179

N. Cesa-bianchi, G. Lugosi, and G. Stoltz, Minimizing Regret With Label Efficient Prediction, IEEE Transactions on Information Theory, vol.51, issue.6, pp.77-92, 2005.
DOI : 10.1109/TIT.2005.847729
URL : https://hal.archives-ouvertes.fr/hal-00007537

D. Chakrabarti, R. Kumar, F. Radlinski, and E. Upfal, Mortal multi-armed bandits, pp.273-280, 2008.

D. Chakraborty and P. Stone, Online model learning in adversarial Markov decision processes, Proceedings of the 9th international conference on Autonomous Agents and Multiagent Systems, International Foundation for Autonomous Agents and Multiagent Systems, pp.1583-1584, 2010.

Y. S. Chow and H. Teicher, Probability Theory, p.45, 1988.

C.-C. Wang, S. R. Kulkarni, and H. V. Poor, Bandit problems with side observations, IEEE Transactions on Automatic Control, vol.50, issue.3, pp.338-355, 2005.
DOI : 10.1109/TAC.2005.844079

D. Coppersmith and S. Winograd, Matrix multiplication via arithmetic progressions, Proceedings of the nineteenth annual ACM conference on Theory of computing, STOC '87, pp.1-6, 1987.
DOI : 10.1145/28395.28396
URL : http://doi.org/10.1016/s0747-7171(08)80013-2

R. Coulom, Computing Elo ratings of move patterns in the game of Go, ICGA Journal, vol.30, issue.4, pp.198-208, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00149859

T. M. Cover and J. A. Thomas, Elements of information theory, 1991.

H. Cramér, Sur un nouveau théorème-limite de la théorie des probabilités, Actualités Scientifiques et Industrielles, vol.736, pp.5-23, 1938.
DOI : 10.1007/978-3-642-40607-2_8

R. Crites and A. G. Barto, Improving elevator performance using reinforcement learning, Advances in Neural Information Processing Systems, pp.1017-1023, 1996.

I. Csiszár, Sanov property, generalized I-projection and a conditional limit theorem, The Annals of Probability, pp.768-793, 1984.

A. Dalalyan and A. B. Tsybakov, Sparse regression learning by aggregation and Langevin Monte-Carlo, arXiv preprint arXiv:0903.1223, p.164, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00362471

A. S. Dalalyan and A. B. Tsybakov, Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity, Machine Learning Journal, vol.72, pp.39-61, 2008.

V. Dani, T. P. Hayes, and S. M. Kakade, The price of bandit information for online optimization, pp.345-352, 2008.

V. Dani, T. P. Hayes, and S. M. Kakade, Stochastic linear optimization under bandit feedback, pp.355-366, 2008.

S. Dasgupta and Y. Freund, Random projection trees and low dimensional manifolds, Proceedings of the fortieth annual ACM symposium on Theory of computing, STOC '08, pp.537-546, 2008.
DOI : 10.1145/1374376.1374452
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.117.3236

S. Dasgupta and A. Gupta, An elementary proof of a theorem of Johnson and Lindenstrauss, Random Structures & Algorithms, vol.22, pp.60-65, 2003.

V. H. de la Peña, A general class of exponential inequalities for martingales and ratios, The Annals of Probability, pp.537-564, 1999.

V. H. de la Peña, M. J. Klass, and T. L. Lai, Self-normalized processes: Exponential inequalities, moment bounds and iterated logarithm laws, The Annals of Probability, pp.1902-1933, 2004.

S. Deguy and A. Benassi, A flexible noise model for designing maps, Proceedings of the Vision Modeling and Visualization Conference 2001, VMV '01, pp.299-308, 2001.

P. Del Moral, Feynman-Kac formulae: genealogical and interacting particle systems with applications, 2004.

S. Delattre and S. Gaiffas, Nonparametric regression with martingale increment errors, Stochastic Processes and their Applications, vol.121, issue.12, p.114, 2010.
DOI : 10.1016/j.spa.2011.08.002
URL : https://hal.archives-ouvertes.fr/hal-00530581

A. Dembo and O. Zeitouni, Large deviations techniques and applications. Elearn, p.110, 1998.
DOI : 10.1007/978-1-4612-5320-4

R. Devore, Nonlinear approximation, Acta Numerica, vol.41, issue.2, p.148, 1997.
DOI : 10.1007/BF02274662

L. Devroye, L. Györfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition, p.78, 1996.
DOI : 10.1007/978-1-4612-0711-5

I. H. Dinwoodie, Mesures dominantes et théorème de Sanov, Annales de l'Institut Henri Poincaré - Probabilités et Statistiques, pp.365-373, 1992.

D. L. Donoho, Compressed sensing, IEEE Transactions on Information Theory, vol.52, issue.173, pp.1289-1306, 2006.

D. L. Donoho and P. B. Stark, Uncertainty principles and signal recovery, SIAM Journal on Applied Mathematics, vol.49, issue.3, pp.906-931, 1989.

R. Douc, A. Guillin, J. Marin, and C. P. Robert, Minimum variance importance sampling via population monte carlo, Esaim P&S, issue.11, p.70, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00070316

R. M. Dudley, Real Analysis and Probability, p.124, 1989.
DOI : 10.1017/CBO9780511755347

A. Durand, Random Wavelet Series Based on a Tree-Indexed Markov Chain, Communications in Mathematical Physics, vol.41, issue.12, pp.451-477, 2008.
DOI : 10.1007/s00220-008-0504-7

A. Edelman, Eigenvalues and Condition Numbers of Random Matrices, SIAM Journal on Matrix Analysis and Applications, vol.9, issue.4, pp.543-560, 1988.
DOI : 10.1137/0609045

D. Ernst, G. Stan, J. Goncalves, and L. Wehenkel, Clinical data based optimal STI strategies for HIV: a reinforcement learning approach, Proceedings of the 45th IEEE Conference on Decision and Control, pp.65-72, 2006.
DOI : 10.1109/CDC.2006.377527
URL : https://hal.archives-ouvertes.fr/hal-00121732

A. M. Farahmand, M. Ghavamzadeh, C. Szepesvári, and S. Mannor, Regularized policy iteration, pp.441-448, 2008.

A. M. Farahmand, M. Ghavamzadeh, C. Szepesvári, and S. Mannor, Regularized fitted Q-iteration for planning in continuous-space Markovian decision problems, Proceedings of the American Control Conference, p.226, 2009.

A. M. Farahmand, R. Munos, and C. Szepesvári, Error propagation for approximate policy and value iteration, p.218, 2010.

A. A. Fedotov, P. Harremoës, and F. Topsøe, Best Pinsker Bound equals Taylor Polynomial of Degree 49, Computational Technologies, vol.8, issue.111, pp.3-14, 2003.

S. Filippi, Stratégies optimistes en apprentissage par renforcement, pp.37-43, 2010.

A. D. Flaxman, A. T. Kalai, and H. B. Mcmahan, Online convex optimization in the bandit setting: gradient descent without a gradient, Proceedings of the 16th annual ACM-SIAM Symposium On Discrete Algorithms, SODA '05, pp.385-394, 2005.

M.-F. Balcan and A. Blum, A PAC-style model for learning from labeled and unlabeled data, pp.111-126, 2005.

M. Fornasier and H. Rauhut, Compressive Sensing
DOI : 10.1007/978-3-642-27795-5_6-5

D. P. Foster and R. Vohra, Asymptotic calibration, Biometrika, vol.85, issue.23, pp.379-390, 1996.

D. P. Foster and R. Vohra, Regret in the on-line decision problem, Games and Economic Behavior, vol.29, issue.24, pp.7-35, 1999.

S. Foucart and M. Lai, Sparsest solutions of underdetermined linear systems via $\ell_q$-minimization for $0 < q \le 1$, Applied and Computational Harmonic Analysis, vol.26, issue.3, pp.395-407, 2009.
DOI : 10.1016/j.acha.2008.09.001

M. Frazier and B. Jawerth, Decomposition of Besov Spaces, Indiana University Mathematics Journal, issue.34, p.134, 1985.
DOI : 10.1515/9781400827268.385

D. A. Freedman, On tail probabilities for martingales, The Annals of Probability, pp.100-118, 1975.

Y. Freund and R. E. Schapire, A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, EuroCOLT '95: Proceedings of the 2nd European conference on COmputational Learning Theory, pp.23-37, 1995.
DOI : 10.1006/jcss.1997.1504

A. Garivier, Deviation bounds. Private communication, p.113, 2011.

A. Garivier and O. Cappé, The KL-UCB algorithm for bounded stochastic bandits and beyond, Proceedings of the 24th annual Conference On Learning Theory, pp.37-41, 2011.

A. Garivier and F. Leonardi, Context tree selection: A unifying view, Stochastic Processes and their Applications, vol.121, issue.11, p.113
DOI : 10.1016/j.spa.2011.06.012

S. Gelly and D. Silver, Combining online and offline knowledge in UCT, Proceedings of the 24th international conference on Machine learning, ICML '07, pp.273-280, 2007.
DOI : 10.1145/1273496.1273531
URL : https://hal.archives-ouvertes.fr/inria-00164003

M. Ghavamzadeh, A. Lazaric, O. Maillard, and R. Munos, LSTD with random projections, pp.721-729, 2010.

M. Ghavamzadeh, A. Lazaric, R. Munos, and O. Maillard, LSPI with random projections, p.236

W. R. Gilks, S. Richardson, and D. J. Spiegelhalter, Markov Chain Monte Carlo in Practice, p.69, 1996.

J. C. Gittins, R. Weber, and K. Glazebrook, Multi-armed Bandit Allocation Indices, p.25, 1989.
DOI : 10.1002/9780470980033

C. Gold, A. Holub, and P. Sollich, Bayesian approach to feature selection and parameter tuning for support vector machine classifiers, Neural Networks, vol.18, issue.5-6, pp.693-701, 2005.
DOI : 10.1016/j.neunet.2005.06.044

N. Gozlan and C. Léonard, Transport inequalities - a survey, Markov Processes and Related Fields, pp.635-736, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00515419

S. Guha and K. Munagala, Approximation algorithms for budgeted learning problems, Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, STOC '07, pp.104-113, 2007.
DOI : 10.1145/1250790.1250807

S. Guha, K. Munagala, and S. Sarkar, Information acquisition and exploitation in multichannel wireless systems, IEEE Transactions on Information Theory, p.18, 2007.

S. Guha, K. Munagala, and P. Shi, Approximation algorithms for restless bandit problems. CoRR, abs/0711, p.18, 2007.

L. Györfi, M. Kohler, A. Krzyżak, and H. Walk, A distribution-free theory of nonparametric regression, pp.146-154, 2002.
DOI : 10.1007/b97848

S. Hart and A. Mas-Colell, A Simple Adaptive Procedure Leading to Correlated Equilibrium, Econometrica, vol.68, issue.5, pp.1127-1150, 2000.
DOI : 10.1111/1468-0262.00153

E. Hazan, A. Agarwal, and S. Kale, Logarithmic regret algorithms for online convex optimization, pp.499-513, 2006.

W. Hoeffding, Probability Inequalities for Sums of Bounded Random Variables, Journal of the American Statistical Association, vol.1, issue.301, pp.13-30, 1963.
DOI : 10.1214/aoms/1177730491

J. Honda and A. Takemura, An asymptotically optimal bandit algorithm for bounded support models, Proceedings of the 23rd annual Conference On Learning Theory, pp.67-79, 2010.

J. Honda and A. Takemura, An asymptotically optimal policy for finite support models in the multiarmed bandit problem, Machine Learning, vol.28, issue.3, pp.50-59
DOI : 10.1007/s10994-011-5257-4

M. Hutter, Feature Reinforcement Learning: Part I. Unstructured MDPs, Journal of Artificial General Intelligence, vol.1, issue.1, pp.3-24, 2009.
DOI : 10.2478/v10229-011-0002-8
URL : http://arxiv.org/abs/0906.1713

T. Jaksch, R. Ortner, and P. Auer, Near-optimal regret bounds for reinforcement learning, Journal of Machine Learning Research, vol.99, issue.249, pp.1563-1600, 2010.

S. Janson, Gaussian Hilbert spaces, p.136, 1997.
DOI : 10.1017/CBO9780511526169

S. M. Kakade, S. Shalev-Shwartz, and A. Tewari, Efficient bandit algorithms for online multiclass prediction, pp.440-447, 2008.

S. Kale, L. Reyzin, and R. E. Schapire, Non-stochastic bandit slate problems, pp.1054-1062, 2010.

V. Kanade, H. B. McMahan, and B. Bryan, Sleeping experts and bandits with stochastic action availability and adversarial rewards, Proceedings of the 12th international conference on Artificial Intelligence and Statistics, number 5 in AI&Stats '09, pp.272-279, 2009.

M. Kearns and S. Singh, Near-optimal reinforcement learning in polynomial time, Machine Learning, vol.49, issue.2/3, pp.209-232, 2002.
DOI : 10.1023/A:1017984413808

P. W. Keller, S. Mannor, and D. Precup, Automatic basis function construction for approximate dynamic programming and reinforcement learning, pp.449-456, 2006.

R. D. Kleinberg, A. Niculescu, and Y. Sharma, Regret bounds for sleeping experts and bandits, Servedio and Zhang, pp.425-436, 2008.
DOI : 10.1007/s10994-010-5178-7

V. Koltchinskii, Local Rademacher complexities and oracle inequalities in risk minimization, The Annals of Statistics, pp.2593-2656, 2006.

V. Koltchinskii, The Dantzig selector and sparsity oracle inequalities, Bernoulli, vol.15, issue.3, pp.799-828, 2009.
DOI : 10.3150/09-BEJ187

Z. Kolter and A. Y. Ng, Regularization and feature selection in least-squares temporal difference learning, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pp.521-528, 2009.
DOI : 10.1145/1553374.1553442

R. I. Kondor and J. D. Lafferty, Diffusion kernels on graphs and other discrete input spaces, Proceedings of the 19th International Conference on Machine Learning, ICML '02, pp.315-322, 2002.

M. G. Lagoudakis and R. Parr, Least-squares policy iteration, Journal of Machine Learning Research, vol.4, issue.218, pp.1107-1149, 2003.

T. L. Lai and H. Robbins, Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.40, pp.4-22, 1985.

A. Lazaric and R. Munos, Hybrid stochastic-adversarial online learning, p.19, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00830168

A. Lazaric, M. Ghavamzadeh, and R. Munos, Finite-sample analysis of LSTD, pp.228-232
URL : https://hal.archives-ouvertes.fr/inria-00482189

A. Lazaric, M. Ghavamzadeh, and R. Munos, Finite-sample analysis of least-squares policy iteration, p.238, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00528596

A. Lazaric, M. Ghavamzadeh, and R. Munos, Finite-sample analysis of LSTD, Fürnkranz and Joachims, p.219, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00482189

M. Ledoux, The Concentration of Measure Phenomenon. Mathematical surveys and monographs, p.107, 2001.

S. Legg and M. Hutter, Universal Intelligence: A Definition of Machine Intelligence, Minds and Machines, vol.28, issue.1, pp.391-444, 2007.
DOI : 10.1007/s11023-007-9079-x

E. Lehrer and D. Rosenberg, A wide range no-regret theorem. Game theory and information, p.81, 2003.

D. A. Levin, Y. Peres, and E. L. Wilmer, Markov Chains and Mixing Times, p.69, 2008.
DOI : 10.1090/mbk/058

W. V. Li and W. Linde, Approximation, metric entropy and small ball estimates for Gaussian measures, Annals of Probability, vol.27, pp.1556-1578, 1999.

E. Liberty, N. Ailon, and A. Singer, Dense fast random projections and lean Walsh transforms, Proceedings of the 11th international workshop, APPROX 2008, and 12th international workshop, RANDOM 2008 on Approximation, Randomization and Combinatorial Optimization: Algorithms and Techniques, APPROX '08 / RANDOM '08, pp.512-522, 2008.

M. A. Lifshits, Gaussian random functions, p.132, 1995.

N. Littlestone and M. K. Warmuth, The weighted majority algorithm, Proceedings of the 30th annual Symposium on Foundations of Computer Science, pp.256-261, 1989.

M. Loth, M. Davy, and P. Preux, Sparse Temporal Difference Learning Using LASSO, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp.352-359, 2007.
DOI : 10.1109/ADPRL.2007.368210
URL : https://hal.archives-ouvertes.fr/inria-00117075

T. Lu, D. Pál, and M. Pál, Contextual multi-armed bandits, Proceedings of the 13th international conference on Artificial Intelligence and Statistics, pp.485-492, 2010.

S. Mahadevan, Representation policy iteration, Proceedings of the 21st conference on Uncertainty in Artificial Intelligence, UAI '05, pp.372-379, 2005.

O.-A. Maillard and R. Munos, Compressed least-squares regression, Bengio, pp.1213-1221, 2009.

O.-A. Maillard and R. Munos, Scrambled objects for least-squares regression, pp.1549-1557, 2010.

O.-A. Maillard and R. Munos, Online learning in adversarial Lipschitz environments, Proceedings of the 2010 European Conference on Machine Learning and Knowledge Discovery in Databases: Part II, ECML PKDD'10, pp.305-320, 2010.

O.-A. Maillard and R. Munos, Adaptive bandits: Towards the best history-dependent strategy, To appear in Proceedings of the 14th international conference on Artificial Intelligence and Statistics, p.255, 2011.

O.-A. Maillard and N. Vayatis, Complexity versus agreement for many views, pp.232-246, 2009.

O. Maillard, R. Munos, A. Lazaric, and M. Ghavamzadeh, Finite sample analysis of Bellman residual minimization, Asian Conference on Machine Learning, p.255, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00830212

O. Maillard, R. Munos, and G. Stoltz, Finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences, To appear in Proceedings of the 24th annual Conference On Learning Theory, p.255, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00574987

V. A. Marčenko and L. A. Pastur, Distribution of eigenvalues for some sets of random matrices, Mathematics of the USSR-Sbornik, pp.457-483, 1967.

I. Menache, S. Mannor, and N. Shimkin, Basis Function Adaptation in Temporal Difference Reinforcement Learning, Annals of Operations Research, vol.134, issue.1, pp.215-238, 2005.
DOI : 10.1007/s10479-005-5732-z

R. Munos, Error bounds for approximate policy iteration, Proceedings of the 19th International Conference on Machine Learning, ICML '03, pp.560-567, 2003.

R. Munos, Performance bounds in Lp norm for approximate value iteration, SIAM Journal on Control and Optimization, p.208, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00124685

R. Munos and C. Szepesvári, Finite time bounds for fitted value iteration, Journal of Machine Learning Research, vol.9, issue.218, pp.815-857, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00120882

H. Narayanan, Randomized interior point methods for sampling and optimization, The Annals of Applied Probability, vol.26, issue.1, p.70, 2009.
DOI : 10.1214/15-AAP1104

H. Narayanan and A. Rakhlin, Random walk approach to regret minimization, pp.1777-1785, 2010.

A. Y. Ng and S. J. Russell, Algorithms for inverse reinforcement learning, Proceedings of the 17th International Conference on Machine Learning, ICML '00, pp.663-670, 2000.

R. Ortner, Online regret bounds for Markov decision processes with deterministic transitions, pp.123-137, 2009.

S. Pandey, D. Chakrabarti, and D. Agarwal, Multi-armed bandit problems with dependent arms, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007.
DOI : 10.1145/1273496.1273587

R. Parr, C. Painter-Wakefield, L. Li, and M. L. Littman, Analyzing feature generation for value-function approximation, Proceedings of the 24th international conference on Machine learning, ICML '07, pp.737-744, 2007.
DOI : 10.1145/1273496.1273589
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.151.5750

D. Pechyony, R. Izmailov, A. Vashist, and V. N. Vapnik, SMO-style algorithms for learning using privileged information, DMIN, pp.235-241, 2010.

M. Petrik, G. Taylor, R. Parr, and S. Zilberstein, Feature selection using regularization in approximate linear programs for Markov decision processes, Fürnkranz and Joachims, pp.871-878, 2010.

J. Poland, Nonstochastic bandits: Countable decision set, unbounded costs and reactive environments, Theoretical Computer Science, vol.397, issue.1-3, pp.77-93, 2008.
DOI : 10.1016/j.tcs.2008.02.024

D. Pollard, Empirical processes: theory and applications. NSF-CBMS regional conference series in probability and statistics, p.109, 1990.

W. B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality (Wiley Series in Probability and Statistics), p.218, 2007.

M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, p.208, 1994.

A. Rahimi and B. Recht, Random features for large-scale kernel machines, p.148, 2007.

A. Rahimi and B. Recht, Uniform approximation of functions with random bases, 2008 46th Annual Allerton Conference on Communication, Control, and Computing, p.149, 2008.
DOI : 10.1109/ALLERTON.2008.4797607

A. Rakhlin, K. Sridharan, and A. Tewari, Online learning: Beyond regret, ArXiv e-prints, p.179, Nov. 2010.

D. Ramachandran and E. Amir, Bayesian inverse reinforcement learning, Proceedings of the 20th international joint conference on Artificial intelligence, pp.2586-2591, 2007.

H. Rauhut, Compressive Sensing and Structured Random Matrices. Theoretical Foundations and Numerical Methods for Sparse Recovery, pp.169-172, 2010.

H. Rauhut and R. Ward, Sparse Legendre expansions via l_1 minimization, ArXiv preprint, p.165, 2010.

H. Robbins, Some aspects of the sequential design of experiments, Bulletin of the American Mathematical Society, vol.58, issue.5, pp.527-535, 1952.
DOI : 10.1090/S0002-9904-1952-09620-8

D. S. Rosenberg, Semi-Supervised Learning with Multiple Views, p.193, 2008.

D. S. Rosenberg and P. L. Bartlett, The Rademacher complexity of co-regularized kernel classes, Proceedings of the Eleventh ICAIS, pp.186-188, 2007.

M. Rudelson and R. Vershynin, The Littlewood-Offord problem and invertibility of random matrices, Advances in Mathematics, vol.218, issue.2, pp.600-633, 2008.
DOI : 10.1016/j.aim.2008.01.010

M. Rudelson and R. Vershynin, On sparse reconstruction from Fourier and Gaussian measurements, Communications on Pure and Applied Mathematics, vol.61, issue.8, pp.1025-1045, 2008.
DOI : 10.1002/cpa.20227

M. Rudelson and R. Vershynin, Non-asymptotic theory of random matrices: extreme singular values, ArXiv e-prints, p.233, Mar. 2010.

P. Rusmevichientong and J. N. Tsitsiklis, Linearly Parameterized Bandits, Mathematics of Operations Research, vol.35, issue.2, pp.395-411, 2010.
DOI : 10.1287/moor.1100.0446
URL : http://arxiv.org/abs/0812.3465

D. Ryabko and M. Hutter, On the possibility of learning in reactive environments with arbitrary dependence, Theoretical Computer Science, vol.405, issue.3, pp.274-284, 2008.
DOI : 10.1016/j.tcs.2008.06.039
URL : https://hal.archives-ouvertes.fr/hal-00639569

S. Saitoh, Theory of reproducing Kernels and its applications, Longman Scientific & Technical, p.135, 1988.

I. N. Sanov, On the probability of large deviations of random magnitudes, pp.4211-4255, 1957.

T. Sarlos, Improved Approximation Algorithms for Large Matrices via Random Projections, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06), pp.143-152, 2006.
DOI : 10.1109/FOCS.2006.37

B. Scherrer, Should one compute the temporal difference fix point or minimize the Bellman residual? The unified oblique projection view, Fürnkranz and Joachims, p.208, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00537403

J. Schmidhuber, Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, chapter in Anticipatory Behavior in Adaptive Learning Systems, pp.48-76, 2009.

B. Schölkopf, R. Herbrich, and A. J. Smola, A Generalized Representer Theorem, pp.416-426, 2001.
DOI : 10.1007/3-540-44581-1_27

P. J. Schweitzer and A. Seidmann, Generalized polynomial approximations in Markovian decision processes, Journal of Mathematical Analysis and Applications, vol.110, pp.568-582, 1985.

Y. Seldin, N. Cesa-Bianchi, F. Laviolette, P. Auer, J. Shawe-Taylor et al., PAC-Bayesian analysis of martingales and multiarmed bandits, ArXiv e-prints, 2011.

Y. Seldin, N. Cesa-Bianchi, F. Laviolette, P. Auer, J. Shawe-Taylor et al., PAC-Bayesian analysis of the exploration-exploitation trade-off, ArXiv e-prints, 2011.

S. Shalev-Shwartz, Online Learning, p.65, 2007.
DOI : 10.1017/CBO9781107298019.022

J. Si, A. G. Barto, W. B. Powell, and D. Wunsch, Handbook of Learning and Approximate Dynamic Programming
DOI : 10.1109/9780470544785

O. Sigaud and O. Buffet, Processus décisionnels de Markov en intelligence artificielle, volume 1 - principes généraux et applications, IC2 - informatique et systèmes d'information, 2008.

V. Sindhwani and D. S. Rosenberg, An RKHS for multi-view learning and manifold co-regularization, Proceedings of the 25th international conference on Machine learning, ICML '08, pp.976-983, 2008.
DOI : 10.1145/1390156.1390279

V. Sindhwani, P. Niyogi, and M. Belkin, A co-regularization approach to semi-supervised learning with multiple views, Proceedings of the 22nd International Conference on Machine Learning, ICML '05, ACM International Conference Proceeding Series, Workshop on Learning with Multiple Views, p.193, 2005.

A. J. Smola and R. I. Kondor, Kernels and Regularization on Graphs, Conference On Learning Theory and 7th Kernel Workshop, pp.144-158, 2003.
DOI : 10.1007/978-3-540-45167-9_12

K. Sridharan and S. M. Kakade, An information theoretic framework for multi-view learning, pp.403-414, 2008.

I. Steinwart, D. Hush, and C. Scovel, A new concentration result for regularized risk minimizers. IMS Lecture notes monograph series, pp.260-183, 2006.

G. Stoltz, Incomplete Information and Internal Regret in Prediction of Individual Sequences, p.81, 2005.
URL : https://hal.archives-ouvertes.fr/tel-00009759

G. Stoltz, Contributions to the sequential prediction of arbitrary sequences: applications to the theory of repeated games and empirical studies of the performance of the aggregation of experts. Habilitation à diriger des recherches, p.29, 2011.

A. L. Strehl, L. Li, E. Wiewiora, J. Langford, and M. L. Littman, PAC model-free reinforcement learning, Proceedings of the 23rd international conference on Machine learning, ICML '06, pp.881-888, 2006.
DOI : 10.1145/1143844.1143955

R. S. Sutton, Generalization in reinforcement learning: Successful examples using sparse coarse coding, Advances in Neural Information Processing Systems, pp.1038-1044, 1996.

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, p.227, 1998.

R. S. Sutton and S. D. Whitehead, Online learning with random representations, Proceedings of the 10th International Conference on Machine Learning, ICML '93, pp.314-321, 1993.

C. Szepesvári, Algorithms for Reinforcement Learning, Synthesis Lectures on Artificial Intelligence and Machine Learning, pp.1-103, 2010.
DOI : 10.2200/S00268ED1V01Y201005AIM009

M. Talagrand, The generic chaining: upper and lower bounds of stochastic processes. Springer monographs in mathematics, p.115, 2005.

T. Tao, V. Vu, and M. Krishnapur, Random matrices: Universality of ESDs and the circular law, The Annals of Probability, vol.38, issue.5, pp.2023-2065, 2010.
DOI : 10.1214/10-AOP534

G. Tesauro, Temporal difference learning and TD-Gammon, Communications of the ACM, vol.38, issue.3, pp.58-68, 1995.
DOI : 10.1145/203330.203343

A. Tewari and P. L. Bartlett, Optimistic linear programming gives logarithmic regret for irreducible MDPs, p.242, 2007.

W. R. Thompson, On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika, vol.25, pp.285-294, 1933.

W. R. Thompson, On the theory of apportionment, American Journal of Mathematics, vol.57, issue.6, pp.450-456, 1935.

R. Tibshirani, Regression shrinkage and selection via the Lasso, Journal of the Royal Statistical Society, Series B, vol.58, issue.1, pp.267-288, 1996.

A. N. Tikhonov, Solution of incorrectly formulated problems and the regularization method, Soviet Math Dokl, vol.4, pp.1035-1038, 1963.

A. B. Tsybakov, Optimal Rates of Aggregation, Proceedings of the 16th annual Conference On Learning Theory, pp.303-313, 2003.
DOI : 10.1007/978-3-540-45167-9_23
URL : https://hal.archives-ouvertes.fr/hal-00104867

S. A. van de Geer, The deterministic Lasso, Seminar für Statistik, Eidgenössische Technische Hochschule (ETH) Zürich, p.164, 2007.

S. A. van de Geer and P. Bühlmann, On the conditions used to prove oracle results for the Lasso, Electronic Journal of Statistics, vol.3, pp.1360-1392, 2009.

V. N. Vapnik and A. Vashist, A new learning paradigm: Learning using privileged information, Neural Networks, vol.22, issue.5-6, pp.544-557, 2009.

S. S. Vempala, The Random Projection Method, p.226, 2004.

Y. Wang, J. Audibert, and R. Munos, Algorithms for infinitely many-armed bandits, pp.1729-1736, 2008.

C. J. C. Watkins, Learning from Delayed Rewards, King's College, p.218, 1989.

J. Weston, C. Leslie, E. Ie, D. Zhou, A. Elisseeff et al., Semi-supervised protein classification using cluster kernels, Bioinformatics, vol.21, issue.15, pp.3241-3247, 2005.
DOI : 10.1093/bioinformatics/bti497

P. Whittle, Multi-armed bandits and the Gittins index, Journal of the Royal Statistical Society. Series B (Methodological), vol.42, issue.2, pp.143-149, 1980.

E. P. Wigner, On the Distribution of the Roots of Certain Symmetric Matrices, The Annals of Mathematics, vol.67, issue.2, pp.325-327, 1958.
DOI : 10.2307/1970008

R. J. Williams and L. C. Baird, Tight performance bounds on greedy policies based on imperfect value functions, Proceedings of the Tenth Yale Workshop on Adaptive and Learning Systems, p.208, 1994.

C. Zenger, Sparse grids, Parallel Algorithms for Partial Differential Equations, Proceedings of the Sixth GAMM-Seminar, p.150, 1990.

B. Zhao and C. Zhang, Compressed Spectral Clustering, 2009 IEEE International Conference on Data Mining Workshops, pp.344-349, 2009.
DOI : 10.1109/ICDMW.2009.22

P. Zhao and B. Yu, On model selection consistency of Lasso, Journal of Machine Learning Research, vol.7, pp.2541-2563, 2006.

D. Zhou, O. Bousquet, T. N. Lal, J. Weston, and B. Schölkopf, Learning with local and global consistency, Proceedings of the 17th conference on advances in Neural Information Processing Systems, NIPS '03, pp.321-328, 2003.

S. Zilles, S. Lange, R. Holte, and M. Zinkevich, Models of cooperative teaching and learning, Journal of Machine Learning Research, vol.12, pp.349-384, 2011.

M. Zinkevich, Online convex programming and generalized infinitesimal gradient ascent, Proceedings of the 20th International Conference on Machine Learning, ICML '03, pp.928-936, 2003.