Y. Abbasi-Yadkori, D. Pál, and C. Szepesvári, Improved algorithms for linear stochastic bandits, Advances in Neural Information Processing Systems, p.268, 2011.

Y. Abbasi-Yadkori, D. Pál, and C. Szepesvári, Online-to-confidence-set conversions and application to sparse stochastic bandits, Artificial Intelligence and Statistics, p.269, 2012.

J. Abernethy, E. Hazan, and A. Rakhlin, Competing in the dark: An efficient algorithm for bandit linear optimization, Proceedings of the 21st Annual Conference on Learning Theory (COLT), p.279, 2008.

A. Antos, V. Grover, and C. Szepesvári, Active Learning in Multi-armed Bandits, Algorithmic Learning Theory, pp.287-302, 2008.
DOI : 10.1007/978-3-540-87987-9_25

A. Antos, V. Grover, and C. Szepesvári, Active learning in heteroscedastic noise, Theoretical Computer Science, vol.411, issue.29-30, pp.2712-2728, 2010.
DOI : 10.1016/j.tcs.2010.04.007

B. Arouna, Adaptative Monte Carlo method, a variance reduction technique, Monte Carlo Methods and Applications, pp.1-24, 2004.

M. S. Asif and J. Romberg, On the LASSO and Dantzig selector equivalence, 2010 44th Annual Conference on Information Sciences and Systems (CISS), pp.1-6, 2010.
DOI : 10.1109/CISS.2010.5464890

J. Y. Audibert, R. Munos, and C. Szepesvári, Exploration-exploitation tradeoff using variance estimates in multi-armed bandits, Theoretical Computer Science, vol.410, issue.19, pp.1876-1902, 2009.
DOI : 10.1016/j.tcs.2009.01.016

URL : https://hal.archives-ouvertes.fr/hal-00711069

J. Audibert, S. Bubeck, and R. Munos, Best arm identification in multi-armed bandits, Proceedings of the Twenty-Third Annual Conference on Learning Theory (COLT'10), pp.41-53, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00654404

J. Y. Audibert and S. Bubeck, Minimax policies for bandits games, COLT, vol.27, p.88, 2009.

J. Y. Audibert, S. Bubeck, and G. Lugosi, Minimax policies for combinatorial prediction games
URL : https://hal.archives-ouvertes.fr/hal-00624463

J. Y. Audibert, S. Bubeck, and G. Lugosi, Regret in online combinatorial optimization, arXiv preprint arXiv:1204.4710, 2012.
DOI : 10.1287/moor.2013.0598

URL : http://arxiv.org/abs/1204.4710

P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2003.
DOI : 10.1137/S0097539701398375

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.130.158

B. Awerbuch and R. D. Kleinberg, Adaptive routing with end-to-end feedback, Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, STOC '04, pp.45-53, 2004.
DOI : 10.1145/1007352.1007367

R. Baraniuk, M. Davenport, R. DeVore, and M. Wakin, A Simple Proof of the Restricted Isometry Property for Random Matrices, Constructive Approximation, vol.28, issue.3, pp.253-263, 2008.
DOI : 10.1007/s00365-007-9003-x

P. L. Bartlett, V. Dani, T. Hayes, S. M. Kakade, A. Rakhlin et al., High-probability regret bounds for bandit online linear optimization, Proceedings of the 21st Annual Conference on Learning Theory, pp.335-342, 2008.

G. Bennett, Probability Inequalities for the Sum of Independent Random Variables, Journal of the American Statistical Association, vol.57, issue.297, pp.33-45, 1962.
DOI : 10.1214/aoms/1177730437

D. P. Bertsekas, Nonlinear programming, p.273, 1999.

P. Bickel, Y. Ritov, and A. Tsybakov, Simultaneous analysis of Lasso and Dantzig selector, The Annals of Statistics, vol.37, issue.4, pp.1705-1732, 2009.
DOI : 10.1214/08-AOS620

URL : https://hal.archives-ouvertes.fr/hal-00401585

T. Blumensath and M. E. Davies, Iterative hard thresholding for compressed sensing, Applied and Computational Harmonic Analysis, vol.27, issue.3, pp.265-274, 2009.
DOI : 10.1016/j.acha.2009.04.002

URL : http://doi.org/10.1016/j.acha.2009.04.002

P. Brémaud, An Introduction to Probabilistic Modeling, p.69, 1988.
DOI : 10.1007/978-1-4612-1046-7

S. Bubeck, Jeux de bandits et fondations du clustering, p.24, 2010.
URL : https://hal.archives-ouvertes.fr/tel-00845565

S. Bubeck, R. Munos, and G. Stoltz, Pure Exploration in Multi-armed Bandits Problems, Algorithmic Learning Theory, pp.23-37, 2009.

S. Bubeck, R. Munos, and G. Stoltz, Pure exploration in finitely-armed and continuous-armed bandits, Theoretical Computer Science, vol.412, issue.19, pp.1832-1852, 2011.
DOI : 10.1016/j.tcs.2010.12.059

URL : https://hal.archives-ouvertes.fr/hal-00609550

V. Buldygin and Y. V. Kozachenko, Sub-Gaussian random variables, Ukrainian Mathematical Journal, vol.32, issue.6, pp.483-489, 1980.
DOI : 10.1007/BF01087176

F. Bunea, A. Tsybakov, and M. H. Wegkamp, Sparsity oracle inequalities for the Lasso, Electronic Journal of Statistics, vol.1, issue.0, pp.169-194, 2007.
DOI : 10.1214/07-EJS008

URL : https://hal.archives-ouvertes.fr/hal-00160646

A. N. Burnetas and M. N. Katehakis, Optimal Adaptive Policies for Sequential Allocation Problems, Advances in Applied Mathematics, vol.17, issue.2, pp.122-142, 1996.
DOI : 10.1006/aama.1996.0007

URL : http://doi.org/10.1006/aama.1996.0007

E. Candès and J. Romberg, Sparsity and incoherence in compressive sampling, Inverse Problems, vol.23, issue.3, pp.969-985, 2007.
DOI : 10.1088/0266-5611/23/3/008

E. Candès and T. Tao, The Dantzig selector: Statistical estimation when p is much larger than n, The Annals of Statistics, vol.35, issue.6, pp.2313-2351, 2007.
DOI : 10.1214/009053606000001523

E. J. Candès, J. Romberg, and T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Transactions on Information Theory, vol.52, issue.2, pp.489-509, 2006.
DOI : 10.1109/TIT.2005.862083

E. J. Candès, J. K. Romberg, and T. Tao, Stable signal recovery from incomplete and inaccurate measurements, Communications on Pure and Applied Mathematics, vol.59, issue.8, pp.1207-1223, 2006.
DOI : 10.1002/cpa.20124

A. Carpentier and R. Munos, Finite-time analysis of stratified sampling for Monte Carlo, Neural Information Processing Systems (NIPS), 2011.
URL : https://hal.archives-ouvertes.fr/inria-00636924

A. Carpentier and R. Munos, Bandit theory meets compressed sensing for high dimensional linear bandit, Artificial Intelligence and Statistics, to appear, p.267, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00659731

A. Carpentier and R. Munos, Minimax number of strata for online stratified sampling given noisy samples, arXiv preprint arXiv:1205, p.133, 2012.

A. Carpentier, A. Lazaric, M. Ghavamzadeh, R. Munos, and P. Auer, Upper-Confidence-Bound Algorithms for Active Learning in Multi-armed Bandits, Algorithmic Learning Theory, pp.189-203, 2011.
DOI : 10.1007/978-3-642-24412-4_17

URL : https://hal.archives-ouvertes.fr/hal-00659696

A. Carpentier, O. A. Maillard, and R. Munos, Sparse recovery with Brownian sensing, Neural Information Processing Systems, p.241, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00943122

R. Castro, R. Willett, and R. Nowak, Faster rates in regression via active learning, Proceedings of Neural Information Processing Systems (NIPS), pp.179-186, 2005.

N. Cesa-Bianchi and G. Lugosi, Combinatorial bandits, Journal of Computer and System Sciences, p.280.

P. Chaudhuri and P. A. Mykland, On efficient designing of nonlinear experiments, Statistica Sinica, vol.5, pp.421-440, 1995.

S. S. Chen, D. L. Donoho, and M. A. Saunders, Atomic Decomposition by Basis Pursuit, SIAM Journal on Scientific Computing, vol.20, issue.1, pp.33-61, 1999.
DOI : 10.1137/S1064827596304010

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.113.7694

A. Cohen, W. Dahmen, and R. DeVore, Compressed sensing and best $k$-term approximation, Journal of the American Mathematical Society, vol.22, issue.1, pp.211-231, 2009.
DOI : 10.1090/S0894-0347-08-00610-3

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.148.5477

D. A. Cohn, Z. Ghahramani, and M. I. Jordan, Active learning with statistical models, J. Artif. Int. Res, vol.4, pp.129-145, 1996.

V. Dani, T. P. Hayes, and S. M. Kakade, Stochastic linear optimization under bandit feedback, Proceedings of the 21st Annual Conference on Learning Theory (COLT), p.283, 2008.

R. A. Devore, Deterministic constructions of compressed sensing matrices, Journal of Complexity, vol.23, issue.4-6, pp.918-925, 2007.
DOI : 10.1016/j.jco.2007.04.002

D. L. Donoho, Compressed sensing, IEEE Transactions on Information Theory, vol.52, issue.4, pp.1289-1306, 2006.
DOI : 10.1109/TIT.2006.871582

URL : https://hal.archives-ouvertes.fr/inria-00369486

D. L. Donoho and P. B. Stark, Uncertainty Principles and Signal Recovery, SIAM Journal on Applied Mathematics, vol.49, issue.3, pp.906-931, 1989.
DOI : 10.1137/0149053

M. Elad and A. M. Bruckstein, A generalized uncertainty principle and sparse representation in pairs of bases, IEEE Transactions on Information Theory, vol.48, issue.9, pp.2558-2567, 2002.

P. Etoré and B. Jourdain, Adaptive Optimal Allocation in Stratified Sampling Methods, Methodology and Computing in Applied Probability, vol.9, issue.2, pp.335-360, 2010.
DOI : 10.1007/s11009-008-9108-0

E. Even-dar, S. Mannor, and Y. Mansour, Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems, The Journal of Machine Learning Research, vol.7, pp.1079-1105, 2006.

V. V. Fedorov, Theory of Optimal Experiments, pp.31-38, 1972.

S. Filippi, O. Cappé, A. Garivier, and C. Szepesvári, Parametric bandits: The generalized linear case, Advances in Neural Information Processing Systems, 2010.

A. D. Flaxman, A. T. Kalai, and H. B. McMahan, Online convex optimization in the bandit setting: gradient descent without a gradient, Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms, pp.385-394, 2005.

M. Fornasier and H. Rauhut, Compressive Sensing, Handbook of Mathematical Methods in Imaging, p.257
DOI : 10.1007/978-3-642-27795-5_6-5

S. Foucart and M. J. Lai, Sparsest solutions of underdetermined linear systems via $\ell_q$-minimization for $0 < q \leq 1$, Applied and Computational Harmonic Analysis, vol.26, issue.3, pp.395-407, 2009.
DOI : 10.1016/j.acha.2008.09.001

J. Fruitet, A. Carpentier, R. Munos, and M. Clerc, Automatic motor task selection via a bandit algorithm for a brain-controlled button, Journal of Neural Engineering, vol.10, issue.1, p.20, 2011.
DOI : 10.1088/1741-2560/10/1/016012

URL : https://hal.archives-ouvertes.fr/inria-00624686

A. Garivier and O. Cappé, The KL-UCB algorithm for bounded stochastic bandits and beyond

A. Garivier and E. Moulines, On Upper-Confidence Bound Policies for Switching Bandit Problems, Algorithmic Learning Theory, pp.174-188, 2011.
DOI : 10.1007/978-3-642-24412-4_16

E. Giné and R. Nickl, Confidence bands in density estimation, The Annals of Statistics, pp.1122-1170, 2010.

P. Glasserman, Monte Carlo methods in financial engineering, p.152, 2004.
DOI : 10.1007/978-0-387-21617-1

P. Glasserman, P. Heidelberger, and P. Shahabuddin, Asymptotically Optimal Importance Sampling and Stratification for Pricing Path-Dependent Options, Mathematical Finance, vol.9, issue.2, pp.117-152, 1999.
DOI : 10.1111/1467-9965.00065

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.6930

V. Grover, Active learning and its application to heteroscedastic problems, pp.91-114, 2009.

W. Hoeffding, Probability inequalities for sums of bounded random variables, Journal of the American Statistical Association, pp.13-30, 1963.

M. Hoffmann and O. Lepski, Random rates in anisotropic regression, Annals of statistics, pp.325-358, 2002.

J. Honda and A. Takemura, An asymptotically optimal bandit algorithm for bounded support models, Proceedings of the Twenty-Third Annual Conference on Learning Theory (COLT), 2010.

G. M. James, P. Radchenko, and J. Lv, DASSO: connections between the Dantzig selector and lasso, Journal of the Royal Statistical Society: Series B, pp.127-142, 2009.

R. Kawai, Asymptotically optimal allocation of stratified sampling with adaptive variance reduction by strata, ACM Transactions on Modeling and Computer Simulation, vol.20, issue.2, pp.1-17, 2010.
DOI : 10.1145/1734222.1734225

V. Koltchinskii, The Dantzig selector and sparsity oracle inequalities, Bernoulli, vol.15, issue.3, pp.799-828, 2009.
DOI : 10.3150/09-BEJ187

URL : http://arxiv.org/abs/0909.0861

W. M. Koolen, M. K. Warmuth, and J. Kivinen, Hedging structured concepts, Proceedings of the 23rd Annual Conference on Learning Theory (COLT), Omnipress, p.280, 2010.

T. L. Lai and H. Robbins, Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985.
DOI : 10.1016/0196-8858(85)90002-8

URL : http://doi.org/10.1016/0196-8858(85)90002-8

N. Littlestone and M. K. Warmuth, The weighted majority algorithm, 30th Annual Symposium on Foundations of Computer Science, pp.256-261, 1989.

B. F. Logan, Properties of high-pass signals, p.235, 1965.

O. A. Maillard, R. Munos, and G. Stoltz, A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences, arXiv preprint arXiv:1105, p.27, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00574987

O. Maron and A. W. Moore, Hoeffding races: Accelerating model selection search for classification and function approximation, Robotics Institute, pp.263-291, 1993.

A. Maurer and M. Pontil, Empirical Bernstein bounds and sample-variance penalization, Proceedings of the Twenty-Second Annual Conference on Learning Theory, pp.115-124, 2009.

R. Munos, Optimistic optimization of deterministic functions without the knowledge of its smoothness, Neural Information Processing Systems, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00830143

H. Niederreiter, Quasi-Monte Carlo methods and pseudo-random numbers, Bulletin of the American Mathematical Society, vol.84, issue.6, pp.957-1041, 1978.
DOI : 10.1090/S0002-9904-1978-14532-7

J. Nino-mora, Restless bandits, partial conservation laws and indexability, Advances in Applied Probability, vol.19, issue.01, pp.76-98, 2001.
DOI : 10.1287/moor.21.2.257

C. R. Rao and H. Toutenburg, Linear models: least squares and alternatives, p.230, 1999.

H. Rauhut, Compressive Sensing and Structured Random Matrices. Theoretical Foundations and Numerical Methods for Sparse Recovery, p.254, 2010.

H. Rauhut and R. Ward, Sparse Legendre expansions via $\ell_1$-minimization, arXiv preprint, p.245, 2010.

S. I. Resnick, A probability path, Birkhäuser, p.100, 1999.

H. Robbins, Some aspects of the sequential design of experiments, Bulletin of the American Mathematical Society, vol.58, issue.5, pp.527-535, 1952.
DOI : 10.1090/S0002-9904-1952-09620-8

R. Y. Rubinstein and D. P. Kroese, Simulation and the Monte Carlo method, pp.78-124, 2008.

M. Rudelson and R. Vershynin, On sparse reconstruction from Fourier and Gaussian measurements, Communications on Pure and Applied Mathematics, vol.61, issue.8, pp.1025-1045, 2008.
DOI : 10.1002/cpa.20227

P. Rusmevichientong and J. N. Tsitsiklis, Linearly parameterized bandits, arXiv preprint arXiv:0812, p.280, 2008.

L. Shepp and B. Logan, The Fourier reconstruction of a head section, IEEE Transactions on Nuclear Science, vol.21, issue.3, pp.21-43, 1974.
DOI : 10.1109/TNS.1974.6499235

A. Slivkins and E. Upfal, Adapting to a changing environment: The Brownian restless bandits

P. Stevenhagen and H. W. Lenstra, Chebotarëv and his density theorem, The Mathematical Intelligencer, pp.26-37, 1996.
DOI : 10.1007/bf03027290

S. Bubeck, R. Munos, G. Stoltz, and C. Szepesvári, X-armed bandits, Journal of Machine Learning Research, vol.12, pp.1655-1695, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00450235

T. Tao, An uncertainty principle for cyclic groups of prime order, arXiv preprint math/0308286, p.234, 2003.

W. R. Thompson, On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika, vol.25, issue.3-4, pp.285-294, 1933.
DOI : 10.1093/biomet/25.3-4.285

R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society. Series B (Methodological), vol.58, issue.1, pp.267-288, 1996.
DOI : 10.1111/j.1467-9868.2011.00771.x

S. A. van de Geer, The deterministic lasso, Seminar für Statistik, Eidgenössische Technische Hochschule (ETH) Zürich, 2007.

S. A. van de Geer and P. Bühlmann, On the conditions used to prove oracle results for the lasso, Electronic Journal of Statistics, vol.3, pp.1360-1392, 2009.

P. Whittle, Restless bandits: Activity allocation in a changing world, Journal of applied probability, pp.287-298, 1988.

T. Zhang, Some sharp performance bounds for least squares regression with l1 regularization, The Annals of Statistics, pp.2109-2144, 2009.

P. Zhao and B. Yu, On model selection consistency of Lasso, The Journal of Machine Learning Research, vol.7, pp.2541-2563, 2006.