Improved algorithms for linear stochastic bandits, Neural Information Processing Systems, pp.39-40, 2011. ,
Competing in the dark: an efficient algorithm for bandit linear optimization, Conference on Learning Theory, 2008. ,
Thompson sampling for contextual bandits with linear payoffs, International Conference on Machine Learning, pp.45-46, 2013. ,
Hannan Consistency in On-Line Learning in Case of Unbounded Losses Under Partial Monitoring, Algorithmic Learning Theory, p.89, 2006. ,
DOI : 10.1007/11894841_20
From bandits to experts: A tale of domination and independence, Neural Information Processing Systems, pp.64-99, 2013. ,
Online learning with feedback graphs: Beyond bandits, Conference on Learning Theory, pp.64-78, 2015. ,
Regret bounds and minimax policies under partial monitoring, Journal of Machine Learning Research, vol.89, p.69, 2010. ,
URL : https://hal.archives-ouvertes.fr/hal-00654356
Regret in Online Combinatorial Optimization, Mathematics of Operations Research, vol.39, issue.1, pp.115-116, 2014. ,
DOI : 10.1287/moor.2013.0598
Using confidence bounds for exploitation-exploration trade-offs, Journal of Machine Learning Research, vol.6, issue.51, p.47, 2002. ,
UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem, Periodica Mathematica Hungarica, vol.5, issue.1-2, 2010. ,
DOI : 10.1007/s10998-010-3055-6
Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.102, pp.10-72, 2002. ,
The non-stochastic multiarmed bandit problem, SIAM Journal on Computing, vol.68, issue.114, pp.26-71, 2002. ,
Adaptive and Self-Confident On-Line Learning Algorithms, Journal of Computer and System Sciences, vol.64, issue.1, p.77, 2002. ,
DOI : 10.1006/jcss.2001.1795
Adaptive routing with end-to-end feedback, Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, STOC '04, 2004. ,
DOI : 10.1145/1007352.1007367
Characterizing truthful multi-armed bandit mechanisms, SIAM Journal on Computing, pp.2014-2017 ,
DOI : 10.1137/120878768
URL : http://arxiv.org/pdf/0812.2291
Minimax regret of finite partial-monitoring games in stochastic environments, Conference on Learning Theory, 2011. ,
Partial monitoring - classification, regret bounds, and algorithms, Mathematics of Operations Research, pp.2014-100 ,
Regularization and semi-supervised learning on large graphs, Conference on Computational Learning Theory, 2004. ,
Manifold regularization: A geometric framework for learning from labeled and unlabeled examples, Journal of Machine Learning Research, vol.20, pp.14-16, 2006. ,
Contextual bandit algorithms with supervised learning guarantees, International Conference on Artificial Intelligence and Statistics, pp.2011-69 ,
A learning agent for wireless news access, Proceedings of the 5th international conference on Intelligent user interfaces, IUI '00, 2000. ,
DOI : 10.1145/325737.325768
Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Machine Learning, pp.2012-69 ,
DOI : 10.1561/2200000024
X-armed bandits, Journal of Machine Learning Research, pp.2011-2029 ,
URL : https://hal.archives-ouvertes.fr/hal-00450235
Towards minimax policies for online linear optimization with bandit feedback, Conference on Learning Theory, pp.2012-2018 ,
Stochastic bandits with side observations on networks, International Conference on Measurement and Modeling of Computer Systems, pp.2014-89 ,
Leveraging side observations in stochastic bandits, Conference on Uncertainty in Artificial Intelligence, pp.2012-89 ,
URL : https://hal.archives-ouvertes.fr/hal-01270324
Revealing graph bandits for maximizing local influence, International Conference on Artificial Intelligence and Statistics, pp.2016-89 ,
URL : https://hal.archives-ouvertes.fr/hal-01304020
Prediction, Learning, and Games, p.79, 2006. ,
DOI : 10.1017/CBO9780511546921
Combinatorial bandits, Journal of Computer and System Sciences, vol.78, issue.5, 2012. ,
DOI : 10.1016/j.jcss.2012.01.001
How to use expert advice, Journal of the ACM, 1997. ,
DOI : 10.1145/167088.167198
Minimizing regret with label efficient prediction, IEEE Transactions on Information Theory, p.89, 2005. ,
DOI : 10.1109/tit.2005.847729
URL : https://hal.archives-ouvertes.fr/hal-00007537
Online learning of noisy data with kernels, Conference on Learning Theory, 2010. ,
A gang of bandits, Neural Information Processing Systems, pp.2013-2031 ,
An empirical evaluation of Thompson sampling, Neural Information Processing Systems, pp.2011-2017 ,
Apolo, Proceedings of the 2011 annual conference on Human factors in computing systems, CHI '11, 2011. ,
DOI : 10.1145/1978942.1978967
Combinatorial multi-armed bandit: General framework and applications, International Conference on Machine Learning, pp.2013-114 ,
Contextual bandits with linear payoff functions, International Conference on Artificial Intelligence and Statistics, pp.2011-2017 ,
Online learning with feedback graphs without the graphs, International Conference on Machine Learning, p.65, 2016. ,
Unimodal bandits: Regret lower bounds and optimal algorithms, International Conference on Machine Learning, pp.2014-2032 ,
DOI : 10.1145/2745844.2745847
URL : https://hal.archives-ouvertes.fr/hal-01092662
Stochastic linear optimization under bandit feedback, Conference on Learning Theory, 2008. ,
Parallelizing exploration-exploitation tradeoffs in gaussian process bandit optimization, International Conference on Machine Learning, pp.2012-2045 ,
Prediction by random-walk perturbation, Conference on Learning Theory, pp.2013-100 ,
Networked bandits with disjoint linear payoffs, Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '14, pp.2014-2032 ,
DOI : 10.1145/2623330.2623672
A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, Journal of Computer and System Sciences, vol.55, issue.1, p.95, 1997. ,
DOI : 10.1006/jcss.1997.1504
Online clustering of bandits, International Conference on Machine Learning, pp.2014-2032 ,
Online Spectral Learning on a Graph with Bandit Feedback, 2014 IEEE International Conference on Data Mining, pp.2014-2033 ,
DOI : 10.1109/ICDM.2014.72
Sequential Prediction of Unbounded Stationary Time Series, IEEE Transactions on Information Theory, vol.53, issue.5, p.72, 2007. ,
DOI : 10.1109/TIT.2007.894660
The on-line shortest path problem under partial monitoring, Journal of Machine Learning Research, issue.2, 2007. ,
Cheap bandits, International Conference on Machine Learning, pp.2015-2034 ,
URL : https://hal.archives-ouvertes.fr/hal-01153540
Approximation to Bayes risk in repeated play, Contributions to the theory of games, p.116, 1957. ,
Matrix analysis, 1990. ,
Prediction with Expert Advice by Following the Perturbed Leader for General Weights, Algorithmic Learning Theory, p.119, 2004. ,
DOI : 10.1007/978-3-540-30215-5_22
A matrix factorization technique with trust propagation for recommendation in social networks, Proceedings of the fourth ACM conference on Recommender systems, RecSys '10, pp.2010-59 ,
DOI : 10.1145/1864708.1864736
Recommender systems: An introduction, 2010. ,
DOI : 10.1017/CBO9780511763113
Efficient algorithms for online decision problems, Journal of Computer and System Sciences, vol.119, issue.129, p.115, 2005. ,
Matrix completion from a few entries, IEEE International Symposium on Information Theory, p.57, 2009. ,
Multi-armed bandits in metric spaces, Proceedings of the fortieth annual ACM symposium on Theory of computing, STOC '08, p.18, 2008. ,
DOI : 10.1145/1374376.1374475
Efficient learning by implicit exploration in bandit problems with side observations, Neural Information Processing Systems, pp.64-69, 2014. ,
Spectral Thompson sampling, AAAI Conference on Artificial Intelligence, 2014. ,
Spectral bandits for smooth graph functions with applications in recommender systems, AAAI Workshop on Sequential Decision-Making with Big Data, 2014. ,
Online learning with noisy side observations, International Conference on Artificial Intelligence and Statistics, 2016. ,
Online learning with Erdős-Rényi side-observation graphs, Conference on Uncertainty in Artificial Intelligence, 2016. ,
Hedging structured concepts, Conference on Learning Theory, pp.7-115, 2010. ,
Distributed clustering of linear bandits in peer to peer networks, International Conference on Machine Learning, pp.2016-2034 ,
Combinatorial preconditioners and multilevel solvers for problems in computer vision and image processing, Computer Vision and Image Understanding, pp.2011-2044 ,
A contextual-bandit approach to personalized news article recommendation, Proceedings of the 19th international conference on World wide web, WWW '10, pp.16-28, 2010. ,
DOI : 10.1145/1772690.1772758
Online context-dependent clustering in recommendations based on exploration-exploitation algorithms, arXiv preprint, pp.2015-2033 ,
The weighted majority algorithm, Information and Computation, p.5, 1994. ,
DOI : 10.1016/b978-0-08-094829-4.50035-0
A tutorial on spectral clustering, Statistics and Computing, vol.21, issue.1, 2007. ,
DOI : 10.1017/CBO9780511810633
Active search and bandits on graphs using sigma-optimality, Conference on Uncertainty in Artificial Intelligence, pp.2015-2034 ,
From bandits to experts: On the value of side-observations ,
Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary, Conference on Learning Theory, 2004. ,
DOI : 10.1007/978-3-540-27819-1_8
Birds of a Feather: Homophily in Social Networks, Annual Review of Sociology, vol.27, issue.1, 2001. ,
DOI : 10.1146/annurev.soc.27.1.415
Signal processing techniques for interpolation in graph structured data, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.2013-2032 ,
DOI : 10.1109/ICASSP.2013.6638704
An Efficient Algorithm for Learning with Semi-bandit Feedback, Algorithmic Learning Theory, pp.91-129, 2013. ,
DOI : 10.1007/978-3-642-40935-6_17
Multi-armed bandit problems with dependent arms, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007. ,
DOI : 10.1145/1273496.1273587
URL : http://www.cs.cmu.edu/~spandey/publications/dependent-bandit.pdf
Some aspects of the sequential design of experiments. Bulletin of the, 1952. ,
Optimizing adaptive marketing experiments with the multi-armed bandit, pp.2013-2016 ,
Prediction with limited advice and multiarmed bandits with paid observations, International Conference on Machine Learning, pp.89-91, 2014. ,
Contextual bandits with similarity information, Conference on Learning Theory, p.18, 2009. ,
Gaussian process optimization in the bandit setting: No regret and experimental design, International Conference on Machine Learning, pp.2010-2028 ,
Path Kernels and Multiplicative Updates, Journal of Machine Learning Research, 2003. ,
DOI : 10.1007/3-540-45435-7_6
URL : http://www.cse.ucsc.edu/~manfred/pubs/J55.pdf
On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika, vol.6, issue.2, p.1, 1933. ,
Finite-time analysis of kernelised contextual bandits, Conference on Uncertainty in Artificial Intelligence, pp.2013-2031 ,
URL : https://hal.archives-ouvertes.fr/hal-00826946
Spectral bandits for smooth graph functions, International Conference on Machine Learning, p.21, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-00986818
Aggregating strategies, Proceedings of the third annual workshop on Computational learning theory, 1990. ,
DOI : 10.1016/B978-1-55860-146-8.50032-1
STAT 210B advanced mathematical statistics, Lecture notes, pp.2015-2064 ,
Online learning with Gaussian payoffs and side observations, Neural Information Processing Systems, p.100, 2015. ,
Unimodal bandits, International Conference on Machine Learning, pp.2011-2029 ,
Semi-supervised learning literature survey, 2008. ,