
Data-driven evaluation of Contextual Bandit algorithms and applications to Dynamic Recommendation

Olivier Nicol 1, 2 
1 SEQUEL - Sequential Learning
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe, LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal
Abstract : The context of this thesis is dynamic recommendation. Recommendation is the action, for an intelligent system, of supplying a user of an application with personalized content so as to enhance what is referred to as "user experience", e.g. recommending a product on a merchant website or an article on a blog. Recommendation is considered dynamic when the content to recommend or user tastes evolve rapidly, e.g. news recommendation. Many applications that are of interest to us generate a tremendous amount of data through their millions of online users. Nevertheless, using this data to evaluate a new recommendation technique, or even to compare two dynamic recommendation algorithms, is far from trivial. This is the problem we consider here. Some approaches have already been proposed. Nonetheless, they have not been studied thoroughly, either from a theoretical point of view (unquantified bias, loose convergence bounds...) or from an empirical one (experiments on private data only). In this work we start by filling many gaps in the theoretical analysis. Then we comment on the results of an experiment of unprecedented scale in this area: a public challenge we organized. This challenge, along with some complementary experiments, revealed an unexpected source of huge bias: time acceleration. The rest of this work tackles this issue. We show that a bootstrap-based approach makes it possible to significantly reduce this bias and, more importantly, to control it.

Cited literature [163 references]
Contributor : Preux Philippe
Submitted on : Monday, April 4, 2016 - 11:58:04 AM
Last modification on : Thursday, January 20, 2022 - 4:17:14 PM
Long-term archiving on : Monday, November 14, 2016 - 3:45:24 PM


  • HAL Id : tel-01297407, version 1


Olivier Nicol. Data-driven evaluation of Contextual Bandit algorithms and applications to Dynamic Recommendation. Machine Learning [stat.ML]. Université de Lille I, 2014. English. ⟨tel-01297407⟩


