Data-driven evaluation of Contextual Bandit algorithms and applications to Dynamic Recommendation

Olivier Nicol 1, 2
1 SEQUEL - Sequential Learning
LIFL - Laboratoire d'Informatique Fondamentale de Lille, LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal, Inria Lille - Nord Europe
Abstract : The context of this thesis is dynamic recommendation. Recommendation is the action, for an intelligent system, of supplying a user of an application with personalized content so as to enhance what is referred to as "user experience", e.g. recommending a product on a merchant website or an article on a blog. Recommendation is considered dynamic when the content to recommend or user tastes evolve rapidly, e.g. news recommendation. Many applications of interest to us generate a tremendous amount of data through their millions of online users. Nevertheless, using this data to evaluate a new recommendation technique, or even to compare two dynamic recommendation algorithms, is far from trivial. This is the problem we consider here. Some approaches have already been proposed. Nonetheless, they have not been studied thoroughly, either from a theoretical point of view (unquantified bias, loose convergence bounds...) or from an empirical one (experiments on private data only). In this work we start by filling many gaps in the theoretical analysis. Then we comment on the result of an experiment of unprecedented scale in this area: a public challenge we organized. This challenge, along with some complementary experiments, revealed an unexpected source of huge bias: time acceleration. The rest of this work tackles this issue. We show that a bootstrap-based approach significantly reduces this bias and, more importantly, makes it possible to control it.
Document type :
Theses
Cited literature : 163 references

https://tel.archives-ouvertes.fr/tel-01297407
Contributor : Preux Philippe
Submitted on : Monday, April 4, 2016 - 11:58:04 AM
Last modification on : Thursday, February 21, 2019 - 10:52:49 AM
Long-term archiving on : Monday, November 14, 2016 - 3:45:24 PM

Identifiers

  • HAL Id : tel-01297407, version 1

Citation

Olivier Nicol. Data-driven evaluation of Contextual Bandit algorithms and applications to Dynamic Recommendation. Machine Learning [stat.ML]. Université de Lille I, 2014. English. ⟨tel-01297407⟩
