
An ordinal generative model of Bayesian inference for human decision-making in continuous reward environments

Abstract : This thesis aims to understand how human behavior adapts to environments where rewards are continuous. Many studies have examined environments with binary rewards (win/lose) and have shown that human behavior can be accounted for by Bayesian inference algorithms. A Bayesian algorithm works in a continuous environment provided that it is based on a “generative” model of the environment, which is a structural assumption about environmental contingencies. The question we address is which kind of generative model of continuous rewards characterizes human decision-making. One hypothesis is that each action delivers rewards that are noisy samples of its true value, typically following a Gaussian distribution. We propose instead a generative model that relies on assumptions about the relationship between the values of the different available actions and on the existence of a reliable ordering of action values. This structural assumption makes it possible to mentally simulate counterfactual rewards and to learn simultaneously the reward distributions associated with all actions. To validate our model, we ran three behavioral experiments on healthy subjects in a setting where the actions’ reward distributions were continuous and changed over time. The proposed model correctly described participants’ behavior in all three tasks, while competing models, including the Gaussian model, failed. The proposed model extends the implementation of Bayesian algorithms and establishes which rewards are “good” and desirable according to the current context. It responds to evolutionary constraints by adapting quickly while performing correctly in many different settings.
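The Gaussian hypothesis mentioned in the abstract can be illustrated with a minimal sketch. It is not the model proposed in the thesis, and it is not taken from it: the names `GaussianBelief` and `update`, the known noise variance, and the prior values are all assumptions chosen for illustration. The sketch applies the standard conjugate normal update to the chosen action only, so beliefs about unchosen actions stay unchanged. This contrasts with the ordinal model, which is described as learning about all actions simultaneously.

```python
from dataclasses import dataclass


@dataclass
class GaussianBelief:
    mean: float      # current estimate of the action's true value
    variance: float  # uncertainty about that estimate


def update(belief: GaussianBelief, reward: float, noise_variance: float) -> GaussianBelief:
    """Conjugate normal update after one reward, assuming a known noise variance."""
    precision = 1.0 / belief.variance + 1.0 / noise_variance
    new_variance = 1.0 / precision
    new_mean = new_variance * (belief.mean / belief.variance + reward / noise_variance)
    return GaussianBelief(new_mean, new_variance)


# Hypothetical two-action setting with identical priors (illustrative values).
beliefs = {"A": GaussianBelief(0.0, 1.0), "B": GaussianBelief(0.0, 1.0)}

# Only the chosen action "A" is updated; "B" receives no information.
beliefs["A"] = update(beliefs["A"], reward=0.8, noise_variance=1.0)
```

With these illustrative values, the belief about action A moves halfway toward the observed reward and its variance halves, while the belief about action B is left at its prior.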
Contributor : Abes Star
Submitted on : Wednesday, October 17, 2018 - 11:19:45 AM
Last modification on : Thursday, December 10, 2020 - 12:38:33 PM
Long-term archiving on : Friday, January 18, 2019 - 1:42:11 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01897446, version 1


Gabriel Sulem. An ordinal generative model of Bayesian inference for human decision-making in continuous reward environments. Cognitive Sciences. Université Pierre et Marie Curie - Paris VI, 2017. English. ⟨NNT : 2017PA066556⟩. ⟨tel-01897446⟩


