Lexicographic refinements in possibilistic sequential decision-making models

Abstract : This work contributes to possibilistic decision theory, and more specifically to sequential decision-making under possibilistic uncertainty, at both the theoretical and practical levels. Although appealing for its ability to handle qualitative decision problems, possibilistic decision theory suffers from an important drawback: qualitative possibilistic utility criteria compare acts through min and max operators, which leads to a drowning effect. To overcome this lack of decision power, several refinements have been proposed in the literature. Lexicographic refinements are particularly appealing since they make it possible to benefit from the expected utility background while remaining "qualitative". However, these refinements are defined for non-sequential decision problems only. In this thesis, we present results on the extension of the lexicographic preference relations to sequential decision problems, in particular to possibilistic decision trees and Markov decision processes. This leads to new planning algorithms that are more "decisive" than their original possibilistic counterparts. We first present optimistic and pessimistic lexicographic preference relations between policies, with and without intermediate utilities, that refine the optimistic and pessimistic qualitative utilities respectively. We prove that these new criteria satisfy the principle of Pareto efficiency as well as the property of strict monotonicity. The latter guarantees that dynamic programming can be used to compute lexicographic optimal policies. Considering the problem of policy optimization in possibilistic decision trees and finite-horizon Markov decision processes, we provide adaptations of the dynamic programming algorithm that compute a lexicographic optimal policy in polynomial time. These algorithms are based on the lexicographic comparison of the matrices of trajectories associated with the sub-policies.
This algorithmic work is complemented by an experimental study that shows the feasibility and the interest of the proposed approach. We then prove that the lexicographic criteria still benefit from an expected utility grounding and can be represented by infinitesimal expected utilities. The last part of our work is devoted to policy optimization in (possibly infinite) stationary Markov decision processes. We propose a value iteration algorithm for the computation of lexicographic optimal policies, and we extend these results to the infinite-horizon case. Since the size of the matrices increases exponentially (which is especially problematic in the infinite-horizon case), we propose an approximation algorithm that keeps only the most interesting part of each matrix of trajectories, namely the first lines and columns. Finally, we report experimental results that show the effectiveness of the algorithms based on the cutting of the matrices.
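
As a hedged illustration of the drowning effect and of what a lexicographic refinement buys, the sketch below compares two acts under the standard pessimistic qualitative utility, the minimum over states of max(1 - π(s), u(s)), and under a leximin ordering of the same per-state values. The function names, the three-level scale {0, 0.5, 1} and the two example acts are assumptions made for illustration; this is a simplified one-shot leximin comparison, not the thesis's criteria over matrices of trajectories nor its algorithms.

```python
def pessimistic_utility(possibilities, utilities):
    """Pessimistic qualitative utility: min over states of max(1 - pi, u)."""
    return min(max(1 - p, u) for p, u in zip(possibilities, utilities))


def leximin_key(possibilities, utilities):
    """Per-state values max(1 - pi, u), sorted in increasing order.

    Comparing these sorted lists lexicographically yields a leximin
    ordering (illustrative simplification of a lexicographic refinement).
    """
    return sorted(max(1 - p, u) for p, u in zip(possibilities, utilities))


# Two equally possible states (hypothetical example data).
possibilities = [1.0, 1.0]
act_a = [0.5, 0.5]  # utility of act a in each state
act_b = [0.5, 1.0]  # act b is at least as good everywhere, better in state 2

# Drowning effect: the min operator only sees the worst value, so both acts tie.
tie = pessimistic_utility(possibilities, act_a) == pessimistic_utility(possibilities, act_b)

# Leximin breaks the tie by looking past the shared worst value.
b_strictly_preferred = leximin_key(possibilities, act_b) > leximin_key(possibilities, act_a)
```

Here act b Pareto-dominates act a, yet the min-based criterion cannot separate them; the leximin comparison does, which is the kind of added decisiveness the abstract refers to.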
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-01940908
Contributor : Abes Star
Submitted on : Friday, November 30, 2018 - 3:44:06 PM
Last modification on : Monday, April 29, 2019 - 4:54:41 PM
Long-term archiving on : Friday, March 1, 2019 - 2:55:56 PM

File

2017TOU30269b.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01940908, version 1

Citation

Zeineb El Khalfi. Lexicographic refinements in possibilistic sequential decision-making models. Artificial Intelligence [cs.AI]. Université Paul Sabatier - Toulouse III, 2017. English. ⟨NNT : 2017TOU30269⟩. ⟨tel-01940908⟩
