
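Dynamique d'apprentissage pour Monte Carlo Tree Search : applications aux jeux de Go et du Clobber solitaire impartial [Learning dynamics for Monte Carlo Tree Search: applications to the games of Go and Impartial Solitaire Clobber]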

Abstract: Monte Carlo Tree Search (MCTS) was initially introduced for the game of Go, but it has since been applied successfully to other games and has opened the way to a range of new methods such as Multiple-MCTS and Nested Monte Carlo. MCTS evaluates game states through thousands of random simulations. As the simulations are carried out, the program guides the search towards the most promising moves. Through this dynamic, MCTS achieves impressive results without requiring extensive prior knowledge. In this thesis, we treat MCTS as a full learning system. Each random simulation thus becomes a simulated experience, and its outcome corresponds to the reinforcement observed. From this perspective, the learning of the system results from the complex interaction of two processes: the incremental acquisition of new representations and their exploitation in subsequent simulations. We propose two different approaches to enhance these two processes. The first approach gathers complementary representations in order to improve the relevance of the simulations. The second approach focuses the search on local sub-goals in order to improve the quality of the representations acquired. The methods presented in this work have been applied to the games of Go and Impartial Solitaire Clobber. The results of our experiments highlight the significance of these processes in the learning dynamic and open new perspectives for further enhancing learning systems such as MCTS.
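To make the simulation-driven dynamic described in the abstract concrete, the following is a minimal sketch, not the method developed in the thesis. It assumes a toy game that does not appear in the source (a Nim variant in which players alternately take 1 to 3 stones and whoever takes the last stone wins) and uses flat Monte Carlo with the UCB1 rule at the root only, rather than a full search tree. All function names and the exploration constant `c` are illustrative choices.

```python
import math
import random


def legal_moves(stones):
    """Moves in the assumed toy game: take 1, 2 or 3 stones."""
    return [k for k in (1, 2, 3) if k <= stones]


def random_playout(stones, rng):
    """Play uniformly random moves until the pile is empty.

    Returns True if the player to move at `stones` takes the last stone.
    """
    mover_is_first = True
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return mover_is_first
        mover_is_first = not mover_is_first
    return False  # empty pile at the start: the player to move has already lost


def ucb_search(stones, n_simulations, rng, c=1.4):
    """Flat Monte Carlo at the root, with UCB1 choosing which move to simulate."""
    moves = legal_moves(stones)
    visits = {m: 0 for m in moves}
    wins = {m: 0.0 for m in moves}
    for t in range(1, n_simulations + 1):
        untried = [m for m in moves if visits[m] == 0]
        if untried:
            move = untried[0]  # try every move once before applying UCB1
        else:
            # Mean reward plus an exploration bonus that shrinks with visits.
            move = max(
                moves,
                key=lambda m: wins[m] / visits[m]
                + c * math.sqrt(math.log(t) / visits[m]),
            )
        # After our move the opponent is to move; we win if the opponent loses.
        opponent_wins = random_playout(stones - move, rng)
        visits[move] += 1
        wins[move] += 0.0 if opponent_wins else 1.0
    best = max(moves, key=lambda m: visits[m])  # most simulated move
    return best, visits


if __name__ == "__main__":
    best, visits = ucb_search(5, 2000, random.Random(0))
    print("chosen move:", best, "visit counts:", visits)
```

The visit counts show how simulations concentrate on the moves whose playouts have gone well so far, which is the guiding effect the abstract refers to. A full MCTS applies the same selection rule recursively inside a growing tree.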

Cited literature: 74 references
Contributor: Abes Star
Submitted on: Wednesday, December 2, 2015 - 9:37:07 AM
Last modification on: Tuesday, June 1, 2021 - 2:08:10 PM
Long-term archiving on: Saturday, April 29, 2017 - 12:53:21 AM


Version validated by the jury (STAR)


  • HAL Id : tel-01234642, version 1


André Fabbri. Dynamique d'apprentissage pour Monte Carlo Tree Search : applications aux jeux de Go et du Clobber solitaire impartial. Artificial intelligence [cs.AI]. Université Claude Bernard - Lyon I, 2015. French. ⟨NNT : 2015LYO10183⟩. ⟨tel-01234642⟩


