Reinforcement learning using neural networks, with applications to motor control (Apprentissage par renforcement utilisant des réseaux de neurones, avec des applications au contrôle moteur)

Abstract: This thesis studies practical methods for estimating value functions with feedforward neural networks in model-based reinforcement learning. It focuses on problems in continuous time and space, such as motor-control tasks. The continuous TD(lambda) algorithm is refined to handle situations with discontinuous states and controls, and the vario-eta algorithm is proposed as a simple but efficient method to perform gradient descent. The main contributions are experimental successes that clearly indicate the potential of feedforward neural networks to estimate high-dimensional value functions. Linear function approximators have often been preferred in reinforcement learning, but successful value function estimation in previous work has been restricted to mechanical systems with very few degrees of freedom. The method presented in this thesis was tested successfully on an original task in which a simulated articulated robot learns to swim, with 4 control variables and 12 independent state variables, which is significantly more complex than the problems solved with linear function approximators so far.

https://tel.archives-ouvertes.fr/tel-00004386
Contributor: Thèses Imag
Submitted on: Thursday, January 29, 2004 - 5:27:08 PM
Last modification on: Friday, November 6, 2020 - 4:12:51 AM
Long-term archiving on: Friday, April 2, 2010 - 8:11:34 PM

Identifiers

  • HAL Id: tel-00004386, version 1

Citation

Rémi Coulom. Apprentissage par renforcement utilisant des réseaux de neurones, avec des applications au contrôle moteur. Human-Computer Interaction [cs.HC]. Institut National Polytechnique de Grenoble - INPG, 2002. French. ⟨tel-00004386⟩
