Online trajectory generation for omnidirectional biped walking, Proceedings of the 2006 IEEE International Conference on Robotics and Automation (ICRA 2006), 2006.
DOI : 10.1109/ROBOT.2006.1641935
Adaptive-resolution reinforcement learning with polynomial exploration in deterministic domains, Machine Learning, pp.359-397, 1999.
DOI : 10.1115/1.3426922
Lambda-Policy Iteration: A Review and a New Implementation, Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, 2013.
DOI : 10.1109/TAC.2009.2022097
URL : http://arxiv.org/pdf/1507.01029
A survey on multi-output regression, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol.5, issue.5, 2015.
DOI : 10.1002/widm.1157
URL : http://oa.upm.es/40804/1/INVE_MEM_2015_204213.pdf
Classification and Regression Trees, 1984.
Bagging predictors, Machine Learning, pp.123-140, 1996.
DOI : 10.1007/BF00058655
A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning, 2010.
A tutorial on the cross-entropy method, Annals of Operations Research, vol.134, issue.1, 2005.
PILCO: A Model-Based and Data-Efficient Approach to Policy Search, ICML, pp.465-472, 2011.
Tree-Based Batch Mode Reinforcement Learning, Journal of Machine Learning Research, vol.6, issue.1, pp.503-556, 2005.
Extremely randomized trees, Machine Learning, vol.63, issue.1, pp.3-42, 2006.
DOI : 10.1007/s10994-006-6226-1
URL : https://hal.archives-ouvertes.fr/hal-00341932
A dynamic allocation index for the discounted multiarmed bandit problem, Biometrika, vol.66, issue.3, pp.561-565, 1979.
DOI : 10.1093/biomet/66.3.561
Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES), Evolutionary Computation, 2003.
Completely Derandomized Self-Adaptation in Evolution Strategies, Evolutionary Computation, vol.9, issue.2, pp.159-195, 2001.
DOI : 10.1162/106365601750190398
URL : http://www.mitpressjournals.org/userimages/ContentEditor/1164817256746/lib_rec_form.pdf
Uncertainty handling CMA-ES for reinforcement learning, Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation, GECCO '09, pp.1211-1221, 2009.
DOI : 10.1145/1569901.1570064
Online Reinforcement Learning for Real-Time Exploration in Continuous State and Action Markov Decision Processes, PlanRob2016, Proceedings of the 4th Workshop on Planning and Robotics at ICAPS2016. AAAI, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01416179
An Operational Method Toward Efficient Walk Control Policies for Humanoid Robots, ICAPS 2017, 2017.
Portfolio Allocation for Bayesian Optimization, Conference on Uncertainty in Artificial Intelligence, pp.327-336, 2011.
Dynamical Movement Primitives: Learning Attractor Models for Motor Behaviors, Neural Computation, vol.25, issue.2, pp.328-373, 2013.
DOI : 10.1162/NECO_a_00393
URL : https://infoscience.epfl.ch/record/185437/files/neco_a_00393.pdf
Optimization by Simulated Annealing, Science, vol.220, issue.4598, 1983.
DOI : 10.1126/science.220.4598.671
URL : http://www.cs.virginia.edu/cs432/documents/sa-1983.pdf
Policy search for motor primitives in robotics, pp.849-856, 2008.
Reinforcement Learning in Robotics: A Survey, Reinforcement Learning: State-of-the-Art, pp.579-610, 2012.
The effect of representation and knowledge on goal-directed exploration with reinforcement-learning algorithms, Machine Learning, pp.227-250, 1996.
A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection, International Joint Conference on Artificial Intelligence (IJCAI), pp.1-7, 1995.
Classification and Regression Tree Analysis in Public Health: Methodological Review and Comparison With Logistic Regression, Annals of Behavioral Medicine, vol.26, issue.3, pp.172-181, 2003.
DOI : 10.1207/s15324796abm2603_02
Online exploration in least-squares policy iteration, The 8th International Conference on Autonomous Agents and Multiagent Systems, pp.733-739, 2009.
Classification and regression trees, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol.1, issue.1, 2011.
DOI : 10.1002/widm.8
Reinforcement learning with selective perception and hidden state, 1996.
Passive Dynamic Walking, The International Journal of Robotics Research, vol.9, issue.2, pp.62-82, 1990.
DOI : 10.1177/027836499000900206
A Heuristic Search Approach to Planning, J. Artif. Int. Res., vol.34, pp.27-59, 2009.
Learning of a ball-in-a-cup playing robot, 19th International Workshop on Robotics in Alpe-Adria-Danube Region (RAAD 2010), 2010.
DOI : 10.1109/RAAD.2010.5524570
PEGASUS: A Policy Search Method for Large MDPs and POMDPs, Conference on Uncertainty in Artificial Intelligence, 2000.
Inverted autonomous helicopter flight via reinforcement learning, International Symposium on Experimental Robotics, 2004.
Multi-resolution Exploration in Continuous Spaces, Advances in Neural Information Processing Systems, pp.1209-1216, 2009.
Binary action search for learning continuous-action control policies, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pp.793-800, 2009.
DOI : 10.1145/1553374.1553476
URL : http://www.cs.mcgill.ca/~icml2009/papers/532.pdf
Gaussian Processes for Machine Learning, 2006.
Online self-calibration for mobile robots, Proceedings 1999 IEEE International Conference on Robotics and Automation (Cat. No.99CH36288C), pp.2292-2297, 1999.
DOI : 10.1109/ROBOT.1999.770447
The Cross-Entropy Method for Combinatorial and Continuous Optimization, Methodology and Computing in Applied Probability, vol.1, issue.2, pp.127-190, 1999.
Symbolic Dynamic Programming for Discrete and Continuous State MDPs, Proceedings of the 26th Conference on Artificial Intelligence, 2012.
Performance bounds for λ policy iteration and application to the game of Tetris, Journal of Machine Learning Research, vol.14, issue.1, pp.1181-1227, 2013.
URL : https://hal.archives-ouvertes.fr/inria-00185271
An approach to fuzzy control of nonlinear systems: stability and design issues, IEEE Transactions on Fuzzy Systems, vol.4, issue.1, 1996.
DOI : 10.1109/91.481841
Open-Loop Planning in Large-Scale Stochastic Domains, 27th AAAI Conference on Artificial Intelligence, pp.1436-1442, 2013.
Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions, Proceedings of the Eighth Yale Workshop on Adaptive and Learning Systems, pp.108-113, 1994.