Using inaccurate models in reinforcement learning, Proceedings of the International Conference on Machine Learning (ICML), 2006. ,
Augmenting Physical Simulators with Stochastic Neural Networks: Case Study of Planar Pushing and Bouncing, Proceedings of the International Conference on Intelligent Robots and Systems (IROS), 2018. ,
Optimal filtering, vol.21, pp.22-95, 1979. ,
Hindsight experience replay, Advances in Neural Information Processing Systems (NIPS), 2017. ,
Sample efficient optimization for learning controllers for bipedal locomotion, Proceedings of the International Conference on Humanoid Robots (Humanoids), 2016. ,
Deep Kernels for Optimizing Locomotion Controllers, Proceedings of Conference on Robot Learning (CoRL), 2017. ,
Data-efficient learning of feedback policies from image pixels using deep dynamical models, NIPS Deep Reinforcement Learning Workshop, 2015. ,
Introduction to stochastic control theory, Courier Corporation, 2012. ,
No falls, no resets: Reliable humanoid behavior in the DARPA robotics challenge, Proceedings of the International Conference on Humanoid Robots (Humanoids), 2015. ,
What happened at the DARPA robotics challenge, and why?, DRC Finals Special Issue of the Journal of Field Robotics, 2016. ,
A restart cma evolution strategy with increasing population size, Proceedings of IEEE Congress on Evolutionary Computation, 2005. ,
Sequential parameter optimization, Proceedings of IEEE Congress on Evolutionary Computation, 2005. ,
Dynamic Programming, 1957. ,
Safe Controller Optimization for Quadrotors with Gaussian Processes, Proceedings of the International Conference on Robotics and Automation (ICRA), 2016. ,
Robot programming by demonstration, Springer handbook of robotics, pp.1371-1394, 2008. ,
Policy search for learning robot control using sparse data, Proceedings of the International Conference on Robotics and Automation (ICRA), 2014. ,
DOI : 10.1109/icra.2014.6907422
URL : http://mediatum.ub.tum.de/doc/1281556/file.pdf
Kuka youbot: a mobile manipulator for research and education, Proceedings of the International Conference on Robotics and Automation (ICRA), 2011. ,
DOI : 10.1109/icra.2011.5980575
Resilient machines through continuous self-modeling, Science, vol.314, issue.5802, pp.1118-1121, 2006. ,
DOI : 10.1126/science.1133687
Evolving complete agents using artificial ontogeny, Morpho-functional Machines: The New Species, pp.237-258, 2003. ,
DOI : 10.1007/978-4-431-67869-4_12
, Probabilistic Integration: A Role for Statisticians in Numerical Analysis? arXiv, 2015.
A survey of iterative learning control, IEEE Control Systems, vol.26, issue.3, pp.96-114, 2006. ,
A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning, 2010. ,
Elephants don't play chess, Robotics and autonomous systems, vol.6, issue.1-2, pp.3-15, 1990. ,
DOI : 10.1016/s0921-8890(05)80025-9
URL : http://www.cs.sfu.ca/~vaughan/teaching/889/papers/elephants.pdf
Intelligence without representation, Artificial intelligence, vol.47, issue.1-3, pp.139-159, 1991. ,
DOI : 10.1016/0004-3702(91)90053-m
A survey of monte carlo tree search methods, IEEE Transactions on Computational Intelligence and AI in games, vol.4, issue.1, pp.1-43, 2012. ,
Learning variable impedance control, International Journal of Robotics Research, vol.30, issue.7, pp.820-833, 2011. ,
DOI : 10.1177/0278364911402527
Bayesian optimization for learning gaits under uncertainty, Annals of Mathematics and Artificial Intelligence, 2015. ,
DOI : 10.1007/s10472-015-9463-9
URL : http://spiral.imperial.ac.uk/bitstream/10044/1/24167/2/AMAI.pdf
A tutorial on task-parameterized movement learning and retrieval, Intelligent Service Robotics, vol.9, issue.1, pp.1-29, 2016. ,
DOI : 10.1007/s11370-015-0187-9
On Learning, Representing and Generalizing a Task in a Humanoid Robot, IEEE Transactions on Systems, Man, and Cybernetics, vol.37, issue.2, pp.286-298, 2007. ,
Compliant skills acquisition and multi-optima policy search with EM-based reinforcement learning, Robotics and Autonomous Systems, vol.61, issue.4, pp.369-379, 2013. ,
DOI : 10.1016/j.robot.2012.09.012
URL : http://kormushev.com/papers/Calinon-RAS2012.pdf
A task-parameterized probabilistic model with minimal intervention control, Proceedings of the International Conference on Robotics and Automation (ICRA), 2014. ,
DOI : 10.1109/icra.2014.6907339
Model predictive control, 2013. ,
DOI : 10.1002/oca.2167
URL : https://hal.archives-ouvertes.fr/hal-00256633
Incremental semiparametric inverse dynamics learning, Proceedings of the International Conference on Robotics and Automation (ICRA), 2016. ,
DOI : 10.1109/icra.2016.7487177
URL : http://arxiv.org/pdf/1601.04549
How UGVs physically fail in the field, IEEE Transactions on Robotics, vol.21, issue.3, pp.423-437, 2005. ,
DOI : 10.1109/tro.2004.838027
On the parallelization of UCT, Proceedings of the Computer Games Workshop, 2007. ,
Monte-carlo tree search: A new framework for game AI, Proceedings of Artificial Intelligence and Interactive Digital Entertainment (AIIDE), 2008. ,
Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics, Proceedings of the International Conference on Robotics and Automation (ICRA), 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01768285
Towards semi-episodic learning for robot damage recovery, AILTA '16: Proceedings of the International Workshop "AI for Long-term Autonomy", 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01376288
Black-Box Data-efficient Policy Search for Robotics, Proceedings of the International Conference on Intelligent Robots and Systems (IROS), 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01576683
Reset-free trial-and-error learning for robot damage recovery, Robotics and Autonomous Systems, vol.100, pp.236-250, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01654641
A survey on policy search algorithms for learning robot controllers in a handful of trials, 2018. ,
Deep reinforcement learning in a handful of trials using probabilistic dynamics models, Advances in Neural Information Processing Systems (NIPS), 2018. ,
Expected policy gradients, Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), 2018. ,
Expected Policy Gradients for Reinforcement Learning, 2018. ,
Multi-column deep neural network for traffic sign classification, Neural Networks, vol.32, pp.333-338, 2012. ,
Learning to adapt in dynamic, real-world environments through meta-reinforcement learning, Proceedings of the International Conference on Learning Representations (ICLR), 2019. ,
Cocomopl: A novel approach for humanoid walking generation combining optimal control, movement primitives and learning and its transfer to the real robot hrp-2, IEEE Robotics and Automation Letters, vol.2, issue.2, pp.977-984, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01459840
GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms, Proceedings of the International Conference on Machine Learning (ICML), 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01840576
On Building Systems That Will Fail, ACM Turing award lectures, vol.34, issue.9, pp.72-81, 2007. ,
Continuous upper confidence trees, Proceedings of Learning and Intelligent Optimization ,
Continuous rapid action value estimates, Proceedings of Asian Conference on Machine Learning, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00642459
Quality and Diversity Optimization: A Unifying Modular Framework, IEEE Transactions on Evolutionary Computation, 2017. ,
Hierarchical behavioral repertoires with unsupervised descriptors, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 2018. ,
Robots that can adapt like animals, Nature, vol.521, issue.7553, pp.503-507, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01158243
Limbo: A Flexible High-performance Library for Gaussian Processes modeling and Data-Efficient Optimization, The Journal of Open Source Software, vol.3, issue.26, p.545, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01884299
Efficient reinforcement learning for robots using informative simulated priors, Proceedings of the International Conference on Robotics and Automation (ICRA), 2015. ,
Hierarchical relative entropy policy search, Journal of Machine Learning Research, pp.1-50, 2016. ,
DARPA's ATLAS Robot Unveiled, 2013. ,
Multi-objective optimization, Search methodologies, pp.403-449, 2014. ,
Multi-Objective Optimization Using Evolutionary Algorithms, 2001. ,
A fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE Transactions on Evolutionary Computation, vol.6, issue.2, pp.182-197, 2002. ,
Team WPI-CMU: Achieving Reliable Humanoid Behavior in the DARPA Robotics Challenge, Journal of Field Robotics, vol.34, issue.2, pp.381-399, 2017. ,
Linear off-policy actor-critic, Proceedings of the International Conference on Machine Learning (ICML) ,
PILCO: A model-based and data-efficient approach to policy search, Proceedings of the International Conference on Machine Learning (ICML), 2011. ,
Learning to control a low-cost manipulator using data-efficient reinforcement learning, Proceedings of Robotics: Science & Systems (RSS), 2011. ,
Toward fast policy search for learning legged locomotion, Proceedings of the International Conference on Intelligent Robots and Systems (IROS), 2012. ,
, A Survey on Policy Search for Robotics, Foundations and Trends in Robotics, vol.2, issue.1, pp.1-142, 2013.
Multi-task policy search for robotics, Proceedings of the International Conference on Robotics and Automation (ICRA), 2014. ,
Gaussian processes for data-efficient learning in robotics and control, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.37, pp.408-423, 2015. ,
Learning and policy search in stochastic dynamical systems with bayesian neural networks, Proceedings of the International Conference on Learning Representations, 2017. ,
Decomposition of uncertainty in bayesian deep learning for efficient and risk-sensitive learning, Proceedings of the International Conference on Machine Learning (ICML), 2018. ,
Optimizing long-term predictions for model-based policy search, Proceedings of Conference on Robot Learning (CoRL), 2017. ,
Beyond black-box optimization: a review of selective pressures for evolutionary robotics, Evolutionary Intelligence, vol.7, issue.2, pp.71-93, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01150254
Learning velocity kinematics: Experimental comparison of on-line regression algorithms, Robotica, pp.15-20, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00719975
EvoRBC: evolutionary repertoire-based control for robots with arbitrary locomotion complexity, Proceedings of The Genetic and Evolutionary Computation Conference (GECCO), 2016. ,
Evolution of repertoire-based control for robots with complex locomotor systems, IEEE Transactions on Evolutionary Computation, 2017. ,
Simultaneous localization and mapping: part I, IEEE Robotics & Automation Magazine, vol.13, issue.2, pp.99-110, 2006. ,
An introduction to the bootstrap, 1994. ,
Reinforcement learning with Gaussian processes, Proceedings of the International Conference on Machine Learning (ICML), 2005. ,
Evolutionary multiobjective optimization in noisy problem environments, Journal of Heuristics, vol.15, issue.6, p.559, 2009. ,
, Evolution channels gradient descent in super neural networks, 2017.
Initializing bayesian hyperparameter optimization via meta-learning, Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2015. ,
Learning ball acquisition on a physical robot, Proceedings of the International Symposium on Robotics and Automation (ISRA), 2004. ,
Model-agnostic meta-learning for fast adaptation of deep networks, Proceedings of the International Conference on Machine Learning (ICML), 2017. ,
Intrinsically motivated goal exploration processes with automatic curriculum learning, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01651233
Feature space modeling through surrogate illumination, Proceedings of The Genetic and Evolutionary Computation Conference, 2017. ,
Dropout as a Bayesian approximation: Representing model uncertainty in deep learning, Proceedings of the International Conference on Machine Learning (ICML) ,
Improving PILCO with Bayesian neural network dynamics models, Data-Efficient Machine Learning Workshop, 2016. ,
Convolutional face finder: A neural architecture for fast and robust face detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.26, issue.11, pp.1408-1423, 2004. ,
Model predictive control: theory and practice-a survey, Automatica, vol.25, issue.3, pp.335-348, 1989. ,
Bayesian Optimization with Inequality Constraints, Proceedings of the International Conference on Machine Learning (ICML), 2014. ,
Information-seeking, curiosity, and attention: computational and neural mechanisms, Trends in Cognitive Sciences, vol.17, issue.11, pp.585-593, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00913646
Reinforcement learning for imitating constrained reaching movements, Advanced Robotics, Special Issue on Imitative Robots, vol.21, issue.13, pp.1521-1544, 2007. ,
Fukushima robot operator writes tell-all blog, IEEE Spectrum, 2011. ,
Recurrent World Models Facilitate Policy Evolution, Advances in Neural Information Processing Systems (NIPS), 2018. ,
Composable Deep Reinforcement Learning for Robotic Manipulation, Proceedings of the International Conference on Robotics and Automation (ICRA), 2018. ,
, Reliable Uncertainty Estimates in Deep Neural Networks using Noise Contrastive Priors, 2018.
The CMA Evolution Strategy: A Comparing Review, 2006. ,
Benchmarking a BI-population CMA-ES on the BBOB-2009 noisy testbed, Proceedings of The Genetic and Evolutionary Computation Conference, 2009. ,
URL : https://hal.archives-ouvertes.fr/inria-00382101
Completely derandomized self-adaptation in evolution strategies, Evolutionary Computation, vol.9, issue.2, pp.159-195, 2001. ,
A method for handling uncertainty in evolutionary optimization with an application to feedback control of combustion, IEEE Transactions on Evolutionary Computation, vol.13, issue.1, pp.180-197, 2009. ,
URL : https://hal.archives-ouvertes.fr/inria-00276216
Reinforcement learning in continuous action spaces, 2007. ,
, Emergence of locomotion behaviours in rich environments, 2017.
Hoeffding and bernstein races for selecting policies in evolutionary direct policy search, Proceedings of the International Conference on Machine Learning (ICML), 2009. ,
Entropy search for information-efficient global optimization, Journal of Machine Learning Research, vol.13, pp.1809-1837, 2012. ,
TEXPLORE: real-time sample-efficient reinforcement learning for robots, Machine Learning, vol.90, pp.385-429, 2013. ,
, Synthesizing Neural Network Controllers with Probabilistic Model based Reinforcement Learning, 2018.
VIME: Variational information maximizing exploration, Advances in Neural Information Processing Systems (NIPS), 2016. ,
Faster exact algorithms for computing expected hypervolume improvement, Proceedings of the International Conference on Evolutionary Multi-Criterion Optimization, 2015. ,
An Experimental Investigation of Model-based Parameter Optimisation: SPO and Beyond, Proceedings of The Genetic and Evolutionary Computation Conference (GECCO), 2009. ,
Dynamical movement primitives: Learning attractor models for motor behaviors, Neural Computation, vol.25, issue.2, pp.328-373, 2013. ,
Movement imitation with nonlinear dynamical systems in humanoid robots, Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2002. ,
Learning attractor landscapes for learning motor primitives, Advances in Neural Information Processing Systems (NIPS), 2003. ,
The major determinants in normal and pathological gait, JBJS, vol.35, issue.3, pp.543-558, 1953. ,
Fault-diagnosis systems: an introduction from fault detection to fault tolerance, 2006. ,
Differential dynamic programming, 1970. ,
Evolutionary robotics and the radical envelope-of-noise hypothesis, Adaptive behavior, vol.6, issue.2, pp.325-368, 1997. ,
Noise and the reality gap: The use of simulation in evolutionary robotics, Proceedings of the European Conference on Artificial Life, 1995. ,
Transferring end-to-end visuomotor control from simulation to real world for a multi-stage task, Proceedings of the Conference on Robot Learning (CoRL), 2017. ,
Evolutionary optimization in uncertain environments-a survey, IEEE Transactions on Evolutionary Computation, vol.9, issue.3, pp.303-317, 2005. ,
The NLopt nonlinear-optimization package ,
Learning state representations with robotic priors, Autonomous Robots, vol.39, issue.3, pp.407-428, 2015. ,
Differentiable Particle Filters: End-to-End Learning with Algorithmic Priors, Proceedings of Robotics: Science and Systems (RSS), 2018. ,
Unscented filtering and nonlinear estimation, Proceedings of the IEEE, vol.92, issue.3, pp.401-422, 2004. ,
Reinforcement learning: A survey, Journal of Artificial Intelligence Research, vol.4, pp.237-285, 1996. ,
A natural policy gradient, Advances in Neural Information Processing Systems (NIPS), 2002. ,
Learning force control policies for compliant manipulation, Proceedings of the International Conference on Intelligent Robots and Systems (IROS), 2011. ,
A new approach to linear filtering and prediction problems, Journal of Basic Engineering, vol.82, issue.1, pp.35-45, 1960. ,
Convergence guarantees for kernel-based quadrature rules in misspecified settings, Advances in Neural Information Processing Systems (NIPS), 2016. ,
High dimensional bayesian optimisation and bandits via additive models, International Conference on Machine Learning (ICML), 2015. ,
Sampling-based algorithms for optimal motion planning, The International Journal of Robotics Research, vol.30, issue.7, pp.846-894, 2011. ,
Multi-objective model-based policy search for data-efficient learning with sparse rewards, Proceedings of the Conference on Robot Learning (CoRL), 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01884294
Curse of dimensionality, Encyclopedia of Machine Learning, pp.257-258, 2011. ,
Overcoming catastrophic forgetting in neural networks, Proceedings of the National Academy of Sciences, 2017. ,
Gaussian processes and reinforcement learning for identification and control of an autonomous blimp, Proceedings of the International Conference on Robotics and Automation (ICRA), 2007. ,
Learning motor primitives for robotics, Proceedings of the International Conference on Robotics and Automation (ICRA), 2009. ,
Reinforcement learning in robotics: A survey, International Journal of Robotics Research, vol.32, issue.11, pp.1238-1274, 2013. ,
Whole-body model-predictive control applied to the HRP-2 humanoid, Proceedings of the International Conference on Intelligent Robots and Systems (IROS), 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01137021
Policy gradient reinforcement learning for fast quadrupedal locomotion, Proceedings of the International Conference on Robotics and Automation (ICRA), 2004. ,
Actor-critic algorithms, Advances in Neural Information Processing Systems (NIPS), 2000. ,
Online discovery of locomotion modes for wheel-legged hybrid robots: A transferability-based approach, Proceedings of the International Conference on Climbing and Walking Robots and Support Technologies for Mobile Machines, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00633930
Fast damage recovery in robotics with the t-resilience algorithm, The International Journal of Robotics Research, vol.32, issue.14, pp.1700-1723, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00932862
The transferability approach: Crossing the reality gap in evolutionary robotics, IEEE Transactions on Evolutionary Computation, vol.17, issue.1, pp.122-145, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00687617
Nonlinear and adaptive control design, vol.222, 1995. ,
Optimal control with learned local models: Application to dexterous manipulation, Proceedings of the International Conference on Robotics and Automation (ICRA), 2016. ,
Model-based contextual policy search for data-efficient generalization of robot skills, Artificial Intelligence, vol.247, pp.415-439, 2017. ,
A new method of locating the maximum point of an arbitrary multipeak curve in the presence of noise, Journal of Basic Engineering, vol.86, issue.1, pp.97-106, 1964. ,
Minimum jerk path generation, Proceedings of the International Conference on Robotics and Automation (ICRA), 1988. ,
Curiosity Driven Exploration of Learned Disentangled Goal Spaces, Proceedings of the Conference on Robot Learning (CoRL), 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01891598
Face recognition: A convolutional neural-network approach, IEEE Transactions on Neural Networks, vol.8, issue.1, pp.98-113, 1997. ,
Deep learning, Nature, vol.521, issue.7553, pp.436-444, 2015. ,
Gp-ilqg: Data-driven robust optimal control for uncertain nonlinear dynamical systems, 2017. ,
DART: Dynamic Animation and Robotics Toolkit, The Journal of Open Source Software, vol.3, issue.22, 2018. ,
Model-based iterative learning control with a quadratic criterion for time-varying linear systems, Automatica, vol.36, issue.5, pp.641-657, 2000. ,
Model predictive control technique combined with iterative learning for batch processes, AIChE Journal, vol.45, issue.10, pp.2175-2187, 1999. ,
Universal intelligence: A definition of machine intelligence, Minds and Machines, vol.17, pp.391-444, 2007. ,
Exploiting open-endedness to solve problems through the search for novelty, Proceedings of the Conference on Artificial Life (ALIFE), 2008. ,
Abandoning objectives: Evolution through the search for novelty alone, Evolutionary Computation, vol.19, issue.2, pp.189-223, 2011. ,
Creative generation of 3D objects with deep learning and innovation engines, Proceedings of the International Conference on Computational Creativity, 2016. ,
, ES Is More Than Just a Traditional Finite-Difference Approximator, 2017.
Generation of whole-body optimal dynamic multi-contact motions, International Journal of Robotics Research, vol.32, issue.9, pp.1104-1119, 2013. ,
URL : https://hal.archives-ouvertes.fr/lirmm-00819250
Unsupervised deep learning of state representation using robotic priors, Proceedings of the International Conference on Learning Representations (ICLR), 2017. ,
Unsupervised state representation learning with robotic priors: a robustness benchmark, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01644423
State representation learning for control: An overview, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01858558
Learning neural network policies with guided policy search under unknown dynamics, Advances in Neural Information Processing Systems (NIPS), 2014. ,
Guided policy search, Proceedings of the International Conference on Machine Learning (ICML), 2013. ,
End-to-end training of deep visuomotor policies, Journal of Machine Learning Research, vol.17, issue.39, pp.1-40, 2016. ,
Continuous control with deep reinforcement learning, Proceedings of the International Conference on Learning Representations (ICLR), 2016. ,
, When Gaussian Process Meets Big Data: A Review of Scalable GPs, 2018.
Automatic gait optimization with gaussian process regression, Proceedings of the International Joint Conferences on Artificial Intelligence (IJCAI), 2007. ,
Efficient reinforcement learning for humanoid whole-body control, Proceedings of the International Conference on Humanoid Robots (Humanoids), 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01377831
Optimizing Task Feasibility using Model-Free Policy Search and Model-Based Whole-Body Control, Proceedings of the International Conference on Robotics and Automation (ICRA), 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01620370
, Modern Robotics, 2017.
A versatile generalized inverted kinematics implementation for collaborative working humanoid robots: The stack of tasks, Proceedings of the International Conference on Advanced Robotics, 2009. ,
URL : https://hal.archives-ouvertes.fr/lirmm-00796736
Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization, Proceedings of the International Conference on Robotics and Automation (ICRA), 2017. ,
Active Policy Learning for Robot Planning and Exploration under Uncertainty, Proceedings of Robotics: Science & Systems (RSS), 2007. ,
DOI : 10.15607/rss.2007.iii.041
URL : https://doi.org/10.15607/rss.2007.iii.041
A bayesian exploration-exploitation approach for optimal online sensing and planning with a visually guided mobile robot, Autonomous Robots, vol.27, issue.2, 2009. ,
Learning parametric dynamic movement primitives from multiple demonstrations, Neural Networks, vol.24, issue.5, pp.493-500, 2011. ,
DOI : 10.1007/978-3-642-17537-4_43
A second-order gradient method for determining optimal trajectories of non-linear discrete-time systems, International Journal of Control, vol.3, issue.1, pp.85-95, 1966. ,
Machine learning: An artificial intelligence approach, 2013. ,
Genetic algorithms, selection schemes, and the varying effects of noise, Evolutionary Computation, vol.4, issue.2, pp.113-131, 1996. ,
Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, p.529, 2015. ,
Asynchronous methods for deep reinforcement learning, Proceedings of the International Conference on Machine Learning (ICML), 2016. ,
Reset-free guided policy search: efficient deep reinforcement learning with stochastic initial states, Proceedings of the International Conference on Robotics and Automation (ICRA), 2017. ,
Iterative learning control: A survey and new results, Journal of Field Robotics, vol.9, issue.5, pp.563-594, 1992. ,
Exploration strategies in developmental robotics: a unified probabilistic framework, Proceedings of the Joint International Conference on Development and Learning and Epigenetic Robotics (ICDL), 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00860641
Novelty-based Multiobjectivization, New Horizons in Evolutionary Robotics, pp.139-154, 2011. ,
DOI : 10.1007/978-3-642-18272-3_10
URL : https://hal.archives-ouvertes.fr/hal-01300711
Micro-data learning: The other end of the spectrum, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01374786
20 Years of Reality Gap: a few Thoughts about Simulators in Evolutionary Robotics, Workshop "Simulation in Evolutionary Robotics", 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01518764
Illuminating search spaces by mapping elites, 2015. ,
Sferes v2: Evolvin' in the multi-core world, Proceedings of Congress on Evolutionary Computation (CEC), 2010. ,
URL : https://hal.archives-ouvertes.fr/hal-00687633
Encouraging Behavioral Diversity in Evolutionary Robotics: an Empirical Study, Evolutionary Computation, vol.20, issue.1, pp.91-133, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00687609
A mathematical introduction to robotic manipulation, 2017. ,
Habitat utilization and migration in juvenile sea turtles, The biology of sea turtles, vol.1, pp.137-163, 1997. ,
Petman: A humanoid robot for testing chemical protective clothing, Journal of the Robotics Society of Japan, vol.30, issue.4, pp.372-377, 2012. ,
DOI : 10.7210/jrsj.30.372
URL : https://www.jstage.jst.go.jp/article/jrsj/30/4/30_30_372/_pdf
PEGASUS: a policy search method for large MDPs and POMDPs, Proceedings of Uncertainty in Artificial Intelligence, 2000. ,
Autonomous inverted helicopter flight via reinforcement learning, Experimental Robotics IX, pp.363-372, 2006. ,
DOI : 10.1007/11552246_35
Deep neural networks are easily fooled: High confidence predictions for unrecognizable images, Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), 2015. ,
DOI : 10.1109/cvpr.2015.7298640
URL : http://yosinski.com/media/papers/Nguyen__2014__arXiv__Deep_Neural_Networks_are_Easily_Fooled.pdf
Understanding Innovation Engines: Automated Creativity and Improved Stochastic Optimization via Deep Learning, Evolutionary Computation, vol.24, pp.545-572, 2016. ,
DOI : 10.1162/evco_a_00189
URL : https://www.mitpressjournals.org/userimages/ContentEditor/1164817256746/lib_rec_form.pdf
Innovation engines: Automated creativity and improved stochastic optimization via deep learning, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 2015. ,
Using model knowledge for learning inverse dynamics, Proceedings of the International Conference on Robotics and Automation (ICRA), 2010. ,
Model learning for robot control: a survey, Cognitive Processing, vol.12, issue.4, pp.319-340, 2011. ,
DOI : 10.1007/s10339-011-0404-1
Shakey the robot, 1984. ,
iCub whole-body control through force regulation on rigid non-coplanar contacts, Frontiers in Robotics and AI, vol.2, issue.6, 2015. ,
DOI : 10.3389/frobt.2015.00006
URL : https://hal.archives-ouvertes.fr/hal-01137239
Fade To Black: The 1980s vision of "lights-out" manufacturing, where robots do all the work, is a dream no more, 2003. ,
Action-conditional video prediction using deep networks in atari games, Advances in Neural Information Processing Systems (NIPS), pp.2863-2871, 2015. ,
Bayes-hermite quadrature, Journal of Statistical Planning and Inference, 1991. ,
Curiosity and languages. Catalogue of the Exhibition, Fondation Cartier pour l'Art Contemporain, p.180, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00788574
, Computational Theories of Curiosity-Driven Learning, 2018.
DOI : 10.31234/osf.io/3p8f6
URL : http://arxiv.org/pdf/1802.10546
The playground experiment: Task-independent development of a curious robot, Proceedings of the AAAI Spring Symposium on Developmental Robotics, 2005. ,
Intrinsic Motivation Systems for Autonomous Mental Development, IEEE Transactions on Evolutionary Computation, vol.11, issue.2, pp.265-286, 2007. ,
DOI : 10.1109/tevc.2006.890271
URL : http://cogprints.org/5473/1/ims.pdf
Whole-body multi-contact motion in humans and humanoids: Advances of the CoDyCo European project, Robotics and Autonomous Systems, vol.90, pp.97-117, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01399360
Safety-aware robot damage recovery using constrained bayesian optimization and simulated priors, BayesOpt'16: Proceedings of the International Workshop "Bayesian Optimization: Black-box Optimization and Beyond" at NIPS, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01407757
, Patchwork kriging for large-scale gaussian process regression, 2017.
Alternating Optimisation and Quadrature for Robust Control, Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01644063
Bayesian optimization with automatic prior selection for data-efficient direct policy search, Proceedings of the International Conference on Robotics and Automation (ICRA), 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01768279
Sim-to-real transfer of robotic control with dynamics randomization, 2017. ,
How the body shapes the way we think: a new view of intelligence, 2006. ,
Learning predictive terrain models for legged robot locomotion, Proceedings of the International Conference on Intelligent Robots and Systems (IROS), 2008. ,
Survey of Model-Based Reinforcement Learning: Applications on Robotics, Journal of Intelligent & Robotic Systems, pp.1-21, 2017. ,
Quality diversity: A new frontier for evolutionary computation, Frontiers in Robotics and AI, vol.3, p.40, 2016. ,
Bootstrapping of Parameterized Skills Through Hybrid Optimization in Task and Policy Spaces, Frontiers in Robotics and AI, 2018. ,
A unifying view of sparse approximate Gaussian process regression, Journal of Machine Learning Research, vol.6, pp.1939-1959, 2005. ,
Bigdog, the rough-terrain quadruped robot, Proceedings of IFAC, pp.10822-10825, 2008. ,
Epopt: Learning robust neural network policies using model ensembles, Proceedings of the International Conference on Learning Representations (ICLR), 2017. ,
Bayesian monte carlo, Advances in Neural Information Processing Systems (NIPS), 2003. ,
Gaussian processes for machine learning, vol.1, 2006. ,
Model predictive control: Theory and design, 2009. ,
J. Rieffel and J.-B. Mouret, Soft tensegrity robots, Soft Robotics, 2018. ,
Boosting active learning to optimality: A tractable monte-carlo, billiard-based algorithm, Proceedings of the European Conference on Machine Learning (ECML), 2009. ,
URL : https://hal.archives-ouvertes.fr/inria-00433866
High-Dimensional Bayesian Optimization via Additive Models with Overlapping Groups, Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2018. ,
Functional stability analysis of numerical algorithms, 1990. ,
On-line Q-learning using connectionist systems, vol.37, 1994. ,
Artificial intelligence: a modern approach ,
Intrinsic and extrinsic motivations: Classic definitions and new directions, Contemporary educational psychology, vol.25, issue.1, pp.54-67, 2000. ,
Meta reinforcement learning with latent variable gaussian processes, Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), 2018. ,
Synthesis of complex humanoid whole-body behavior: a focus on sequencing and tasks transitions, Proceedings of the International Conference on Robotics and Automation (ICRA), 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00578073
Data-efficient control policy search using residual dynamics learning, Proc. of IROS, 2017. ,
Developmental robotics, optimal artificial curiosity, creativity, music, and the fine arts, Connection Science, vol.18, issue.2, pp.173-187, 2006. ,
Trust region policy optimization, International Conference on Machine Learning (ICML), pp.1889-1897, 2015. ,
Faster and smoother walking of humanoid hrp-2 with passive toe joints, Proceedings of the International Conference on Intelligent Robots and Systems (IROS), 2006. ,
Probability inequalities for the sum in sampling without replacement, The Annals of Statistics, pp.39-48, 1974. ,
Taking the human out of the loop: A review of bayesian optimization, Proceedings of the IEEE, vol.104, issue.1, pp.148-175, 2016. ,
Springer handbook of robotics, 2016. ,
Policy search in continuous action domains: an overview, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-02182466
Monte-carlo planning in large POMDPs, Advances in Neural Information Processing Systems (NIPS), 2010. ,
Deterministic policy gradient algorithms, Proceedings of the International Conference on Machine Learning (ICML), 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-00938992
Mastering the game of Go with deep neural networks and tree search, Nature, vol.529, issue.7587, pp.484-489, 2016. ,
Mastering chess and shogi by self-play with a general reinforcement learning algorithm, 2017. ,
Mastering the game of go without human knowledge, Nature, vol.550, issue.7676, p.354, 2017. ,
Evolving virtual creatures, Proceedings of the Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), 1994. ,
Input warping for bayesian optimization of non-stationary functions, Proceedings of the International Conference on Machine Learning (ICML), 2014. ,
Trial-and-error learning of repulsors for humanoid qp-based whole-body control, Proceedings of the International Conference on Humanoid Robots (Humanoids), 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01569948
The pendubot: A mechatronic system for control research and education, Proceedings of Decision and Control, 1995. ,
Gaussian process optimization in the bandit setting: no regret and experimental design, Proceedings of the International Conference on Machine Learning (ICML), 2010. ,
Evolving neural networks through augmenting topologies, Evolutionary Computation, 2002. ,
Policy improvement: Between black-box optimization and episodic reinforcement learning, Journées Francophones Planification, Décision, et Apprentissage pour la conduite de systèmes, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00922133
Robot skill learning: From reinforcement learning to evolution strategies, Paladyn. Journal of Behavioral Robotics, vol.4, issue.1, pp.49-61, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00922132
Reinforcement learning with sequences of motion primitives for robust manipulation, IEEE Transactions on Robotics, vol.28, issue.6, pp.1360-1370, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00766177
Learning compact parameterized skills with a single regression, Proceedings of the International Conference on Humanoid Robots (Humanoids), 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00922135
Least-squares conditional density estimation, IEICE Transactions on Information and Systems, vol.93, issue.3, pp.583-594, 2010. ,
Dyna, an integrated architecture for learning, planning, and reacting, ACM SIGART Bulletin, vol.2, issue.4, pp.160-163, 1991. ,
Reinforcement learning: An introduction, vol.1, 1998. ,
Policy gradient methods for reinforcement learning with function approximation, Advances in Neural Information Processing Systems (NIPS), pp.1057-1063, 2000. ,
Deepface: Closing the gap to human-level performance in face verification, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014. ,
Model-based policy gradients with parameter-based exploration by least-squares conditional density estimation, Neural Networks, vol.57, pp.128-140, 2014. ,
Transfer learning for reinforcement learning domains: A survey, Journal of Machine Learning Research, vol.10, pp.1633-1685, 2009. ,
A generalized iterative lqg method for locally-optimal feedback control of constrained nonlinear stochastic systems, Proceedings of the American Control Conference, 2005. ,
icub: the design and realization of an open humanoid platform for cognitive and neuroscience research, Advanced Robotics, vol.21, issue.10, pp.1151-1175, 2007. ,
Genetic algorithms with a robust solution searching scheme, IEEE transactions on Evolutionary Computation, vol.1, issue.3, pp.201-208, 1997. ,
Convolutional networks can learn to generate affinity graphs for image segmentation, Neural Computation, vol.22, issue.2, pp.511-538, 2010. ,
Original approach for the localisation of objects in images, IEE Proceedings-Vision, Image and Signal Processing, vol.141, pp.245-250, 1994. ,
A theoretical and empirical analysis of Expected Sarsa, Proceedings of the Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2009. ,
Using centroidal Voronoi tessellations to scale up the multi-dimensional archive of phenotypic elites algorithm, IEEE Transactions on Evolutionary Computation, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01630627
Real-time fault diagnosis, IEEE Robotics & Automation Magazine, vol.11, issue.2, pp.56-66, 2004. ,
Learning deep dynamical models from image pixels, Proceedings of the 17th IFAC Symposium on System Identification (SYSID), 2015. ,
Iterative learning model predictive control for multi-phase batch processes, Journal of Process Control, vol.18, issue.6, pp.543-557, 2008. ,
Bayesian optimization in a billion dimensions via random embeddings, Journal of Artificial Intelligence Research, vol.55, pp.361-387, 2016. ,
Optimal Learning, Wiley Series in Probability and Statistics, 2012. ,
Q-learning, Machine learning, vol.8, issue.3-4, pp.279-292, 1992. ,
Learning to control a 6-degree-of-freedom walking robot, Proceedings of the International Conference on, 2007. ,
Approximate dynamic programming for real-time control and neural modelling, Handbook of intelligent control: neural, fuzzy and adaptive approaches, pp.493-525, 1992. ,
Sequential design of computer experiments to minimize integrated response functions, Statistica Sinica, 2000. ,
Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine Learning, 1992. ,
Using trajectory data to improve bayesian optimization for reinforcement learning, Journal of Machine Learning Research, vol.15, issue.1, pp.253-282, 2014. ,
Semi-parametric Gaussian process for robot system identification, Proceedings of the International Conference on Intelligent Robots and Systems (IROS), 2012. ,
Expected hypervolume improvement algorithm for PID controller tuning and the multiobjective dynamical control of a biogas plant, Proceedings of the IEEE Congress on Evolutionary Computation (CEC), 2015. ,
Preparing for the unknown: Learning a universal policy with online system identification, Proceedings of Robotics: Science and Systems, 2017. ,
Learning deep control policies for autonomous aerial vehicles with mpc-guided policy search, Proceedings of the International Conference on Robotics and Automation (ICRA), 2016. ,
Fast Model Identification via Physics Engines for Data-Efficient Policy Search, Proceedings of the International Joint Conferences on Artificial Intelligence (IJCAI), 2018. ,
Neural Fitted Actor-Critic, Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01350651