An application of reinforcement learning to aerobatic helicopter flight, Advances in Neural Information Processing Systems 19, pp.1-8, 2007. ,
Apprenticeship learning via inverse reinforcement learning, Twenty-first international conference on Machine learning, ICML '04, 2004. ,
DOI : 10.1145/1015330.1015430
URL : http://www.aicml.cs.ualberta.ca/banff04/icml/pages/papers/335.pdf
APRIL: Active Preference Learning-Based Reinforcement Learning, 2012. ,
DOI : 10.1007/978-3-642-33486-3_8
URL : https://hal.archives-ouvertes.fr/hal-00722744
Solving uncertain Markov decision problems, 2001. ,
Advantage updating, 1993. ,
DOI : 10.21236/ADA280862
Markov Decision Processes with Applications to Finance, 2011. ,
DOI : 10.1007/978-3-642-18324-9
A Markovian Decision Process, Indiana University Mathematics Journal, vol.6, issue.4, 1957. ,
DOI : 10.1512/iumj.1957.6.56038
URL : http://www.dtic.mil/cgi-bin/GetTRDoc?AD=AD0606367&Location=U2&doc=GetTRDoc.pdf
Partitioning procedures for solving mixed-variables programming problems, Computational Management Science, vol.2, issue.1, pp.3-19, 2005. ,
DOI : 10.1007/s10287-004-0020-y
Neuro-Dynamic Programming, Athena Scientific, 1996. ,
DOI : 10.1007/0-306-48332-7_333
LeZi-update, Proceedings of the 5th annual ACM/IEEE international conference on Mobile computing and networking, MobiCom '99, pp.121-135, 2002. ,
DOI : 10.1145/313451.313457
A Planning System Based on Markov Decision Processes to Guide People With Dementia Through Activities of Daily Living, IEEE Transactions on Information Technology in Biomedicine, vol.10, issue.2, pp.323-333, 2006. ,
DOI : 10.1109/TITB.2006.864480
Relative Entropy Inverse Reinforcement Learning, Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, pp.182-189, 2011. ,
Cooperative negotiation in autonomic systems using incremental utility elicitation, Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence, UAI'03, pp.89-97, 2003. ,
Constraint-based optimization and utility elicitation using the minimax decision criterion, Artificial Intelligence, vol.170, issue.8-9, pp.686-713, 2006. ,
DOI : 10.1016/j.artint.2006.02.003
URL : https://doi.org/10.1016/j.artint.2006.02.003
Making rational decisions using adaptive utility elicitation, Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, pp.363-369, 2000. ,
Strategy iteration algorithms for games and Markov decision processes, 2010. ,
Preference-based reinforcement learning: a formal framework and a policy iteration algorithm, Machine Learning, pp.123-156, 2012. ,
DOI : 10.1016/S0004-3702(01)00110-2
Reducing the Number of Queries in Interactive Value Iteration, Algorithmic Decision Theory - 4th International Conference Proceedings, pp.139-152, 2015. ,
DOI : 10.1007/978-3-319-23114-3_9
URL : https://hal.archives-ouvertes.fr/hal-01213280
Concurrent Markov decision processes for robust robot team learning under uncertainty, 2014. ,
DOI : 10.1016/j.engappai.2014.12.007
Bounded-parameter Markov decision processes, Artificial Intelligence, vol.122, issue.1-2, pp.71-109, 2000. ,
DOI : 10.1016/S0004-3702(00)00047-3
URL : https://doi.org/10.1016/s0004-3702(00)00047-3
Human-to-robot skill transfer using the SPORE approximation, Proceedings of IEEE International Conference on Robotics and Automation, pp.2962-2967, 1996. ,
DOI : 10.1109/ROBOT.1996.509162
Dynamic programming and Markov processes, pp.296-297, 1960. ,
Approximately optimal approximate reinforcement learning, Machine Learning, Proceedings of the Nineteenth International Conference, pp.267-274, 2002. ,
On the sample complexity of reinforcement learning, 2003. ,
Inverse Reinforcement Learning through Structured Classification, Advances in Neural Information Processing Systems, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00778624
Least-squares policy iteration, Journal of Machine Learning Research, vol.4, pp.1107-1149, 2003. ,
On the complexity of solving Markov decision problems, In Proc. of the Eleventh International Conference on Uncertainty in Artificial Intelligence, p.394, 1995. ,
Planning in the presence of cost functions controlled by an adversary, Machine Learning, Proceedings of the Twentieth International Conference, pp.536-543, 2003. ,
Multi-objective reinforcement learning using sets of Pareto dominating policies, Journal of Machine Learning Research, vol.15, pp.3663-3692, 2014. ,
Algorithms for inverse reinforcement learning, Proc. 17th International Conf. on Machine Learning, pp.663-670, 2000. ,
Robustness in Markov decision problems with uncertain transition matrices, NIPS, 2004. ,
DOI : 10.1287/opre.1050.0216
URL : http://robotics.eecs.berkeley.edu/~elghaoui/Pubs/RobMDP_OR2005.pdf
On the approximability of trade-offs and optimal access of Web sources, Proceedings 41st Annual Symposium on Foundations of Computer Science, p.86, 2000. ,
DOI : 10.1109/SFCS.2000.892068
Decision making under conditions of uncertainty in agriculture: a case study of oil crops, Poljoprivreda, vol.15, issue.1, pp.45-50, 2009. ,
Approximation of Lorenz-optimal solutions in multiobjective Markov decision processes, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-01216091
Inverse reinforcement learning for interactive systems, Proceedings of the 2nd Workshop on Machine Learning for Interactive Systems: Bridging the Gap Between Perception, Action and Communication, MLIS '13, pp.71-75, 2013. ,
DOI : 10.1145/2493525.2493529
URL : https://hal.archives-ouvertes.fr/hal-00869812
Tractable planning under uncertainty: Exploiting structure, 2004. ,
Efficient Training of Artificial Neural Networks for Autonomous Navigation, Neural Computation, vol.3, issue.1, pp.88-97, 1991. ,
DOI : 10.1162/neco.1989.1.4.541
Markov decision processes: discrete stochastic dynamic programming, 1994. ,
DOI : 10.1002/9780470316887
Markov decision processes: discrete stochastic dynamic programming, Wiley Series in Probability and Mathematical Statistics, J. Wiley & Sons, 2005. ,
DOI : 10.1002/9780470316887
Regret-based reward elicitation for Markov decision processes, NIPS-08 workshop on Model Uncertainty and Risk in Reinforcement Learning, 2008. ,
Regret-based reward elicitation for Markov decision processes, UAI 2009, Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, pp.444-451, 2009. ,
Robust policy computation in reward-uncertain MDPs using nondominated policies, Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2010, 2010. ,
Eliciting additive reward functions for Markov decision processes, IJCAI 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence, pp.2159-2164, 2011. ,
Robust online optimization of reward-uncertain MDPs, IJCAI 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence, pp.2165-2171, 2011. ,
Regret-based reward elicitation for Markov decision processes, 1205. ,
A survey of multiobjective sequential decision-making, J. Artif. Intell. Res. (JAIR), vol.48, pp.67-113, 2013. ,
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998. ,
DOI : 10.1109/TNN.1998.712192
Optimal Bayesian recommendation sets and myopically optimal choice query sets, 2010. ,
Vector-valued Markov decision processes and the systems of linear inequalities, Stochastic Processes and their Applications, pp.159-169, 1995. ,
DOI : 10.1016/0304-4149(94)00064-Z
Learning from Delayed Rewards, King's College, 1989. ,
Markov decision processes with ordinal rewards: Reference point-based preferences, Proceedings of the 21st International Conference on Automated Planning and Scheduling, ICAPS 2011, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-01285812
Ordinal Decision Models for Markov Decision Processes, European Conference on Artificial Intelligence, pp.828-833, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-01273056
Interactive Value Iteration for Markov Decision Processes with Unknown Rewards, Proc. 23rd International Joint Conference on Artificial Intelligence (IJCAI 2013), 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00942290
Parametric regret in uncertain Markov decision processes, Proceedings of the 48th IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference, pp.3606-3613, 2009. ,
DOI : 10.1109/CDC.2009.5400796
walkr: MCMC sampling from nonnegative convex polytopes, 2015. ,
Navigate like a cabbie, Proceedings of the 10th international conference on Ubiquitous computing, UbiComp '08, pp.322-331, 2008. ,
DOI : 10.1145/1409635.1409678