Apprenticeship learning via inverse reinforcement learning, Proceedings of the twenty-first international conference on Machine learning, 2004.
A New Approach to Manipulator Control: The Cerebellar Model Articulation Controller (CMAC), Journal of Dynamic Systems, Measurement, and Control, vol.97, issue.3, pp.220-227, 1975.
DOI : 10.1115/1.3426922
Andhill-98: A RoboCup Team which Reinforces Positioning with Observation, pp.338-345, 1998.
DOI : 10.1007/3-540-48422-1_27
Robot learning from demonstration, Proc. 14th International Conference on Machine Learning, pp.12-20, 1997.
Robot see, robot do: An overview of robot imitation, AISB96 Workshop on Learning in Robots and Animals, 1996.
Dynamic Programming, 1957.
Learning How to Behave from Observing Others, 2002.
Humanoid robot learning and game playing using PC-based vision, IEEE/RSJ International Conference on Intelligent Robots and Systems, pp.2449-2454, 2002.
DOI : 10.1109/IRDS.2002.1041635
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.8175
Bandit Problems: Sequential Allocation of Experiments, 1985.
DOI : 10.1007/978-94-015-3711-7
Neuro-Dynamic Programming, Athena Scientific, p.512, 1996.
Automatic shaping and decomposition of reward functions, In ICML, vol.227, pp.601-608, 2007.
Sequential Optimality and Coordination in Multiagent Systems, 1999.
Generalization in reinforcement learning: Safely approximating the value function, Advances in Neural Information Processing Systems, vol.7, pp.369-376, 1995.
R-max - a general polynomial time algorithm for near-optimal reinforcement learning, J. Mach. Learn. Res, vol.3, pp.213-231, 2003.
An Overview of the AI in Football Games from Cheating to Machine Learning, 2008.
The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, AAAI-97 Workshop on Multiagent Learning; Robot shaping: The HAMSTER Experiment, 1996.
Apprentissage artificiel : Concepts et algorithmes, Editions Eyrolles, p.620, 2002.
Improving elevator performance using reinforcement learning, Advances in Neural Information Processing Systems, vol.8, pp.1017-1023, 1996.
JellyFish Backgammon, 1998.
Apprentissage par renforcement dans les processus de décision Markoviens factorisés; The MAXQ method for hierarchical reinforcement learning, pp.118-126, 1998.
Robot shaping: developing autonomous agents through learning, Artificial Intelligence, vol.71, issue.2, pp.321-370, 1994.
DOI : 10.1016/0004-3702(94)90047-7
Robot Shaping: An Experiment in Behavior Engineering, MIT Press, p.300, 1997.
AI in Computer Games: AI for Beginners Discussion - Roundtable Handouts; Probabilistic policy reuse in a reinforcement learning agent, The fifth international joint conference on Autonomous agents and multiagent systems, 2001.
Robotique Mobile, 2004.
Apprentissage de la coordination multiagent: Une méthode basée sur le Q-learning par jeu adaptatif, Revue d'intelligence artificielle, pp.2-3, 2006.
DOI : 10.3166/ria.20.383-410
A preview control model of driver steering behavior, pp.504-509, 1989.
Steering Behavior Model of Visitor NPCs in Virtual Exhibition, In: Advances in Artificial Reality and Tele-Existence, Heidelberg S.B. ed, vol.4282, pp.113-121, 2006.
Planning and acting in partially observable stochastic domains, Artificial Intelligence, vol.101, pp.99-134, 1998.
Finite-sample convergence rates for Q-learning and indirect algorithms, Proceedings of the 1998 conference on Advances in neural information processing systems II, pp.996-1002, 1999.
RoboCup, Proceedings of the first international conference on Autonomous agents, AGENTS '97, pp.19-24, 1995.
DOI : 10.1145/267658.267738
Autonomous shaping: knowledge transfer in reinforcement learning, In ICML, vol.148, pp.489-496, 2006.
Value-Function-Based Transfer for Reinforcement Learning Using Structure Mapping, Proceedings of the Twenty-First National Conference on Artificial Intelligence, pp.415-435, 2006.
Grid World, 1997.
Reward Functions for Accelerated Learning, Proceedings of the Eleventh International Conference on Machine Learning, pp.181-189, 1994.
DOI : 10.1016/B978-1-55860-335-6.50030-1
Reinforcement Learning in the Multi-Robot Domain, Autonomous Robots, vol.4, issue.1, pp.73-83, 1997.
DOI : 10.1023/A:1008819414322
Analyzing team sport strategies by means of graphical simulation; Toward the design of a simulator to analyze team sport strategies, ICISP 2003, International Conference on Image and Signal Processing, Agadir (Morocco), June 2003, 2003.
BOXES: An experiment in adaptive control, Machine Intelligence 2; Mapping between dissimilar bodies: Affordances and the algebraic foundations of imitation, E. Eds. Edinburgh: Oliver and Boyd, Nehaniv C., Dautenhahn K, pp.137-152, 1968.
Algorithms for Inverse Reinforcement Learning, Proceedings of the Seventeenth International Conference on Machine Learning, pp.663-670, 2000.
A dynamic channel assignment policy through Q-learning, Neural Networks, IEEE Transactions on, vol.10, issue.6, pp.1443-1455, 1999.
A Q-learning-based dynamic channel assignment technique for mobile communication systems, Vehicular Technology, IEEE Transactions on, vol.48, issue.5, pp.1676-1687, 1999.
A day of great illumination: B. F. Skinner's discovery of shaping, Journal of the Experimental Analysis of Behavior, vol.82, issue.3, 2004.
DOI : 10.1901/jeab.2004.82-317
Implicit imitation in multiagent reinforcement learning, Proc. 16th International Conf. on Machine Learning, pp.325-334, 1999.
Imitation and Reinforcement learning in agents with heterogeneous actions, Proceedings of the Seventeenth International Conference on Machine Learning (ICML-2000), 2000.
Accelerating Reinforcement Learning through Implicit Imitation, Journal of Artificial Intelligence Research (JAIR), vol.19, pp.569-629, 2003.
Learning to Drive a Bicycle Using Reinforcement Learning and Shaping, In ICML; Software for RL in C++, Morgan Kaufmann. Ratitch B, pp.463-471, 1998.
Artificial intelligence: a modern approach, p.932, 1995.
A Standard Interface for Reinforcement Learning Software, 1996.
Training and Tracking in Robotics, IJCAI, pp.670-672, 1985.
Reinforcement learning for dynamic channel allocation in cellular telephone systems, Advances in Neural Information Processing Systems, vol.9, pp.974-980, 1997.
The Behavior of Organisms: An Experimental Analysis, 1938.
The Optimal Control of Partially Observable Markov Processes; An architecture for action selection in robotic soccer, 1971.
Keepaway Soccer: A Machine Learning Testbed, pp.214-223, 2001.
Team-Partitioned, Opaque-Transition Reinforcement Learning, 1998.
Generalization in reinforcement learning: Successful examples using sparse coarse coding, Advances in Neural Information Processing Systems, vol.8, pp.1038-1044, 1996.
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, p.322, 1998.
DOI : 10.1109/TNN.1998.712192
Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning, Artificial Intelligence, vol.112, issue.1-2, pp.181-211, 1999.
Comparing Evolutionary and Temporal Difference Methods for Reinforcement Learning, Proceedings of the Genetic and Evolutionary Computation Conference, pp.1321-1349, 2006.
Autonomous transfer for reinforcement learning, Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems, International Foundation for Autonomous Agents and Multiagent Systems, pp.283-290, 2008.
Behavior transfer for value-function-based reinforcement learning, Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems, AAMAS '05, 2005.
DOI : 10.1145/1082473.1082482
Transfer Learning via Inter-Task Mappings for Temporal Difference Learning, J. Mach. Learn. Res, vol.8, pp.2125-2167, 2007.
Transfer via inter-task mappings in policy search reinforcement learning, Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, 2007.
Neurogammon: a neural-network backgammon program, 1990 IJCNN International Joint Conference on Neural Networks, pp.33-39, 1990.
DOI : 10.1109/IJCNN.1990.137821
Programming backgammon using self-teaching neural nets, Artificial Intelligence, vol.134, issue.1-2, pp.181-199, 2002.
DOI : 10.1016/S0004-3702(01)00110-2
URL : http://doi.org/10.1016/s0004-3702(01)00110-2
Memory-guided Exploration in Reinforcement Learning, INNS-IEEE International Joint Conference on Neural Networks, pp.1002-1007, 2001.
A neuro-dynamic programming approach to retailer inventory management, Decision and Control, Proceedings of the 36th IEEE Conference on, 1997; Anticipation as a key for collaboration in a team of agents: A case study in robotic soccer, SPIE Sensor Fusion and Decentralized Control in Robotic Systems II, 1999.
Q-learning, Machine Learning, pp.279-292, 1992.
Models of Delayed Reinforcement Learning; Reinforcement learning for the adaptive control of perception and action; Game AI: The State of the Industry, 1989.
Behavior Simulation for Autonomous Agents in Crowded Environment.
A reinforcement learning approach to Job-shop Scheduling, Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-95), pp.1114-1120, 1995.