Dialogua Act: askASlot-slot_3
Dialogue State
Dialogua Act: explicitConfirm-slot_3
Dialogue State: [0.368238, 0, 1]
Dialogua Act: implConfAskASlot-slot_1_ASK_slot_2
Dialogue State: [0.917882, 0.499996, 1]
Dialogua Act: explicitConfirm-slot_2
Dialogue State: [0.958942, 0.5, 1]
Dialogua Act: explicitConfirm-slot_2
Dialogue State: [0.958942, 1, 1]
Dialogua Act: closingDialogue-null
=========================================================
Intended goal of the simulated user:
Type: Indian
Price-range: Cheap
Location: City-Centre
User goal captured by the dialogue manager:
Type: Indian
Price-range: Cheap
Location: City-Centre
=========================================================
Dialogue statistics:
ActualReward: 75 (for successful task completion)
Dialogue length: 6
2011b] outlines an exciting direction of work with
logue/user policy is generated. The following set of dialogue episodes is generated from these retrieved policies:
==============================================
Step 1: HC-DialogueManager vs Train-RL-User
CloseDia
UserResponse: hangUp
===============================================
Co-adaptation in dialogue systems ,
Apprenticeship learning via inverse reinforcement learning, Proc. of ICML, p.109, 2004. ,
Assessing user simulation for dialog systems using human judges and automatic evaluation measures, Proc. of the 46th meeting of the Association for Computational Linguistics, pp.622-629, 2008. ,
DOI : 10.1207/s15516709cog2805_8
Natural language understanding, p.12, 1995. ,
Social penetration: The development of interpersonal relationships, p.134, 1973. ,
Optimal control of Markov processes with incomplete state information, Journal of Mathematical Analysis and Applications, vol.10, issue.1, pp.174-205, 1965. ,
DOI : 10.1016/0022-247X(65)90154-X
How to Do Things with Words: Second Edition (William James Lectures), p.12, 1975. ,
DOI : 10.1093/acprof:oso/9780198245537.001.0001
Engaging Theories in Interpersonal Communication: Multiple Perspectives, p.134, 2008. ,
DOI : 10.4135/9781483329529
Dynamic Programming, pp.28-30, 1957. ,
A Markovian decision process, pp.679-684, 1957. ,
Functional approximation and dynamic programming, Mathematical Tables and Other Aids to Computation. ,
Uncertain Outcome Values in Predicted Relationships: Uncertainty Reduction Theory Then and Now, Human Communication Research, vol.5, issue.1, pp.34-38, 1986. ,
DOI : 10.1111/j.1468-2958.1985.tb00062.x
Relative entropy inverse reinforcement learning, Journal of Machine Learning Research -Proceedings Track, vol.15, issue.109, pp.182-189, 2011. ,
Linear least-squares algorithms for temporal difference learning, Machine Learning, pp.33-57, 1996. ,
User and Noise Adaptive Dialogue Management Using Hybrid System Actions, In Spoken Dialogue Systems for Ambient Environments Lecture Notes in Artificial Intelligence (LNAI), vol.6392, pp.13-24, 2010. ,
DOI : 10.1007/978-3-642-16202-2_2
URL : https://hal.archives-ouvertes.fr/hal-00552848
Optimizing Spoken Dialogue Management with Fitted Value Iteration, Proc. of InterSpeech, p.82, 2010. ,
URL : https://hal.archives-ouvertes.fr/hal-00553184
Sparse Approximate Dynamic Programming for Dialog Management, Proc. of SIGDial, p.82, 2010. ,
URL : https://hal.archives-ouvertes.fr/hal-00553180
User Simulation in Dialogue Systems using Inverse Reinforcement Learning, Proc. of Interspeech, p.122, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00652446
Apprentissage par Renforcement Inverse pour la Simulation d'Utilisateurs dans les Systèmes de Dialogue, In Sixièmes Journées Francophones de Planification, Décision et Apprentissage pour la conduite de systèmes, p.7, 2011. ,
Behavior Specific User Simulation in Spoken Dialogue Systems, Proc. of the IEEE ITG Conference on Speech Communication, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00749421
Clustering behaviors of Spoken Dialogue Systems users, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012. ,
DOI : 10.1109/ICASSP.2012.6289038
URL : https://hal.archives-ouvertes.fr/hal-00685009
Automatic acquisition of names using speak and spell mode in spoken dialogue systems, Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology , NAACL '03, 2003. ,
DOI : 10.3115/1073445.1073450
Uncertainty management for on-line optimisation of a POMDP-based large-scale spoken dialogue system, Proc. of Interspeech, pp.1301-1304, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00652194
Bayesian Q-Learning, Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI), pp.761-768, 1998. ,
Automatic evaluation of machine translation quality using n-gram co-occurrence statistics, Proc. of the Human Language Technology Conference (HLT). ,
An introduction to text-to-speech synthesis, p.13, 1997. ,
DOI : 10.1007/978-94-011-5730-8
User modeling for spoken dialogue system evaluation, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings, pp.47-101, 1997. ,
DOI : 10.1109/ASRU.1997.658991
User modeling for spoken dialogue system evaluation, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings, pp.80-87, 1997. ,
DOI : 10.1109/ASRU.1997.658991
The Kernel Recursive Least-Squares Algorithm, IEEE Transactions on Signal Processing, vol.52, issue.8, pp.2275-2285, 2004. ,
DOI : 10.1109/TSP.2004.830985
Model selection in reinforcement learning, Machine Learning, pp.1-34, 2011. ,
DOI : 10.1007/s10994-011-5254-7
Trains-95: Towards a mixed-initiative planning assistant, Proceedings of the 3rd Conference on AI Planning Systems, p.20, 1996. ,
Recent research advances in Reinforcement Learning in Spoken Dialogue Systems, The Knowledge Engineering Review, vol.16, issue.04, pp.375-408, 2009. ,
DOI : 10.1109/89.817450
Online policy optimisation of spoken dialogue systems via live interaction with human subjects, Proc. of ASRU 2011, p.133, 2011. ,
Optimisation des chaînes de production dans l'industrie sidérurgique : une approche statistique de l'apprentissage par renforcement, p.92, 2009. ,
A Brief Survey of Parametric Value Function Approximation, p.62, 2010. ,
Kalman Temporal Differences, Journal of Artificial Intelligence Research, vol.39, issue.62, pp.483-532, 2010. ,
URL : https://hal.archives-ouvertes.fr/hal-00351297
Managing Uncertainty within Value Function Approximation in Reinforcement Learning, Active Learning and Experimental Design workshop, p.92, 2010. ,
URL : https://hal.archives-ouvertes.fr/hal-00554398
Kalman Temporal Differences: Uncertainty and Value Function Approximation, NIPS Workshop on Model Uncertainty and Risk in Reinforcement Learning, 2008. ,
URL : https://hal.archives-ouvertes.fr/hal-00351298
Kalman Temporal Differences: The deterministic case, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, pp.185-192, 2009. ,
DOI : 10.1109/ADPRL.2009.4927543
URL : https://hal.archives-ouvertes.fr/hal-00380870
Learning user simulations for information state update dialogue systems, Proc. Interspeech '05, p.48, 2005. ,
User simulation for spoken dialogue systems: Learning and evaluation, Proc. International Conference on Spoken Language Processing (Interspeech/ICSLP), p.74, 2006. ,
Incremental least-squares temporal difference learning, Proc. of AAAI, pp.356-361, 2006. ,
The application of the method of least squares to the interpolation of sequences, Proc. of the HLT-NAACL 2003 workshop on Research directions in dialogue processing, pp.439-447, 1974. ,
DOI : 10.1016/0315-0860(74)90034-2
User simulation for the evaluation of bus information systems, 2010 IEEE Spoken Language Technology Workshop, p.46, 2010. ,
DOI : 10.1109/SLT.2010.5700895
Comparing Stochastic Approaches to Spoken Language Understanding in Multiple Languages, Audio, Speech, and Language Processing, pp.1569-1583, 2011. ,
DOI : 10.1109/TASL.2010.2093520
URL : https://hal.archives-ouvertes.fr/hal-00746965
Mixture model POMDPs for efficient handling of uncertainty in dialogue management, Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies Short Papers, HLT '08, pp.73-76, 2008. ,
DOI : 10.3115/1557690.1557710
Learning Adaptive Referring Expression Generation Policies for Spoken Dialogue Systems using Reinforcement Learning, Proceedings SemDial'09, p.25, 2009. ,
Statistical Methods for Speech Recognition, p.22, 1998. ,
Data-driven user simulation for automated evaluation of spoken dialog systems, Computer Speech & Language, vol.23, issue.4, pp.479-509, 2009. ,
DOI : 10.1016/j.csl.2009.03.002
Dialogue management for natural language interfaces, Tech. rep., The University of Queensland, 1993. ,
Reinforcement learning: a survey, Journal of Artificial Intelligence Research, vol.4, pp.237-285, 1996. ,
A New Approach to Linear Filtering and Prediction Problems, Journal of Basic Engineering, vol.82, issue.1, pp.35-45, 1960. ,
DOI : 10.1115/1.3662552
Parameter estimation for agenda-based user simulation, 2010. ,
User Simulation in the Development of Statistical Spoken Dialogue Systems, Data driven methods for Adaptive Spoken Dialogue Systems, 2012. ,
DOI : 10.1007/978-1-4614-4803-7_4
URL : https://hal.archives-ouvertes.fr/hal-00771701
Near-Bayesian exploration in polynomial time, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, p.92, 2009. ,
DOI : 10.1145/1553374.1553441
On Information and Sufficiency, The Annals of Mathematical Statistics, vol.22, issue.1, pp.79-86, 1951. ,
DOI : 10.1214/aoms/1177729694
Least-squares policy iteration, Journal of Machine Learning Research, vol.4, issue.114, pp.1107-1149, 2003. ,
Information state and dialogue management in the TRINDI dialogue move engine toolkit, Natural Language Engineering, vol.6, issue.3&4, pp.323-340, 2000. ,
DOI : 10.1017/S1351324900002539
NIST Machine translation evaluation official results. Official release of automatic evaluation scores for all submissions, p.54, 2005. ,
Dynamic Bayesian Networks and Discriminative Classifiers for Multi-Stage Semantic Interpretation, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07, p.22, 2007. ,
DOI : 10.1109/ICASSP.2007.367151
Unsupervised state clustering for stochastic dialog management, Proc. of ASRU, 2007. ,
Learning what to say and how to say it: Joint optimisation of spoken dialogue management and natural language generation, pp.210-221. ,
Machine learning for spoken dialogue systems, Proc. of InterSpeech'07, 2007. ,
URL : https://hal.archives-ouvertes.fr/hal-00216035
An ISU dialogue system exhibiting reinforcement learning of dialogue policies, Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Posters & Demonstrations on, EACL '06, p.76, 2006. ,
DOI : 10.3115/1608974.1608986
Using Markov decision process for learning dialogue strategies, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181), p.42, 1998. ,
DOI : 10.1109/ICASSP.1998.674402
A stochastic model of human-machine interaction for learning dialog strategies, IEEE Transactions on Speech and Audio Processing, vol.8, issue.1, pp.11-23, 2000. ,
DOI : 10.1109/89.817450
Reinforcement Learning for Dialog Management using Least-Squares Policy Iteration and Fast Feature Selection, Proc. of the International Conference on Speech Communication and Technologies (InterSpeech'09), p.82, 2009. ,
Testing the performance of spoken dialogue systems by means of an artificially simulated user, Artificial Intelligence Review, vol.12, issue.2, pp.291-323, 2006. ,
DOI : 10.1007/s10462-007-9059-9
Probabilistic Ontology Trees for Belief Tracking in Dialog Systems, Proc. of the SIGDIAL 2010 Conference, pp.37-46, 2010. ,
Algorithms for inverse reinforcement learning, Proc. of ICML, 2000. ,
BLEU: A method for automatic evaluation of machine translation, Proc. of the 40th Annual Meeting on Association for Computational Linguistics (ACL), p.54. ,
Universal Approximation Using Radial-Basis-Function Networks, Neural Computation, vol.2, issue.2, pp.246-257, 1991. ,
DOI : 10.1109/35.41401
A Probabilistic Description of Man-Machine Spoken Communication, 2005 IEEE International Conference on Multimedia and Expo, pp.410-413, 2005. ,
DOI : 10.1109/ICME.2005.1521447
Consistent goal-directed user model for realistic man-machine task-oriented spoken dialogue simulation, Proc. of ICME, pp.425-428, 2006. ,
URL : https://hal.archives-ouvertes.fr/hal-00215968
A probabilistic framework for dialog simulation and optimal strategy learning, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.2, pp.589-599, 2006. ,
DOI : 10.1109/TSA.2005.855836
URL : https://hal.archives-ouvertes.fr/hal-00207952
A survey on metrics for the evaluation of user simulations, The Knowledge Engineering Review, p.116, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00771654
Sample Efficient On-line Learning of Optimal Dialogue Policies with Kalman Temporal Differences, Proc. of International Joint Conference on Artificial Intelligence (IJCAI), 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00618252
Sample-efficient batch reinforcement learning for dialogue management optimization, ACM Transactions on Speech and Language Processing, vol.7, issue.3, pp.1-7, 2011. ,
DOI : 10.1145/1966407.1966412
URL : https://hal.archives-ouvertes.fr/hal-00617517
Training Bayesian networks for realistic man-machine spoken dialogue simulation, Proc. of IWSDS, p.49, 2009. ,
URL : https://hal.archives-ouvertes.fr/hal-00448636
Semantic graph clustering for pomdp-based spoken dialog systems, Proc. of Interspeech, pp.1321-1324, 2011. ,
Markov Decision Processes: Discrete Stochastic Dynamic Programming, pp.31-34, 1994. ,
DOI : 10.1002/9780470316887
Fundamentals of Speech Recognition, p.12, 1993. ,
Building natural language generation systems, p.13, 2000. ,
DOI : 10.1017/CBO9780511519857
URL : http://arxiv.org/abs/cmp-lg/9605002
Bootstrapping Reinforcement Learning-based Dialogue Strategies from Wizard-of-Oz data, p.102, 2008. ,
Simulations for learning dialogue strategies, Proc. of Interspeech, p.54, 2006. ,
Reinforcement Learning for Adaptive Dialogue Systems: A Data-driven Methodology for Dialogue Management and Natural Language Generation, Theory and Applications of Natural Language Processing, p.132, 2011. ,
DOI : 10.1007/978-3-642-24942-6
Spoken dialogue management using probabilistic reasoning, Proceedings of the 38th Annual Meeting on Association for Computational Linguistics , ACL '00, p.13, 2000. ,
DOI : 10.3115/1075218.1075231
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.8204
Reliability of internal prediction/estimation and its application. I. Adaptive action selection reflecting reliability of value function, Neural Networks, vol.17, issue.7, pp.935-952, 2004. ,
DOI : 10.1016/j.neunet.2004.05.004
Some studies in machine learning using the game of checkers, IBM Journal on Research and Development, pp.210-229, 1959. ,
Effects of the user model on simulation-based learning of dialogue strategies, 2005. ,
A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies, Proc. of ASRU'05, pp.97-126, 2006. ,
DOI : 10.1017/S0269888906000944
A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies, The Knowledge Engineering Review, vol.21, issue.02, pp.97-126, 2006. ,
DOI : 10.1017/S0269888906000944
Agenda-based user simulation for bootstrapping a POMDP dialogue system, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers on XX, NAACL '07, p.102, 2007. ,
DOI : 10.3115/1614108.1614146
Agenda-based user simulation for bootstrapping a POMDP dialogue system, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers on XX, NAACL '07, p.50, 2007. ,
DOI : 10.3115/1614108.1614146
Reinforcement learning for spoken dialogue systems, Proc. of NIPS, p.42, 1999. ,
The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs, Operations Research, vol.26, issue.2, pp.282-304, 1978. ,
DOI : 10.1287/opre.26.2.282
An analysis of model-based Interval Estimation for Markov Decision Processes, Journal of Computer and System Sciences, vol.74, issue.8, p.91, 2006. ,
DOI : 10.1016/j.jcss.2007.08.009
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, pp.35-36, 1998. ,
DOI : 10.1109/TNN.1998.712192
Transfer learning for reinforcement learning domains: A survey, Journal of Machine Learning Research, vol.10, pp.1633-1685, 2009. ,
Natural language generation for dialogue: system survey, p.25, 2003. ,
Information Retrieval, Butterworths, p.52, 1979. ,
Developing pedagogically effective tutorial dialogue tactics: Experiments and a testbed, Proc. of SLaTE Workshop on Speech and Language Technology in Education, 2007. ,
PARADISE: A framework for evaluating spoken dialogue agents, Proc. of the 35th Annual Meeting of the Association for Computational Linguistics (ACL'97), pp.271-280, 1997. ,
Q-learning, Machine Learning, pp.272-292, 1992. ,
Artificial companions, Proceedings of the 1st International Workshop on Machine Learning for Multimodal Interaction, p.19, 2004. ,
DOI : 10.1179/030801805X25945
Partially observable Markov decision processes for spoken dialog systems, Computer Speech & Language, vol.21, issue.2, pp.393-422, 2007. ,
DOI : 10.1016/j.csl.2006.06.008
Kernel-Based Least Squares Policy Iteration for Reinforcement Learning, IEEE Transactions on Neural Networks, vol.18, issue.4, pp.973-992, 2007. ,
DOI : 10.1109/TNN.2007.899161
Predictive statistical models for user modeling, User Modeling and User-Adapted Interaction, vol.11, issue.1/2, pp.5-18, 2001. ,
DOI : 10.1023/A:1011175525451