Abstract
This paper proposes a new technique for testing the performance of spoken dialogue systems by artificially simulating the behaviour of three types of user (very cooperative, cooperative and not very cooperative) who interact with a system by means of spoken dialogues. Experiments using the technique were carried out to test the performance of a previously developed dialogue system designed for the fast-food domain and working with two kinds of language model for automatic speech recognition: one based on 17 prompt-dependent language models, and the other based on one prompt-independent language model. The use of the simulated user enables the identification of problems relating to the speech recognition, spoken language understanding, and dialogue management components of the system. In particular, these experiments revealed problems with the recognition and understanding of postal codes and addresses, and with the lengthy sequences of repetitive confirmation turns required to correct these errors. By employing a simulated user in a range of different experimental conditions, sufficient data can be generated to support a systematic analysis of potential problems and to enable fine-grained tuning of the system.
About this article
Cite this article
López-Cózar, R., Callejas, Z. & McTear, M. Testing the performance of spoken dialogue systems by means of an artificially simulated user. Artif Intell Rev 26, 291–323 (2006). https://doi.org/10.1007/s10462-007-9059-9
DOI: https://doi.org/10.1007/s10462-007-9059-9