A task-based evaluation of the TRAINS-95 dialogue system

  • Evaluation of Systems
  • Conference paper
Dialogue Processing in Spoken Language Systems (DPSLS 1996)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1236)

Abstract

This paper describes a task-based evaluation methodology appropriate for dialogue systems such as the TRAINS-95 system, where a human and a computer interact and collaborate to solve a given problem. In task-based evaluations, techniques are assessed by their effect on task performance measures, such as how long it takes to develop a solution using the system and the quality of the final plan produced. We report results from recent experiments that explore the effect of word recognition accuracy on task performance.
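As an illustration of the independent variable mentioned in the abstract, word recognition accuracy is commonly computed from a word-level edit distance between a reference transcript and the recognizer output. The sketch below assumes that common definition, 1 - (S + D + I) / N; the abstract does not state which formula the authors used, and the function name and sample utterances are hypothetical.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy = 1 - (substitutions + deletions + insertions) / N,
    where N is the number of words in the reference transcript.

    Assumption: this is the conventional definition, not necessarily
    the exact measure used in the TRAINS-95 experiments.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 1.0 - prev[-1] / len(ref)


# Hypothetical utterance: one substituted word out of five.
print(word_accuracy("send the train to boston", "send a train to boston"))
```

Under this definition the value can fall below zero when the recognizer inserts many extra words, which is one reason a task-based measure such as time to solution can be a useful complement.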

Funding was gratefully received from NSF under Grant IRI-90-13160 and from ONR/DARPA under Grant N00014-92-J-1512. Many thanks to George Ferguson for developing the on-line tutorial, Eric Ringger for compiling the word recognition accuracy figures, Amon Seagull for advice on statistical measures, and Peter Heeman for numerous helpful comments. Thanks also to Mike Tanenhaus and Joy Hanna for their suggestions on the experimental design.

Editor information

Elisabeth Maier, Marion Mast, Susann LuperFoy

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sikorski, T., Allen, J.F. (1997). A task-based evaluation of the TRAINS-95 dialogue system. In: Maier, E., Mast, M., LuperFoy, S. (eds) Dialogue Processing in Spoken Language Systems. DPSLS 1996. Lecture Notes in Computer Science, vol 1236. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63175-5_48

  • DOI: https://doi.org/10.1007/3-540-63175-5_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63175-0

  • Online ISBN: 978-3-540-69206-5

  • eBook Packages: Springer Book Archive
