Abstract
This chapter identifies common knowledge and practice in research methodology and applies it to software evaluation, in particular the evaluation of embodied conversational agents. It discusses how to formulate a good research question, which research strategy to use, which data collection methods are most appropriate, and how to select the right participants. It also addresses the reliability and validity of the data sets, and concludes with a list of guidelines to keep in mind when setting up and conducting empirical evaluation studies of embodied conversational agents.
Copyright information
© 2004 Kluwer Academic Publishers
About this chapter
Cite this chapter
Christoph, N. (2004). Empirical Evaluation Methodology for Embodied Conversational Agents. In: Ruttkay, Z., Pelachaud, C. (eds) From Brows to Trust. Human-Computer Interaction Series, vol 7. Springer, Dordrecht. https://doi.org/10.1007/1-4020-2730-3_3
DOI: https://doi.org/10.1007/1-4020-2730-3_3
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-2729-1
Online ISBN: 978-1-4020-2730-7
eBook Packages: Computer Science, Computer Science (R0)