DOI: 10.1145/3422392.3422421
research-article

Usability and User eXperience Evaluation of Conversational Systems: A Systematic Mapping Study

Published: 21 December 2020

ABSTRACT

Conversational Systems (CSs) are software that uses the user's voice to perform some action. Before these systems are made available to users, they must be evaluated. In this sense, Usability and User eXperience (UX) evaluations contribute to the verification of software quality, since they assess several aspects such as efficiency, effectiveness, immersion, and user satisfaction. Therefore, the goal of our Systematic Mapping Study (SMS) is to identify the evaluation technologies (methods, techniques, models, among others) used by researchers and professionals to evaluate the Usability and UX of CSs. We selected 39 papers for data extraction and, based on these works, identified 31 different evaluation technologies. In addition, our SMS extracted the characteristics of the technologies, CSs, and empirical studies described in the papers. Our results reveal a lack of evaluation technologies for CSs that combine the concepts of Usability and UX and that have undergone empirical evaluation. Moreover, we observed that researchers tend to create their own questionnaires according to the needs of each study. Overall, our SMS presents data about the researched topic, describes the gaps, and contributes to the scientific community that evaluates the Usability and UX of CSs.


      Published in

      SBES '20: Proceedings of the XXXIV Brazilian Symposium on Software Engineering
      October 2020
      901 pages
      ISBN: 9781450387538
      DOI: 10.1145/3422392

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 147 of 427 submissions, 34%
