A Comprehensive Methodology for Evaluating Conversation-Based Interfaces to Relational Databases (C-BIRDs)

Owda, Majdi; Owda, Amani Yousef; Gasir, Fathi

doi:10.1007/978-3-030-55187-2_17

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1251)

Included in the following conference series:

Proceedings of SAI Intelligent Systems Conference

Abstract

Evaluation can be defined as the process of determining the significance of a research output. It is usually carried out by devising a well-structured study of that output, in which one or more evaluation measures are applied under careful inspection. This paper reviews evaluation techniques for Conversational Agents (CAs) and Natural Language Interfaces to Databases (NLIDBs). It then introduces a customized evaluation methodology developed for Conversation-Based Interfaces to Relational Databases (C-BIRDs). The methodology is divided into two groups of measures. The first group is quantitative and comprises two measures: task success and dialogue length. The second group is qualitative and comprises prototype ease of use, naturalness of system responses, positive/negative emotion, appearance, text on screen, organization of information, and error message clarity. The methodology is then elaborated with a discussion of, and recommendations on, sample size, experimental setup, and scaling, in order to provide a comprehensive evaluation methodology for C-BIRDs. In conclusion, the resulting methodology is a better way of identifying the strengths and weaknesses of C-BIRDs than single-measure evaluations.
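
As an illustration only, the two groups of measures named in the abstract could be aggregated over a set of evaluation sessions roughly as follows. This is a minimal sketch and not the authors' implementation: the `Session` record, the `summarize` function, the use of turn counts as "dialogue length", and the 1 to 5 rating scale are all assumptions introduced here, since the abstract does not specify them.

```python
from dataclasses import dataclass, field
from statistics import mean

# Qualitative measures listed in the abstract. The identifiers and the
# 1-5 rating scale are assumptions made for this sketch.
QUALITATIVE_MEASURES = (
    "ease_of_use",
    "naturalness_of_responses",
    "positive_negative_emotion",
    "appearance",
    "text_on_screen",
    "organization_of_information",
    "error_message_clarity",
)


@dataclass
class Session:
    """One participant attempting one task (hypothetical record layout)."""
    task_completed: bool          # quantitative: did the user reach the goal?
    turns: int                    # quantitative: dialogue length, assumed to be a turn count
    ratings: dict = field(default_factory=dict)  # qualitative: measure name -> score


def summarize(sessions):
    """Aggregate both groups of measures over a list of sessions."""
    if not sessions:
        raise ValueError("at least one session is required")
    summary = {
        # Fraction of sessions in which the task was completed.
        "task_success": sum(s.task_completed for s in sessions) / len(sessions),
        # Average number of turns per session.
        "dialogue_length": mean(s.turns for s in sessions),
        "qualitative": {},
    }
    for name in QUALITATIVE_MEASURES:
        # Only sessions that actually rated this measure contribute to its mean.
        scores = [s.ratings[name] for s in sessions if name in s.ratings]
        if scores:
            summary["qualitative"][name] = mean(scores)
    return summary


sessions = [
    Session(True, 4, {"ease_of_use": 4, "error_message_clarity": 5}),
    Session(False, 8, {"ease_of_use": 2}),
]
result = summarize(sessions)
```

Reporting the quantitative and qualitative groups side by side, rather than collapsing them into one score, reflects the abstract's point that a multi-measure evaluation exposes strengths and weaknesses that a single measure would hide.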

Author information

Authors and Affiliations

Department of Computing and Mathematics, Manchester Metropolitan University, Chester Street, Manchester, M1 5GD, UK
Majdi Owda
Department of Electrical and Electronic Engineering, The University of Manchester, Sackville Street, Manchester, M13 9PL, UK
Amani Yousef Owda
Computer Science Department, Faculty of Information Technology, Misurata University, Misurata, Libya
Fathi Gasir

Corresponding author

Correspondence to Majdi Owda.

Editor information

Editors and Affiliations

Saga University, Saga, Japan
Kohei Arai
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Supriya Kapoor
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Rahul Bhatia

About this paper

Cite this paper

Owda, M., Owda, A.Y., Gasir, F. (2021). A Comprehensive Methodology for Evaluating Conversation-Based Interfaces to Relational Databases (C-BIRDs). In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2020. Advances in Intelligent Systems and Computing, vol 1251. Springer, Cham. https://doi.org/10.1007/978-3-030-55187-2_17

DOI: https://doi.org/10.1007/978-3-030-55187-2_17
Published: 25 August 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55186-5
Online ISBN: 978-3-030-55187-2
eBook Packages: Intelligent Technologies and Robotics (R0)
