research-article

Report on the 8th Workshop on Search-Oriented Conversational Artificial Intelligence (SCAI 2024) at CHIIR 2024

Published: 07 August 2024

Abstract

Conversational agents are increasingly integrated into our daily routines, assisting us with tasks that range from simple commands, such as scheduling events, to more complex conversational search interactions. Such conversational search systems are traditionally evaluated with word-overlap metrics such as F1 score and accuracy. The full-day workshop on Search-Oriented Conversational Artificial Intelligence (SCAI) at CHIIR 2024 explored the evaluation of conversational search systems from the user's perspective. This interactive workshop included multiple panel discussions and working groups that developed and discussed innovative, user-centered evaluation methods for these systems. This paper, co-authored by organizers and participants of the workshop, summarizes the insights gathered from the panel discussions and working groups.
Date: 14 March 2024.
Website: https://scai.info/scai-2024/.
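
For readers unfamiliar with the word-overlap metrics mentioned in the abstract, the following is a minimal sketch of a token-level F1 score between a system response and a reference answer. It is an illustration only, not the evaluation code of any system discussed at the workshop. The function name `token_f1`, the lowercasing, and the whitespace tokenization are all assumptions made for this example.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a response and a reference (illustrative sketch)."""
    # Assumption: lowercase and split on whitespace; real setups often
    # also strip punctuation and articles.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty strings count as a match; one empty string as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection: a shared token counts at most as often
    # as it occurs in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = token_f1(
        "the workshop took place in Sheffield",
        "the workshop was held in Sheffield",
    )
    print(f"token F1 = {score:.2f}")
```

In this example the two sentences say the same thing, yet four of six tokens overlap and the score is about 0.67. A metric of this kind rewards shared wording rather than usefulness to the person asking, which is one way to read the workshop's interest in evaluation from the user's perspective.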

Cited By

  • (2024) Towards Investigating Biases in Spoken Conversational Search. Companion Proceedings of the 26th International Conference on Multimodal Interaction, pages 61-66. DOI: 10.1145/3686215.3690156. Online publication date: 4-Nov-2024
  • (2024) Towards Detecting and Mitigating Cognitive Bias in Spoken Conversational Search. Adjunct Proceedings of the 26th International Conference on Mobile Human-Computer Interaction, pages 1-10. DOI: 10.1145/3640471.3680245. Online publication date: 21-Sep-2024

Published In

ACM SIGIR Forum, Volume 58, Issue 1
June 2024
182 pages
DOI:10.1145/3687273
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States
