ABSTRACT
Faceted search systems enable users to filter results by selecting values along different dimensions or facets. Traditionally, facets have corresponded to properties of information items that are part of the document metadata. Recently, faceted search systems have begun to use machine learning to automatically associate documents with facet-values that are more subjective and abstract. Examples include search systems that support topic-based filtering of research articles, concept-based filtering of medical documents, and tag-based filtering of images. While machine learning can be used to infer facet-values when the collection is too large for manual annotation, machine-learned classifiers make mistakes. In such cases, it is desirable to have a scrutable system that explains why a filtered result is relevant to a facet-value. Such explanations are missing from current systems. In this paper, we investigate how explainability features can help users interpret results filtered using machine-learned facets. We consider two explainability features: (1) showing prediction confidence values and (2) highlighting rationale sentences that played an influential role in predicting a facet-value. We report on a crowdsourced study involving 200 participants. Participants were asked to scrutinize movie plot summaries predicted to satisfy multiple genres and indicate their agreement or disagreement with the system. Participants were exposed to four interface conditions. We found that both explainability features had a positive impact on participants' perceptions and performance. While both features helped, the sentence-highlighting feature played a more instrumental role in enabling participants to reject false positive cases. We discuss implications for designing tools to help users scrutinize automatically assigned facet-values.
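The two explainability features studied above (prediction confidence and rationale-sentence highlighting) can be illustrated with a toy sketch. Everything below — the keyword lexicon, the `score_sentence` and `predict_with_explanation` helpers, and the logistic squashing of scores into a pseudo-confidence — is an illustrative assumption for exposition, not the classifier or explanation method used in the paper:

```python
import math

# Toy genre lexicon -- an illustrative assumption, not the paper's model.
GENRE_KEYWORDS = {
    "horror": {"ghost", "haunted", "terror", "scream"},
    "romance": {"love", "wedding", "heart", "kiss"},
}

def score_sentence(sentence, keywords):
    """Count how many genre keywords a sentence contains."""
    tokens = set(sentence.lower().replace(".", "").split())
    return len(tokens & keywords)

def predict_with_explanation(summary, genre):
    """Return (pseudo-confidence in [0, 1], most influential sentence).

    The highest-scoring sentence plays the role of the highlighted
    rationale; the summed score, squashed by a logistic function,
    plays the role of the displayed confidence value.
    """
    keywords = GENRE_KEYWORDS[genre]
    sentences = [s.strip() for s in summary.split(".") if s.strip()]
    scores = [score_sentence(s, keywords) for s in sentences]
    total = sum(scores)
    confidence = 1 / (1 + math.exp(-total))
    rationale = sentences[scores.index(max(scores))] if total else None
    return confidence, rationale

summary = ("A young couple moves into an old mansion. "
           "A ghost begins to haunt the corridors at night. "
           "They must uncover the terror behind its scream.")
conf, rationale = predict_with_explanation(summary, "horror")
# A scrutinizing user would see a high confidence for "horror" and the
# third sentence highlighted as the rationale.
```

In a real system the confidence would come from a calibrated classifier and the rationale from a learned extraction model; the sketch only shows how the two signals are surfaced together so a user can accept or reject a predicted facet-value.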
A Study of Explainability Features to Scrutinize Faceted Filtering Results