research-article

Crowdsourcing and Evaluating Concept-driven Explanations of Machine Learning Models

Authors:
Swati Mishra, Cornell University, Ithaca, NY, USA
Jeffrey M. Rzeszotarski, Cornell University, Ithaca, NY, USA

Proceedings of the ACM on Human-Computer Interaction, Volume 5, Issue CSCW1, Article No. 139, pp. 1–26. https://doi.org/10.1145/3449213

Published: 22 April 2021

Abstract

An important challenge in building explainable artificially intelligent (AI) systems is designing interpretable explanations. AI models often use low-level data features that may be hard for humans to interpret. Recent research suggests that situating machine decisions in abstract, human-understandable concepts can help. However, it is challenging to determine the right level of conceptual mapping. In this research, we explore granularity (of data features) and context (of data instances) as dimensions underpinning conceptual mappings. Based on these measures, we explore strategies for designing explanations in classification models. We introduce an end-to-end concept elicitation pipeline that supports gathering high-level concepts for a given data set. Through crowd-sourced experiments, we examine how providing conceptual information shapes the effectiveness of explanations, finding that a balance between coarse and fine-grained explanations helps users better estimate model predictions. We organize our findings into systematic themes that can inform design considerations for future systems.

Index Terms

Crowdsourcing and Evaluating Concept-driven Explanations of Machine Learning Models
1. Human-centered computing
  1. Collaborative and social computing
    1. Empirical studies in collaborative and social computing
  2. Visualization
    1. Empirical studies in visualization

Published in
Proceedings of the ACM on Human-Computer Interaction, Volume 5, Issue CSCW1 (CSCW), April 2021, 5016 pages
EISSN: 2573-0142
DOI: 10.1145/3460939
Editor: Jeff Nichols, Apple Inc., United States
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
classification
concepts
explanations
machine learning
Qualifiers
- research-article