research-article

Crowdsourcing and Evaluating Concept-driven Explanations of Machine Learning Models

Published: 22 April 2021

Abstract

An important challenge in building explainable artificially intelligent (AI) systems is designing interpretable explanations. AI models often use low-level data features that may be hard for humans to interpret. Recent research suggests that situating machine decisions in abstract, human-understandable concepts can help. However, it is challenging to determine the right level of conceptual mapping. In this research, we explore granularity (of data features) and context (of data instances) as dimensions underpinning conceptual mappings. Based on these measures, we explore strategies for designing explanations in classification models. We introduce an end-to-end concept elicitation pipeline that supports gathering high-level concepts for a given data set. Through crowd-sourced experiments, we examine how providing conceptual information shapes the effectiveness of explanations, finding that a balance between coarse and fine-grained explanations helps users better estimate model predictions. We organize our findings into systematic themes that can inform design considerations for future systems.

Published in

Proceedings of the ACM on Human-Computer Interaction, Volume 5, Issue CSCW1 (CSCW), April 2021, 5016 pages
EISSN: 2573-0142
DOI: 10.1145/3460939

Copyright © 2021 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States
