
XAI with Machine Teaching When Humans Are (Not) Informed About the Irrelevant Features

  • Conference paper
  • First Online:
Machine Learning and Knowledge Discovery in Databases: Research Track (ECML PKDD 2023)

Abstract

Exemplar-based explainable artificial intelligence (XAI) aims at creating human understanding of the behaviour of an AI system, usually a machine learning model, through examples. The advantage of this approach is that humans construct their own explanation in their own internal language. But which examples should be chosen? Existing frameworks fall short of capturing all the elements that contribute to this process. In this paper, we propose a comprehensive XAI framework based on machine teaching. The traditional trade-off between the fidelity and the complexity of the explanation is transformed here into a trade-off between the complexity of the examples and the fidelity the human achieves with respect to the behaviour of the ML system to be explained. We analyse a concept class of Boolean functions learned by a convolutional neural network classifier over a dataset of images of possibly rotated and resized letters. We assume the human learner has a strong prior (Karnaugh maps over Boolean functions). Our explanation procedure then behaves like a machine teaching session that optimises the trade-off between examples and fidelity. We include an experimental evaluation and several human studies in which we analyse how well humans can be taught these Boolean functions by means of the explanatory examples generated by our framework. We explore the effect of telling humans which features are essential, as well as the effect of their priors, and find that concept identification is more successful than with randomly sampled examples.
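The abstract describes a machine teaching session that searches for a small set of labelled examples from which a learner with a simplicity prior recovers a target Boolean concept. The sketch below illustrates that idea in miniature, assuming a toy learner whose hypothesis class is conjunctions of literals and whose prior prefers fewer literals (a rough stand-in for a Karnaugh-map-style simplicity bias). All names, the hypothesis class, and the exhaustive search are illustrative assumptions, not the paper's actual implementation.

```python
from itertools import product, combinations

# Hypothetical machine-teaching sketch for a Boolean target concept.
# The teacher searches for the smallest labelled example set whose
# simplest consistent hypothesis (the learner's choice) equals the target.

N = 3  # number of Boolean features
INPUTS = list(product([0, 1], repeat=N))

def conjunctions():
    """All conjunctions of literals over N variables (None = variable unused)."""
    return list(product([None, 0, 1], repeat=N))

def evaluate(hyp, x):
    """A conjunction is true iff every used variable matches its literal."""
    return all(x[i] == v for i, v in enumerate(hyp) if v is not None)

def complexity(hyp):
    """Number of literals; the learner's prior prefers fewer."""
    return sum(v is not None for v in hyp)

def learner(examples):
    """Return the simplest hypothesis consistent with the labelled examples."""
    consistent = [h for h in conjunctions()
                  if all(evaluate(h, x) == y for x, y in examples)]
    return min(consistent, key=complexity) if consistent else None

def teach(target, max_size=4):
    """Smallest example set from which the learner recovers `target`."""
    for k in range(1, max_size + 1):
        for xs in combinations(INPUTS, k):
            examples = [(x, evaluate(target, x)) for x in xs]
            if learner(examples) == target:
                return examples
    return None

# Target concept: x0 AND NOT x2 (x1 is an irrelevant feature).
target = (1, None, 0)
print(teach(target))
```

Note how the irrelevant feature x1 shows up in the teaching set: the examples must simultaneously rule out every simpler hypothesis, which is exactly the example-complexity versus fidelity trade-off the framework optimises.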

A preliminary version of this work was presented as a poster at AAIP@IJCLR2022. Supported by the Norwegian Research Council, project Machine Teaching for XAI.


Notes

  1.

    https://github.com/BrigtHaavardstun/ExplainableAI.

  2.

    We conducted t-tests for all pairs of groups to test whether means differ statistically significantly and got p-values 0.0297, 0.0013, and 0.0747 for pairs (I, II), (I, III) and (II, III), respectively.

  3.

    We use \(\delta ^* = \textit{Number of present letters}\).

  4.

    We say that the system is aligned when the prior of \(L_M\) is similar to the prior of \(L_H\).


Author information

Correspondence to Brigt Arve Toppe Håvardstun.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 425 KB)


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Håvardstun, B.A.T., Ferri, C., Hernández-Orallo, J., Parviainen, P., Telle, J.A. (2023). XAI with Machine Teaching When Humans Are (Not) Informed About the Irrelevant Features. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol 14171. Springer, Cham. https://doi.org/10.1007/978-3-031-43418-1_23


  • DOI: https://doi.org/10.1007/978-3-031-43418-1_23

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43417-4

  • Online ISBN: 978-3-031-43418-1

  • eBook Packages: Computer Science, Computer Science (R0)
