Towards Explainable Artificial Intelligence in Financial Fraud Detection: Using Shapley Additive Explanations to Explore Feature Importance

  • Conference paper
  • In: Advanced Information Systems Engineering (CAiSE 2022)

Abstract

As the number and complexity of organizations have increased, detecting financial fraud requires a tremendous amount of manual effort. Powerful machine learning methods have therefore become critical for reducing the workload of financial auditors. However, as most machine learning models have grown increasingly complex over the years, a significant need for transparency of artificial intelligence systems in the accounting domain has emerged. In this paper, we propose a novel approach that uses Shapley additive explanations to improve the transparency of models in financial fraud detection. Our information systems engineering procedure follows the cross-industry standard process for data mining and comprises a systematic literature review of machine learning methods in fraud detection, a systematic development process and an explainable artificial intelligence analysis. By training downstream Logistic Regression, Support Vector Machine and eXtreme Gradient Boosting classifiers on a dataset of publicly traded companies convicted of financial statement fraud by the United States Securities and Exchange Commission, we show how the key items for financial statement fraud detection and their directionality can be identified using Shapley additive explanations. In doing so, we contribute to the current state of research by increasing model transparency and by generating insights into important financial statement fraud detection variables.
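
To make the idea of Shapley additive explanations concrete, the following minimal sketch computes exact Shapley values for a single prediction of a small linear scoring model. It is an illustration under stated assumptions, not the pipeline used in the paper: the feature names, weights, background rows and the `firm` observation are all hypothetical, and the brute-force enumeration of feature coalitions only scales to a handful of features.

```python
from itertools import combinations
from math import factorial

# Hypothetical linear fraud score (log-odds). Feature names, weights and bias
# are illustrative assumptions, not values taken from the paper.
FEATURES = ["receivables_growth", "asset_turnover", "leverage"]
WEIGHTS = [1.5, -0.8, 0.4]
BIAS = -2.0


def score(x):
    """Return the model output (log-odds of fraud) for one feature vector."""
    return BIAS + sum(w * v for w, v in zip(WEIGHTS, x))


def coalition_value(model, x, background, subset):
    """Mean model output when features in `subset` are taken from x and all
    remaining features are taken from each background row in turn."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi


# Hypothetical background sample and the observation to be explained.
background = [[0.0, 1.0, 0.5], [0.2, 1.2, 0.3], [0.1, 0.8, 0.7]]
firm = [0.9, 0.4, 0.6]

phi = shapley_values(score, firm, background)
base_value = coalition_value(score, firm, background, set())

for name, value in zip(FEATURES, phi):
    direction = "raises" if value > 0 else "lowers"
    print(f"{name}: {value:+.3f} ({direction} the fraud score)")
```

The sign of each value gives the directionality of a feature for this observation, and the values sum to the difference between the prediction and the average prediction over the background sample. For larger models, approximation methods are typically used instead of full enumeration.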

Notes

  1. https://github.com/JarFraud/FraudDetection

Author information

Corresponding author

Correspondence to Philipp Fukas.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Fukas, P., Rebstadt, J., Menzel, L., Thomas, O. (2022). Towards Explainable Artificial Intelligence in Financial Fraud Detection: Using Shapley Additive Explanations to Explore Feature Importance. In: Franch, X., Poels, G., Gailly, F., Snoeck, M. (eds) Advanced Information Systems Engineering. CAiSE 2022. Lecture Notes in Computer Science, vol 13295. Springer, Cham. https://doi.org/10.1007/978-3-031-07472-1_7

  • DOI: https://doi.org/10.1007/978-3-031-07472-1_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-07471-4

  • Online ISBN: 978-3-031-07472-1

  • eBook Packages: Computer Science (R0)
