Towards Explainable Artificial Intelligence in Financial Fraud Detection: Using Shapley Additive Explanations to Explore Feature Importance

Fukas, Philipp; Rebstadt, Jonas; Menzel, Lukas; Thomas, Oliver

doi:10.1007/978-3-031-07472-1_7

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13295)

Included in the following conference series: International Conference on Advanced Information Systems Engineering

Abstract

As organizations have grown in number and complexity, detecting financial fraud has come to require a tremendous amount of manual effort. Powerful machine learning methods have therefore become critical for reducing the workload of financial auditors. However, as machine learning models have grown increasingly complex over the years, a significant need for transparency of artificial intelligence systems in the accounting domain has emerged. In this paper, we propose a novel approach that uses Shapley additive explanations to improve the transparency of models in financial fraud detection. Our information systems engineering procedure follows the Cross-Industry Standard Process for Data Mining (CRISP-DM) and comprises a systematic literature review of machine learning methods in fraud detection, a systematic development process, and an explainable artificial intelligence analysis. By training downstream Logistic Regression, Support Vector Machine, and eXtreme Gradient Boosting classifiers on a dataset of publicly traded companies convicted of financial statement fraud by the United States Securities and Exchange Commission, we show how the key items for financial statement fraud detection and their directionality can be identified using Shapley additive explanations. This work contributes to the current state of research by increasing model transparency and by generating insights into important financial statement fraud detection variables.
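
To make the idea of identifying key items "and their directionality" concrete, the following is a minimal sketch of how Shapley values attribute a single prediction to its input features. It is not the authors' implementation and does not use their data or models: the toy linear score, its weights, the three feature names, and the background reference point are all hypothetical and chosen only for illustration. The sketch enumerates every feature coalition exactly, which is feasible only for a handful of features; practical tools rely on approximations or model-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values for one instance x.

    Features outside a coalition are replaced by the corresponding value of a
    single background reference point (an assumption made for this sketch).
    """
    n = len(x)

    def value(coalition):
        # Model output when only the features in `coalition` take their real values.
        z = [x[i] if i in coalition else background[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy model: a linear fraud score over three made-up items.
FEATURES = ["receivables", "inventory", "cash"]  # illustrative names only
WEIGHTS = [0.8, -0.5, 0.3]                       # illustrative weights only
BIAS = -0.2


def linear_score(z):
    return BIAS + sum(w * v for w, v in zip(WEIGHTS, z))


x = [2.0, 1.0, 0.0]           # instance to explain (made-up values)
background = [1.0, 1.0, 1.0]  # reference point (made-up values)

phi = shapley_values(linear_score, x, background)
for name, contribution in zip(FEATURES, phi):
    direction = "raises" if contribution > 0 else "lowers" if contribution < 0 else "does not change"
    print(f"{name}: {contribution:+.3f} ({direction} the score)")
```

The sign of each value gives the direction in which a feature moves the score relative to the reference point, and the values sum to the difference between the score for the instance and the score for the reference point. For a linear score like this one, each value reduces to the weight times the feature's deviation from the reference.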

Notes

1. https://github.com/JarFraud/FraudDetection

Author information

Authors and Affiliations

Osnabrück University, Osnabrück, Lower Saxony, Germany
Philipp Fukas, Lukas Menzel & Oliver Thomas
German Research Center for Artificial Intelligence, Osnabrück, Lower Saxony, Germany
Philipp Fukas, Jonas Rebstadt & Oliver Thomas
Strategion GmbH, Osnabrück, Lower Saxony, Germany
Philipp Fukas & Jonas Rebstadt

Corresponding author

Correspondence to Philipp Fukas.

Editor information

Editors and Affiliations

Department of Service and Information System Engineering (ESSI), Universitat Politècnica de Catalunya, Barcelona, Spain
Xavier Franch
Ghent University, Gent, Belgium
Geert Poels
Ghent University, Gent, Belgium
Frederik Gailly
KU Leuven, Leuven, Belgium
Monique Snoeck

About this paper

Cite this paper

Fukas, P., Rebstadt, J., Menzel, L., Thomas, O. (2022). Towards Explainable Artificial Intelligence in Financial Fraud Detection: Using Shapley Additive Explanations to Explore Feature Importance. In: Franch, X., Poels, G., Gailly, F., Snoeck, M. (eds) Advanced Information Systems Engineering. CAiSE 2022. Lecture Notes in Computer Science, vol 13295. Springer, Cham. https://doi.org/10.1007/978-3-031-07472-1_7

DOI: https://doi.org/10.1007/978-3-031-07472-1_7
Published: 03 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-07471-4
Online ISBN: 978-3-031-07472-1
eBook Packages: Computer Science; Computer Science (R0)