Internal fraud risk reduction: Results of a data mining case study
Introduction
That fraud is a significant, if unwelcome, part of business is nothing new. Fraud is a multi-million dollar business concern, as several research studies reveal and as reflected in recent surveys by the Association of Certified Fraud Examiners (ACFE, 2008) and PricewaterhouseCoopers (PwC, 2007). The ACFE study, conducted in 2007–2008 in the United States, reported that companies estimate a loss of 7% of annual revenues to fraud. Applied to the 2008 United States Gross Domestic Product of US$ 14,196 billion, this would translate to approximately US$ 994 billion in fraud losses for the United States. The PwC worldwide study revealed that 43% of the companies surveyed had fallen victim to economic crime in the years 2006 and 2007 (PwC, 2007). The average financial damage to these companies was US$ 2.42 million per company over two years. These survey reports demonstrate the magnitude of fraud that companies face today.
Numerous academic studies have used data mining to investigate fraud detection (Brockett et al., 2002; Cortes et al., 2002; Estévez et al., 2006; Fanning & Cogger, 1998; Kim & Kwon, 2006; Kirkos et al., 2007). Although prior research has investigated fraud within different domains and using different techniques, the studies all focused on external fraud and used predictive data mining. Our study differs in several ways. First, we focus on internal fraud, because internal fraud represents the majority of costs identified by the PwC and ACFE surveys. Second, while prior studies examined only fraud detection, our study investigates internal fraud risk reduction, the combination of fraud detection and fraud prevention. Companies' risk exposure would be substantially greater if they focused only on fraud detection, a reactive working method. Companies use a combination of detection and prevention controls to help minimize their fraud risk. Hence, our study provides a more comprehensive view of the real world. Third, prior research used predictive data mining or, more precisely, predictive classification techniques. The purpose of these techniques is to classify an observation as fraudulent or not. Because we focus on risk reduction rather than detection, we believe descriptive data mining is better suited. Descriptive data mining provides insights into the complete data set rather than only one aspect of it, i.e., fraudulent or not. This characteristic is valuable for assessing the fraud risk in selected business processes.
The aim of this paper is to provide a framework for both researchers and practitioners to reduce internal fraud risk and to present empirical results obtained by applying this framework. Based on data collected from an international financial service provider, we investigated fraud risk reduction in the procurement process. The results are promising. In both the subset of recent purchasing orders and the subset of old ones, a small cluster with a high risk profile was found. The population of old purchasing orders was small enough that full examination of the specified cluster was feasible. Our analysis suggested a closer examination of ten cases. Of these ten purchasing orders, nine circumvented procedures (creating windows of opportunity to commit fraud), and one was the result of an error.
In the following sections we explain the methodology used in this study, the data set, the latent class clustering algorithm, and the results of investigating the procurement business process of the case company. We first apply a univariate analysis to explore the data and thereafter a multivariate analysis. We compare the results of both analyses and conclude with the implications of our findings.
Section snippets
Methodology
The applied methodology is the IFR² Framework of Jans et al. (2009), summarized in Fig. 1. The IFR² Framework, which stands for Internal Fraud Risk Reduction, is a conceptual framework to guide research in internal fraud risk reduction. As a first step, an organization should select a business process that it considers worth investigating. The selection can be motivated by the following reasons: a business process that involves large cash flows, that is unstructured, that
Data set
Based on the selected methodology, we focus on the application of the IFR² Framework to a real-life database. The data set used in this study was obtained from an international financial services provider. The corporation ranks among the 20 largest European financial institutions. The business process selected for internal fraud risk reduction is procurement. This selection is inspired by the lack of existing fraud files for the procurement business process within the case
Latent class clustering algorithm
For a descriptive data mining approach, we chose a latent class (LC) clustering algorithm. LC clustering was preferred to the more traditional K-means clustering for several reasons. The most important reason is that this algorithm allows for overlapping clusters. In LC clustering, each observation is given a set of probabilities expressing how likely it is to belong to each cluster. For example, in a 3 cluster setting observation A has p = .80 for cluster 1, p = .20 for cluster 2 and p = .00
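As a rough illustration of how such membership probabilities can arise, the sketch below computes posterior cluster probabilities for a single observation under a generic latent class model with categorical attributes that are assumed independent within each cluster. The class sizes, conditional probabilities, and the function name posterior_memberships are hypothetical; they are not the parameters or the software used in this study.

```python
from math import prod


def posterior_memberships(class_priors, item_probs, observation):
    """Return P(cluster | observation) for every latent class.

    class_priors: prior class sizes, summing to 1.
    item_probs:   item_probs[c][j][v] = P(attribute j has value v | class c).
    observation:  observed category index for each attribute.
    """
    # Joint probability of the observation and each class, assuming the
    # attributes are independent given the class (local independence).
    joint = [
        prior * prod(item_probs[c][j][v] for j, v in enumerate(observation))
        for c, prior in enumerate(class_priors)
    ]
    total = sum(joint)
    # Bayes' rule: normalize so the memberships sum to 1.
    return [value / total for value in joint]


# Hypothetical parameters: 3 clusters, 2 binary attributes.
class_priors = [0.5, 0.3, 0.2]
item_probs = [
    [[0.9, 0.1], [0.8, 0.2]],  # cluster 1
    [[0.4, 0.6], [0.5, 0.5]],  # cluster 2
    [[0.1, 0.9], [0.2, 0.8]],  # cluster 3
]

probs = posterior_memberships(class_priors, item_probs, [0, 0])

# A hard (K-means style) assignment would keep only the most likely cluster
# and discard the remaining membership information.
hard_label = max(range(len(probs)), key=probs.__getitem__)
```

Keeping the full probability vector, rather than only hard_label, is what allows an observation to belong partly to more than one cluster.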
Model specifications
Before turning to the core model, which applies descriptive data mining to behavior-describing attributes, we apply univariate clustering to provide a comparative basis for exploring the data. Performing univariate analyses is a common way of exploring the data at hand before turning to more complex analyses, such as multivariate analysis. The univariate analysis is applied to obvious attributes. Three numerical attributes were selected: number of changes (Model A),
Model specifications
Although the univariate clustering analysis showed some interesting deviating characteristics, it yielded contradictory information, depending on which attribute was selected to cluster on. A multivariate analysis takes several attributes into account at the same time and is therefore better suited to a real-life scenario than selecting only one attribute at a time. We also needed multivariate analysis to conduct a data mining step. Before applying this analysis, the third step of our
Audit by domain experts
Because auditing all 408 POs of cluster 3 is too time consuming, it is useful to take a sample of POs that were made by one of the creators described above or involve one of those suppliers (or both). To this end, a smaller sample of cluster 3 was extracted by taking only those POs made by one of the six creators or involving one of the three suppliers most represented in the cluster. This yielded a sample of 38 POs. Why is it that they merely induced POs in this small
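The sampling rule described above can be sketched as follows. This is a minimal illustration and not the procedure used at the case company: the field names (creator, supplier), the toy POs, and the helper select_audit_sample are assumptions made for the example.

```python
from collections import Counter


def select_audit_sample(cluster_pos, flagged_creators, top_n_suppliers=3):
    """Keep POs made by a flagged creator or involving a top supplier (or both)."""
    # Count how often each supplier appears in the cluster.
    supplier_counts = Counter(po["supplier"] for po in cluster_pos)
    # Suppliers that are most represented in the cluster.
    top_suppliers = {
        name for name, _ in supplier_counts.most_common(top_n_suppliers)
    }
    return [
        po
        for po in cluster_pos
        if po["creator"] in flagged_creators or po["supplier"] in top_suppliers
    ]


# Toy cluster; every value here is invented for illustration.
cluster_pos = [
    {"id": 1, "creator": "u1", "supplier": "S1"},
    {"id": 2, "creator": "u2", "supplier": "S1"},
    {"id": 3, "creator": "u3", "supplier": "S1"},
    {"id": 4, "creator": "u4", "supplier": "S2"},
    {"id": 5, "creator": "u5", "supplier": "S2"},
    {"id": 6, "creator": "u6", "supplier": "S3"},
    {"id": 7, "creator": "u7", "supplier": "S4"},
    {"id": 8, "creator": "u8", "supplier": "S5"},
]

sample = select_audit_sample(
    cluster_pos, flagged_creators={"u6"}, top_n_suppliers=2
)
sample_ids = [po["id"] for po in sample]
```

In this toy data, POs 1 to 5 are kept because they involve the two most represented suppliers, and PO 6 is kept because of its creator.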
Multivariate versus univariate analysis
The multivariate descriptive data mining approach, based on behavior-describing attributes, yielded interesting results. In the smaller subset of old POs we encountered POs that were changed over and over again. In the larger subset too, a large number of changes to the PO is a primary characteristic of the selected observations. However, one could wonder whether this outcome could have been obtained more easily, simply by applying univariate clustering instead of multivariate
Conclusion
In this paper, a methodology for reducing internal fraud risk, the IFR² Framework (Jans et al., 2009), is applied at a European financial institution ranked in the top 20. The results of the case study suggest that a descriptive data mining approach combined with the multivariate latent class clustering technique can be of additional value in reducing the risk of internal fraud in a company. Using univariate latent class clustering did not yield the same results. The application of the IFR² Framework
References (23)
- et al. Continuous monitoring of business process controls: a pilot implementation of a continuous auditing systems at Siemens. Int J Account Inf Syst (2006)
- et al. Restoring auditor credibility: tertiary monitoring and logging of continuous assurance systems. Int J Account Inf Syst (2004)
- et al. Audit support systems and decision aids: current practice and opportunities for future research. Int J Account Inf Syst (2007)
- X-raying segregation of duties: support to illuminate an enterprise's immunity to solo-fraud. Int J Account Inf Syst (2008)
- et al. Subscription fraud prevention in telecommunications using fuzzy rules and neural networks. Expert Syst Appl (2006)
- et al. Complementary controls and ERP implementation success. Int J Account Inf Syst (2007)
- et al. Data mining techniques for the detection of fraudulent financial statements. Expert Syst Appl (2007)
- et al. Internal and external influences on IT control governance. Int J Account Inf Syst (2007)
- et al. Understanding the potential impact of information technology on the susceptibility of organizations to fraudulent employee behaviour. Int J Account Inf Syst (2003)
- et al. An evidential reasoning approach to Sarbanes–Oxley mandated internal control risk assessment. Int J Account Inf Syst (2009)