Learned lessons in credit card fraud detection from a practitioner perspective
Introduction
Nowadays, enterprises and public institutions face a growing number of fraud attempts and need automatic systems for fraud detection (Delamaire, Abdou, & Pointon, 2009). Automatic systems are essential because it is not always possible or easy for a human analyst to detect fraudulent patterns in transaction datasets, which are often characterized by a large number of samples, many dimensions and online updates. Also, cardholders cannot be relied upon to report the theft, loss or fraudulent use of a card (Pavía, Veres-Ferrer, & Foix-Escura, 2012). Since fraudulent transactions are far fewer than legitimate ones, the data distribution is unbalanced, i.e. skewed towards non-fraudulent observations. It is well known that many learning algorithms underperform on unbalanced datasets (Japkowicz & Stephen, 2002), and methods such as resampling have been proposed to improve their performance. Unbalancedness is not the only factor that determines the difficulty of a classification/detection task. Another influential factor is the degree of overlap between the classes of interest, which stems from the limited information that transaction records provide about the nature of the process (Holte, Acker, & Porter, 1989).
Detection problems are typically addressed in two different ways. In the static learning setting, a detection model is periodically relearnt from scratch (e.g. once a year or once a month). In the online learning setting, the detection model is updated as soon as new data arrives. Though the online strategy is the most adequate to deal with non-stationarity (e.g. due to the evolution of the spending behavior of the regular cardholder or of the fraudster), little attention has been devoted in the literature to the unbalanced problem in changing environments.
Another problematic issue in credit card fraud detection is the scarcity of available data: confidentiality constraints give the community little chance to share real datasets and assess existing techniques.
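As an illustration of the resampling idea mentioned above, the following is a minimal sketch of random undersampling of the majority (genuine) class. The function name `undersample`, the `ratio` parameter and the toy data are assumptions made for this example; they are not taken from the paper, which compares several rebalancing techniques.

```python
import random

def undersample(samples, labels, ratio=1.0, seed=0):
    """Randomly drop genuine (label 0) samples, keeping every fraud (label 1).

    `ratio` is the number of genuine samples kept per fraud sample
    (an illustrative parameter, not one defined in the paper).
    """
    rng = random.Random(seed)
    fraud_idx = [i for i, y in enumerate(labels) if y == 1]
    genuine_idx = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(genuine_idx), int(round(ratio * len(fraud_idx))))
    kept = fraud_idx + rng.sample(genuine_idx, n_keep)
    kept.sort()  # preserve the original (e.g. temporal) order
    return [samples[i] for i in kept], [labels[i] for i in kept]

# Toy data: 1000 transactions, 10 of them fraudulent (1% fraud rate).
labels = [1 if i % 100 == 0 else 0 for i in range(1000)]
samples = [[float(i)] for i in range(1000)]  # hypothetical single feature

X_bal, y_bal = undersample(samples, labels, ratio=1.0)
n_fraud = sum(y_bal)
n_genuine = len(y_bal) - n_fraud
print(n_fraud, n_genuine)  # 10 10
```

Undersampling discards information from the genuine class, so whether it helps depends on the classifier and on the data; it is shown here only because it is the simplest resampling scheme to write down.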
Contributions
This paper presents an experimental comparison of several state-of-the-art algorithms and modeling techniques on one real dataset, focusing in particular on open questions such as: Which machine learning algorithm should be used? Is it enough to learn a model once a month, or is it necessary to update the model every day? How many transactions are sufficient to train the model? Should the data be analyzed in their original unbalanced form? If not, which is the best way to rebalance them?
State of the art in credit card fraud detection
Credit card fraud detection is one of the most explored domains of fraud detection (Chan et al., 1999; Bolton and Hand, 2001; Brause et al., 1999) and relies on the automatic analysis of recorded transactions to detect fraudulent behavior. Every time a credit card is used, transaction data, composed of a number of attributes (e.g. credit card identifier, transaction date, recipient, amount of the transaction), are stored in the databases of the service provider.
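A transaction record of this kind can be sketched as a small data structure. The field names below (`card_id`, `timestamp`, `recipient`, `amount`) mirror the attributes listed above but are otherwise hypothetical, as are the helper `group_by_card` and the sample values; real provider schemas will differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Transaction:
    card_id: str         # credit card identifier
    timestamp: datetime  # transaction date
    recipient: str       # merchant receiving the payment
    amount: float        # amount of the transaction

def group_by_card(transactions):
    """Return {card_id: [that card's transactions, oldest first]}."""
    by_card = defaultdict(list)
    for t in transactions:
        by_card[t.card_id].append(t)
    for history in by_card.values():
        history.sort(key=lambda t: t.timestamp)
    return dict(by_card)

# Made-up records, deliberately out of chronological order.
log = [
    Transaction("card-A", datetime(2013, 5, 2, 18, 30), "grocery", 42.10),
    Transaction("card-B", datetime(2013, 5, 1, 9, 0), "fuel", 60.00),
    Transaction("card-A", datetime(2013, 5, 1, 12, 15), "bookshop", 15.99),
]

histories = group_by_card(log)
for card_id, history in histories.items():
    print(card_id, [(t.recipient, t.amount) for t in history])
```

Grouping by card and sorting by time gives each card a chronological history, which is the natural starting point for any analysis that looks beyond a single transaction.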
However a single transaction
Formalization of the learning problem
In this section, we formalize the credit card fraud detection task as a statistical learning problem. Let $x_{ij}$ be the transaction number $j$ of a card number $i$. We assume that the transactions are ordered in time such that if $x_{iv}$ occurs before $x_{iw}$ then $v < w$. For each transaction some basic information is available, such as the amount of the expenditure, the shop where it was performed, the currency, etc. However, these variables do not provide any information about the normal card usage. The normal
Performance measure
Fraud detection must deal with the following challenges: (i) timeliness of decision (a card should be blocked as soon as it is found to be a victim of fraud, since a quick reaction to the first fraud can prevent further ones), (ii) unbalanced class sizes (the number of frauds is relatively small compared to genuine transactions) and (iii) cost structure of the problem (the cost of a fraud is not easy to define). The cost of a fraud is often assumed to be equal to the transaction amount (Elkan, 2001
Strategies for incremental learning with unbalanced fraud data
The most conventional way to deal with sequential fraud data is to adopt a static approach (Fig. 1), which builds a classification model once in a while and uses it as a predictor over a long horizon. Though this approach reduces the learning effort, its main problem is its lack of adaptivity, which makes it insensitive to any change of distribution in the upcoming chunks.
On the basis of the state of the art described in Section 3.3, it is possible to conceive two alternative
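The lack of adaptivity of a static model can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "model" is a single amount threshold, the chunks are synthetic, and the updating strategy (refit on the most recent chunk) is a deliberately simple stand-in, not one of the approaches studied in the paper. Only fraud recall is reported; false alarms are ignored.

```python
def fit_threshold(chunk):
    """Toy 'model': midpoint between mean genuine and mean fraud amount."""
    fraud = [a for a, y in chunk if y == 1]
    genuine = [a for a, y in chunk if y == 0]
    return (sum(fraud) / len(fraud) + sum(genuine) / len(genuine)) / 2

def recall(threshold, chunk):
    """Fraction of frauds in the chunk flagged by `amount > threshold`."""
    fraud = [a for a, y in chunk if y == 1]
    return sum(a > threshold for a in fraud) / len(fraud)

def make_chunk(fraud_amount):
    """Synthetic chunk: 10 genuine amounts (10..55) and 2 fraud amounts."""
    genuine = [(float(a), 0) for a in range(10, 60, 5)]
    fraud = [(fraud_amount, 1), (fraud_amount + 10.0, 1)]
    return genuine + fraud

# Fraud amounts drift downward over time (a simulated change of distribution).
chunks = [make_chunk(f) for f in (200.0, 150.0, 100.0, 70.0)]

# Static: fit once on the first chunk, reuse on every later chunk.
static_threshold = fit_threshold(chunks[0])
static_recall = [recall(static_threshold, c) for c in chunks[1:]]

# Updated: refit on the previous chunk before scoring the next one.
updated_recall = [
    recall(fit_threshold(prev), nxt) for prev, nxt in zip(chunks, chunks[1:])
]

print("static :", static_recall)   # [1.0, 0.0, 0.0]
print("updated:", updated_recall)  # [1.0, 1.0, 1.0]
```

In this contrived setting the static threshold misses every fraud once the amounts drift below it, while the refitted threshold follows the drift. Real data is far noisier, so the gap will not be this clean.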
Experimental assessment
In this section we perform an extensive experimental assessment on the basis of real data (Section 7.1) in order to address common issues that the practitioner has to solve when facing large credit card fraud datasets (Section 7.2).
Future work
Future work will focus on the automatic selection of the best technique for unbalanced data in the case of online learning. Dal Pozzolo, Caelen, Waterschoot, and Bontempi (2013) recently proposed to use an F-race algorithm (Birattari, Stützle, Paquete, & Varrentrapp, 2002) to automatically select the correct unbalanced strategy for a given dataset. In their work, cross-validation is used to feed the data into the race. A natural extension of this work could be the use of racing in incremental data where
Conclusion
The need to detect fraudulent patterns in huge amounts of data demands the adoption of automatic methods. The scarcity of publicly available datasets of credit card transactions gives the community little chance to test and assess the impact of existing techniques on real data. The goal of our work is to give some guidelines to practitioners on how to tackle the detection problem.
The paper presents the fraud detection problem and proposes AP, AUC and PrecisionRank as correct performance measures
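For readers unfamiliar with the first two measures, the sketch below computes them from ranked scores, assuming the standard definitions: AP as the mean of precision at each rank where a fraud is retrieved, and AUC as the probability that a randomly chosen fraud is scored above a randomly chosen genuine transaction. The paper's exact definitions may differ, PrecisionRank is not covered because its definition does not appear in this excerpt, and the scores and labels are made up.

```python
def average_precision(scores, labels):
    """Mean of precision@k over the ranks k at which a fraud (label 1) appears."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    hits, precisions = 0, []
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions)

def auc(scores, labels):
    """Share of (fraud, genuine) pairs ranked correctly; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical fraud scores for 8 transactions, 2 of which are frauds.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0, 0, 0, 0]

print(round(average_precision(scores, labels), 4))  # 0.8333
print(round(auc(scores, labels), 4))                # 0.9167
```

Both measures depend only on the ranking of the transactions, not on a decision threshold, which suits a setting where investigators examine the highest-scored alerts first.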
References (55)
The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. Journal of Mathematical Psychology (1975)
Fast effective rule induction
Nonlinear neural networks: Principles, mechanisms, and architectures. Neural Networks (1988)
et al. Credit card incidents and control systems. International Journal of Information Management (2012)
et al. Real-time credit card fraud detection using computational intelligence. Expert Systems with Applications (2008)
et al. Applying one-sided selection to unbalanced datasets. MICAI 2000: Advances in Artificial Intelligence (2000)
Birattari, M., Stützle, T., Paquete, L., & Varrentrapp, K. (2002). A racing algorithm for configuring metaheuristics....
et al. Unsupervised profiling methods for fraud detection. Credit Scoring and Credit Control (2001)
et al. Statistical fraud detection: A review. Statistical Science (2002)
et al. Neural data mining for credit card fraud detection