Abstract
Fraud detection is becoming an integral part of business intelligence, because detecting fraud in a company's work processes is of great value. Fraud hinders the accurate appraisal of an enterprise and is a direct economic loss to the business. Previous fraud detection studies have achieved limited performance gains because they learn fraud patterns only from the data as a whole. This paper proposes a novel method that uses hierarchical clusters based on deep neural networks to detect finer-grained frauds, as well as frauds visible in the whole data, in the work processes of job placement. The proposed method, Hierarchical Clusters-based Deep Neural Networks (HC-DNN), uses the anomaly characteristics of hierarchical clusters, pre-trained through an autoencoder, as the initial weights of deep neural networks to detect various frauds. HC-DNN has the advantages of improving performance and of explaining the relationships among fraud types. In a cross-validation evaluation of fraud detection, the proposed method outperforms conventional methods. From the viewpoint of explainable deep learning, the hierarchical cluster structure constructed through HC-DNN can represent the relationships among fraud types.
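The idea of reusing autoencoder weights learned on hierarchical clusters as the initial weights of a classifier can be illustrated with a minimal sketch. This is only one plausible reading of the abstract, not the authors' implementation: the toy data, the two-level hierarchy (split on the sign of the first feature, a stand-in for a real hierarchical clustering), the helper `pretrain_autoencoder`, and all layer sizes and learning rates are assumptions made for illustration.

```python
import numpy as np

def pretrain_autoencoder(X, hidden, epochs=300, lr=0.05, seed=0):
    """Hypothetical helper: train a one-hidden-layer autoencoder with
    plain gradient descent on mean squared reconstruction error.
    Returns encoder weights, encoder bias, and the loss per epoch."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)      # encoder activations
        R = H @ W2 + b2               # linear reconstruction
        err = R - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared error.
        gR = 2 * err / (n * d)
        gW2 = H.T @ gR
        gb2 = gR.sum(axis=0)
        gZ = (gR @ W2.T) * (1 - H ** 2)
        gW1 = X.T @ gZ
        gb1 = gZ.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return W1, b1, losses

# Toy data (assumption): 200 records, 6 features with low-rank structure.
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X_all = latent @ mixing + 0.1 * rng.normal(size=(200, 6))
X_all = (X_all - X_all.mean(axis=0)) / X_all.std(axis=0)

# Assumed two-level hierarchy: the root holds all data, and the two
# children are a crude split standing in for real cluster assignments.
left = X_all[X_all[:, 0] < 0]
right = X_all[X_all[:, 0] >= 0]
nodes = [X_all, left, right]

# Pre-train one autoencoder per node of the hierarchy.
encoders = [pretrain_autoencoder(Xc, hidden=3, seed=i)
            for i, Xc in enumerate(nodes)]

# Use the concatenated encoders as the classifier's initial first layer.
W0 = np.hstack([W for W, _, _ in encoders])        # shape (6, 9)
b0 = np.concatenate([b for _, b, _ in encoders])   # shape (9,)

# Forward pass of the (not yet fine-tuned) classifier.
w_out = rng.normal(0, 0.1, size=W0.shape[1])
hidden_act = np.tanh(X_all @ W0 + b0)
scores = 1.0 / (1.0 + np.exp(-(hidden_act @ w_out)))
```

In this sketch each block of three hidden units starts from features learned on one node of the hierarchy, so after supervised fine-tuning the first layer can be inspected block by block. Whether HC-DNN combines cluster-level weights in this way is not stated in the abstract.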
Acknowledgements
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2018R1D1A1A02086148), and by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2018-08-01417) supervised by the IITP (Institute for Information & communications Technology Promotion).
Cite this article
Kim, J., Kim, HJ. & Kim, H. Fraud detection for job placement using hierarchical clusters-based deep neural networks. Appl Intell 49, 2842–2861 (2019). https://doi.org/10.1007/s10489-019-01419-2
DOI: https://doi.org/10.1007/s10489-019-01419-2