Fraud detection for job placement using hierarchical clusters-based deep neural networks

Applied Intelligence

Abstract

Fraud detection is becoming an integral part of business intelligence, because detecting fraud in a company's work processes is of great value. Fraud hinders the accurate appraisal of an enterprise and causes direct economic loss to the business. Previous fraud detection studies have achieved limited performance gains because they learn a single fraud pattern from the data as a whole. This paper proposes a novel method that uses hierarchical clusters with deep neural networks to detect finer-grained frauds, as well as frauds that appear across the whole data, in the work processes of job placement. The proposed method, Hierarchical Clusters-based Deep Neural Networks (HC-DNN), uses the anomaly characteristics of hierarchical clusters, pre-trained through an autoencoder, as the initial weights of a deep neural network in order to detect various frauds. HC-DNN has the advantage of both improving performance and explaining the relationships among fraud types. In a cross-validation evaluation of fraud detection performance, the proposed method outperforms conventional methods. Moreover, from the viewpoint of explainable deep learning, the hierarchical cluster structure constructed through HC-DNN can represent the relationships among fraud types.
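
The idea of pre-training an autoencoder on each cluster and reusing its weights to initialise a deep neural network can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the cluster-level weights are combined, so the column-wise concatenation below is an assumption, as are the function names (`pretrain_autoencoder`, `reconstruction_error`, `init_first_layer_from_clusters`), the single hidden layer, and the hyperparameters. Cluster assignments are assumed to come from cutting a hierarchical clustering at some level.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_autoencoder(X, n_hidden, epochs=300, lr=0.1, seed=0):
    """Train a one-hidden-layer autoencoder (sigmoid encoder, linear decoder)
    with full-batch gradient descent on the mean squared reconstruction error.
    Returns (W1, b1, W2, b2). Hyperparameters are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)          # encoder activations
        E = H @ W2 + b2 - X               # reconstruction residual
        dH = (E @ W2.T) * H * (1.0 - H)   # backpropagate through the sigmoid
        W2 -= lr * (H.T @ E) / n
        b2 -= lr * E.mean(axis=0)
        W1 -= lr * (X.T @ dH) / n
        b1 -= lr * dH.mean(axis=0)
    return W1, b1, W2, b2

def reconstruction_error(X, params):
    """Per-row squared reconstruction error, usable as an anomaly score."""
    W1, b1, W2, b2 = params
    H = sigmoid(X @ W1 + b1)
    return ((H @ W2 + b2 - X) ** 2).sum(axis=1)

def init_first_layer_from_clusters(X, cluster_ids, n_hidden):
    """Pre-train one autoencoder per cluster and stack the encoder weights
    side by side as the initial first-layer weights of a classifier.
    The stacking scheme is an assumption, not taken from the paper."""
    Ws, bs = [], []
    for c in np.unique(cluster_ids):
        W1, b1, _, _ = pretrain_autoencoder(X[cluster_ids == c], n_hidden)
        Ws.append(W1)
        bs.append(b1)
    return np.hstack(Ws), np.concatenate(bs)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((40, 4))                     # toy data, not job placement records
    cluster_ids = np.array([0] * 20 + [1] * 20) # hypothetical cluster labels
    W, b = init_first_layer_from_clusters(X, cluster_ids, n_hidden=3)
    print(W.shape, b.shape)
```

In this sketch each cluster contributes its own block of hidden units, so the supervised network starts from features tuned to each cluster's reconstruction behaviour rather than from random weights. The supervised fine-tuning step is omitted.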

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1A02086148), and was also supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2018-08-01417) supervised by the IITP (Institute for Information & communications Technology Promotion).

Author information

Corresponding author

Correspondence to Han-Joon Kim.

Cite this article

Kim, J., Kim, HJ. & Kim, H. Fraud detection for job placement using hierarchical clusters-based deep neural networks. Appl Intell 49, 2842–2861 (2019). https://doi.org/10.1007/s10489-019-01419-2

  • DOI: https://doi.org/10.1007/s10489-019-01419-2
