Applying Machine Learning to Anomaly Detection in Car Insurance Sales

  • Conference paper
Big Data Analytics (BDA 2020)

Abstract

Financial revenue in the insurance sector is rising systematically. This growth is primarily related to an increasing number of policies sold. While a substantial body of work focuses on discovering insurance fraud, e.g. fraud related to car accidents, an open question remains: is it possible to capture incorrect data in sales systems? Such erroneous data can result in financial losses. It may be caused by mistakes made by sales staff, but it may also be the result of fraud. This work focuses on detecting anomalies in car insurance contracts. It is based on a dataset obtained from an actual insurance company based in Poland. The dataset is thoroughly analysed, including preprocessing and feature selection. Next, a number of anomaly detection algorithms are applied to it, and their performance is compared. Specifically, clustering algorithms, dynamic classifier selection, and gradient boosted decision trees are evaluated. Furthermore, a scenario in which the size of the dataset increases is considered. The results show that machine learning, broadly understood, has realistic potential to facilitate anomaly detection during insurance policy sales.
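
One way to picture the clustering-based family of detectors named in the abstract is to score each contract by its distance to the nearest cluster centre and flag the most distant ones. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the data are synthetic, the two features are imagined as already standardised, the k-means routine is a plain standard-library version, and all function names (`kmeans`, `anomaly_scores`, `top_anomalies`, `make_toy_contracts`) are hypothetical. The abstract does not state which clustering algorithms, features, or thresholds were actually used.

```python
import math
import random


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means on a list of equal-length tuples.

    Assumption: centroids start from evenly spaced rows so the sketch is
    reproducible; real use would prefer an initialisation such as k-means++.
    """
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def anomaly_scores(points, centroids):
    """Score each point by its distance to the nearest centroid."""
    return [min(math.dist(p, c) for c in centroids) for p in points]


def top_anomalies(scores, n):
    """Return the indices of the n highest-scoring points."""
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:n]


def make_toy_contracts(seed=0):
    """Synthetic stand-in for contract features (hypothetical data).

    Two dense groups of 'typical' contracts plus three injected outliers
    appended at the end of the list.
    """
    rng = random.Random(seed)
    normal = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(100)]
    normal += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(100)]
    outliers = [(12.0, -3.0), (-6.0, 8.0), (2.5, 12.0)]
    return normal + outliers, len(normal)


points, n_normal = make_toy_contracts()
centroids = kmeans(points, k=2)
scores = anomaly_scores(points, centroids)
flagged = top_anomalies(scores, 3)
print("flagged row indices:", sorted(flagged))
```

In this toy setting the three injected rows sit far from both cluster centres, so they receive the largest scores. On real contract data the choice of features, scaling, number of clusters, and cut-off would all need to be validated against labelled examples.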

Author information

Corresponding author

Correspondence to Marcin Paprzycki.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Piesio, M., Ganzha, M., Paprzycki, M. (2020). Applying Machine Learning to Anomaly Detection in Car Insurance Sales. In: Bellatreche, L., Goyal, V., Fujita, H., Mondal, A., Reddy, P.K. (eds) Big Data Analytics. BDA 2020. Lecture Notes in Computer Science, vol 12581. Springer, Cham. https://doi.org/10.1007/978-3-030-66665-1_17

  • DOI: https://doi.org/10.1007/978-3-030-66665-1_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-66664-4

  • Online ISBN: 978-3-030-66665-1

  • eBook Packages: Computer Science, Computer Science (R0)
