Applying Machine Learning to Anomaly Detection in Car Insurance Sales

Piesio, Michał; Ganzha, Maria; Paprzycki, Marcin

doi:10.1007/978-3-030-66665-1_17

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12581)

Included in the following conference series:

International Conference on Big Data Analytics

Abstract

Financial revenue in the insurance sector is rising systematically. This growth is primarily related to an increasing number of policies sold. While a substantial body of work focuses on discovering insurance fraud, e.g. related to car accidents, an open question remains: is it possible to capture incorrect data in the sales systems? Such erroneous data can result in financial losses. It may be caused by mistakes made by salespeople, but it may also be the result of fraud. This work focuses on detecting anomalies in car insurance contracts. It is based on a dataset obtained from an actual insurance company based in Poland. The dataset is thoroughly analysed, including preprocessing and feature selection. Next, a number of anomaly detection algorithms are applied to it, and their performance is compared. Specifically, clustering algorithms, dynamic classifier selection, and gradient boosted decision trees are experimented with. Furthermore, a scenario in which the size of the dataset increases is considered. It is shown that the use of machine learning, broadly understood, has realistic potential to facilitate anomaly detection during insurance policy sales.
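
To make the clustering-based approach named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each policy has already been reduced to a small numeric feature vector (the values below are made up for illustration), fits a plain k-means with a deterministic "first k points" initialisation, and scores each policy by its distance to the nearest centroid. The names `kmeans`, `anomaly_scores`, and `policies` are hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centroids (an assumption)."""
    centroids = list(points[:k])
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids

def anomaly_scores(points, centroids):
    """Score each point by its distance to the closest centroid."""
    return [min(math.dist(p, c) for c in centroids) for p in points]

# Hypothetical, already-scaled feature vectors for nine policies:
# two tight groups of typical contracts and one contract far from both.
policies = [
    (0.0, 0.0), (5.0, 5.0),              # one seed point from each group
    (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (10.0, 0.0),                         # suspicious contract
]

centroids = kmeans(policies, k=2)
scores = anomaly_scores(policies, centroids)
most_anomalous = scores.index(max(scores))
print("most anomalous policy index:", most_anomalous)
```

In this sketch the distant contract receives the largest score even though it is absorbed into one of the two clusters. In practice the number of clusters, the feature scaling, and the score threshold above which a contract is flagged would all have to be tuned on real data.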

Author information

Authors and Affiliations

Warsaw University of Technology, Warsaw, Poland
Michał Piesio & Maria Ganzha
Systems Research Institute Polish Academy of Sciences, Warsaw, Poland
Marcin Paprzycki

Corresponding author

Correspondence to Marcin Paprzycki.

Editor information

Editors and Affiliations

ISAE-ENSMA, Chasseneuil, France
Ladjel Bellatreche
Indraprastha Institute of Information Technology, New Delhi, India
Vikram Goyal
Iwate Prefectural University, Takizawa, Japan
Hamido Fujita
Ashoka University, Sonepat, India
Anirban Mondal
IIIT Hyderabad, Hyderabad, India
P. Krishna Reddy

Cite this paper

Piesio, M., Ganzha, M., Paprzycki, M. (2020). Applying Machine Learning to Anomaly Detection in Car Insurance Sales. In: Bellatreche, L., Goyal, V., Fujita, H., Mondal, A., Reddy, P.K. (eds) Big Data Analytics. BDA 2020. Lecture Notes in Computer Science, vol 12581. Springer, Cham. https://doi.org/10.1007/978-3-030-66665-1_17

DOI: https://doi.org/10.1007/978-3-030-66665-1_17
Published: 03 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66664-4
Online ISBN: 978-3-030-66665-1
eBook Packages: Computer Science, Computer Science (R0)
