
Predictive framework for crime data analysis using a hybrid logistic regression — support vector machine based ensemble classifier powered by CART (LR-SVMCART)

Multimedia Tools and Applications

Abstract

A significant rise in illegal activity has directly affected socioeconomic growth and quality of life. This article proposes a predictive framework for crime data analysis that addresses the problems of scalability and accuracy. It introduces a hybrid ensemble machine learning classifier to identify authentic crime activities. A series of experiments on three datasets from different countries is used to verify the efficiency of the proposed algorithms. All three datasets were tested successfully on the proposed framework and the novel ensemble classifier. The proposed hybrid ensemble classifier generally outperforms most existing machine learning approaches. The work aims to identify the geospatial intensity of crime data, so that the recurrence of a certain crime in a city can be anticipated using geospatial technology, allowing the police force to take the required precautions to avoid it.



Author information

Corresponding author

Correspondence to Anupam Ghosh.


About this article

Cite this article

Mukherjee, A., Ghosh, A. Predictive framework for crime data analysis using a hybrid logistic regression — support vector machine based ensemble classifier powered by CART (LR-SVMCART). Multimed Tools Appl 82, 35357–35377 (2023). https://doi.org/10.1007/s11042-023-14760-z

  • DOI: https://doi.org/10.1007/s11042-023-14760-z
