Abstract
A significant rise in illegal activity has directly affected socioeconomic growth and quality of life. This article proposes a predictive framework for crime data analysis that addresses problems of scalability and prediction accuracy. It also proposes a hybrid ensemble machine learning classifier to identify authentic crime activities. A series of experiments on three datasets from different countries verifies the efficiency of the proposed algorithms. All three datasets were tested successfully on the proposed framework and the novel ensemble classifier. In most cases, the proposed hybrid ensemble classifier outperforms most existing machine learning approaches. The work aims to identify the geospatial intensity of crime so that the recurrence of a given crime in a city can be anticipated using geospatial technology, allowing the police force to take the precautions required to prevent it.
About this article
Cite this article
Mukherjee, A., Ghosh, A. Predictive framework for crime data analysis using a hybrid logistic regression — support vector machine based ensemble classifier powered by CART (LR-SVMCART). Multimed Tools Appl 82, 35357–35377 (2023). https://doi.org/10.1007/s11042-023-14760-z
DOI: https://doi.org/10.1007/s11042-023-14760-z