Abstract
In this paper, we address the problem of discovering criminal behaviors and patterns and propose a Parallel Crime Pattern Discovery (PCPD) system that uses machine learning and high-performance computing techniques. We formulate the problem of criminal behaviors and propose a Criminal Activity Clustering (CAC) algorithm based on fuzzy clustering to detect potential criminal patterns in large-scale spatiotemporal datasets. Based on the detected criminal patterns, we further propose a Crime Rate Evaluation (CRE) algorithm to identify the crime rate for each group of locations and target types. In addition, we propose a Criminal Hotspot Locating (CHL) algorithm to predict and highlight hotspot areas in support of crime prevention at target places. Moreover, to improve the performance of the proposed PCPD system, which mainly comprises the CAC, CRE, and CHL algorithms, we implement a parallel solution for these algorithms using high-performance computing power. Experimental results show that the proposed algorithms can effectively and accurately detect criminal patterns in large-scale spatiotemporal data.
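The abstract does not spell out the internals of the CAC algorithm, so the following is only a minimal sketch of the general technique it names: standard fuzzy c-means clustering, in which every record receives a degree of membership in each cluster rather than a single hard label. The function name `fuzzy_c_means`, its parameters, and the sample coordinates are illustrative assumptions and are not taken from the paper; the sketch is serial and does not reflect the parallel implementation.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, memberships); memberships[i][k] is the degree to
    which point i belongs to cluster k, and each row sums to 1.
    Illustrative only: this is not the paper's CAC algorithm.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(len(points))]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))

        # Memberships are recomputed from distances to the new centers.
        shift = 0.0
        for i, p in enumerate(points):
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            new_row = []
            for k in range(n_clusters):
                denom = sum((dists[k] / dj) ** (2.0 / (m - 1.0)) for dj in dists)
                new_row.append(1.0 / denom)
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centers, u


# Hypothetical incident coordinates forming two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, memberships = fuzzy_c_means(points, n_clusters=2)
for point, row in zip(points, memberships):
    print(point, [round(v, 3) for v in row])
```

The fuzzifier `m` controls how soft the assignment is: values close to 1 approach hard clustering, while larger values spread membership more evenly across clusters. A real spatiotemporal dataset would also need a time dimension and feature scaling, which this sketch omits.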
Acknowledgements
This work is partially funded by the National Key R&D Program of China (Grant No. 2018YFB1003401), the National Outstanding Youth Science Program of National Natural Science Foundation of China (Grant No. 61625202), the International (Regional) Cooperation and Exchange Program of National Natural Science Foundation of China (Grant Nos. 61661146006, 61860206011), the China Scholarship Council (Grant No. 201706310080), and the International Postdoctoral Exchange Fellowship Program (Grant No. 20180024).
Cite this article
Win, K.N., Chen, J., Chen, Y. et al. PCPD: A Parallel Crime Pattern Discovery System for Large-Scale Spatiotemporal Data Based on Fuzzy Clustering. Int. J. Fuzzy Syst. 21, 1961–1974 (2019). https://doi.org/10.1007/s40815-019-00673-3