Abstract
The recidivism rate of offenders after their release from prison is high, which harms society, so reducing it is socially significant. This article uses public data from the state of Iowa in the United States. Because the data exhibit redundancy and mixed (numerical and categorical) attributes, we propose the following methods. First, we use a rough set attribute reduction algorithm based on probability distributions to remove the redundant items. Second, the sample data are clustered with an improved clustering algorithm. Starting from the traditional K-prototype clustering algorithm, we change the dissimilarity measure for the categorical attributes, change the method for selecting the initial cluster centers, and weight the attributes based on information entropy. The clustering experiments show that the improved algorithm achieves a better clustering effect and higher clustering accuracy than the traditional K-prototype clustering algorithm. Finally, a back propagation neural network is used to predict the recidivism probability of the samples processed by the above algorithms. The final experimental results show that rough set reduction successfully removes two redundant attributes, which greatly reduces the run time of the model. Compared with the traditional K-prototype clustering algorithm, the improved K-prototype clustering algorithm proposed in this paper performs better on the various indicators and on the objective function. With the neural network prediction step, the prediction accuracy of the model reaches 87.9%. In addition, a large number of experiments on benchmark datasets verify the effectiveness of the proposed model.
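The clustering step described above combines a distance over numerical attributes with a mismatch measure over categorical attributes, weighted by information entropy. The abstract does not give the article's formulas, so the following is only a minimal sketch under stated assumptions: the numerical part is the squared Euclidean distance and the categorical part is simple matching, as in the classic K-prototypes formulation, and the entropy weighting (each attribute's share of the total Shannon entropy) is a hypothetical choice, not necessarily the authors' scheme. All function names and the `gamma` parameter are illustrative.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_weights(columns):
    """Hypothetical weighting: each attribute's share of the total entropy.

    Falls back to uniform weights when every attribute is constant.
    """
    entropies = [shannon_entropy(col) for col in columns]
    total = sum(entropies)
    if total == 0:
        return [1.0 / len(columns)] * len(columns)
    return [e / total for e in entropies]


def mixed_dissimilarity(x_num, x_cat, c_num, c_cat, cat_weights, gamma=1.0):
    """K-prototypes-style distance between a sample and a cluster prototype.

    Numerical part: squared Euclidean distance.
    Categorical part: weighted simple matching (a weight is added per mismatch).
    gamma balances the two parts (assumed parameter).
    """
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, c_num))
    categorical = sum(
        w for a, b, w in zip(x_cat, c_cat, cat_weights) if a != b
    )
    return numeric + gamma * categorical


if __name__ == "__main__":
    # Toy categorical columns (made-up values, not the Iowa data).
    columns = [["a", "b", "a", "b"], ["x", "x", "x", "y"]]
    weights = entropy_weights(columns)
    d = mixed_dissimilarity([1.0, 2.0], ["a", "x"], [0.0, 2.0], ["b", "x"], weights)
    print(weights, d)
```

In this sketch an attribute whose values are more evenly spread receives a larger weight; the opposite convention is equally defensible, which is why the weighting is marked as an assumption.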


Data availability
These datasets were derived from the following public domain resources: https://data.iowa.gov/Correctional-System/3-Year-Recidivism-for-Offenders-Released-from-Pris/mw8r-vqy4; https://data.iowa.gov/Correctional-System/3-Year-Recidivism-for-Offenders-Admitted-to-Probat/e9zy-uibf; http://archive.ics.uci.edu/ml/datasets/Adult; http://archive.ics.uci.edu/ml/datasets/Nursery; http://archive.ics.uci.edu/ml/datasets/Car+Evaluation; http://archive.ics.uci.edu/ml/datasets/Tic-Tac-Toe+Endgame; http://archive.ics.uci.edu/ml/datasets/Balance+Scale.
Acknowledgements
This work is supported by the National Key R&D Program of China (Grant no. 2018YFC0831100), the Natural Science Foundation of Guangdong Province of China (no. 2020A1515010784), the National Natural Science Foundation of China (Grant no. 61773296), the Foreign Science and Technology Cooperation Program of Guangzhou (Grant no. 201907010021), the Key R&D Program of Guangdong Province (no. 2019B020219003), the Foreign Science and Technology Cooperation Program of Huangpu District of Guangzhou (no. 2018GH0), and the Major Science and Technology Project in Dongguan (no. 2018215121005).
Cite this article
Li, K., Wang, Z., Yao, X. et al. Recidivism early warning model based on rough sets and the improved K-prototype clustering algorithm and a back propagation neural network. J Ambient Intell Human Comput 14, 839–851 (2023). https://doi.org/10.1007/s12652-021-03337-z
DOI: https://doi.org/10.1007/s12652-021-03337-z