Knowledge mining and social dangerousness assessment in criminal justice: metaheuristic integration of machine learning and graph-based inference

Lettieri, Nicola; Guarino, Alfonso; Malandrino, Delfina; Zaccagnino, Rocco

doi:10.1007/s10506-022-09334-7

Original Research
Published: 20 October 2022

Volume 31, pages 653–702, (2023)

Artificial Intelligence and Law

Nicola Lettieri ORCID: orcid.org/0000-0001-6342-3252¹,
Alfonso Guarino²,
Delfina Malandrino² &
Rocco Zaccagnino²

Abstract

One of the main challenges for computational legal research is devising innovative heuristics to derive actionable knowledge from legal documents. While much research has so far been devoted to the extraction of purely legal information, less attention has been paid to seeking out in the texts the clues of more complex entities: legally relevant facts whose detection requires linking and interpreting, as a unified whole, legal information and the results of empirical analyses. This paper presents ongoing research that points in this direction, trying to devise new ways to support public prosecutors in assessing the dangerousness of individuals and groups under investigation, an activity that relies precisely on the cross-sectional evaluation of legal and empirical data. We outline a knowledge mining strategy that combines, in a single metaheuristic model, information extraction, network-based inference, machine learning and visual analytics. We focus, in particular, on the integration of graph-based inference and machine learning methods, used both to support classification tasks and to explore new forms of man-machine cooperation. Experiments involving public prosecutors from the Italian Anti-Mafia Investigation Directorate and using data from real investigations have not only shown the potential of our approach but also offered an opportunity to reflect on the role we could assign to AI when thinking about the future of legal science and practice.

Notes

A question-answering system “searches a large text collection and finds a short phrase or sentence that precisely answers a user’s question” (Prager et al. 2000). “Information extraction is the problem of summarizing the essential details particular to a given document” (Freitag 2000). Argument mining involves “automatically identifying argumentative structures within document texts, for instance, premises and conclusion, and relationships between pairs of arguments” (Mochales and Moens 2011).
The platform is available online at: https://bit.ly/3xPqZp5.
The expression refers to immediately executive measures of coercion resulting in limitations of personal freedom or the availability of goods. Taken against the suspect or the accused, such measures aim: i) to prevent inappropriate behaviours during the course of the criminal proceeding (e.g. attempts to conceal evidence or to commit other crimes); ii) to ensure the enforcement of the judgement.
A network is a graph with N nodes (or vertices) and L links (or edges) that can be weighted or unweighted, directed or undirected. An unweighted network is completely represented by its $N \times N$ adjacency matrix A such that $A_{ij} = 1$ if node i points to node j, and $A_{ij} = 0$ otherwise. Let $G = (V, E)$ be a graph, where V is the set of its vertices such that $|V| = N$ and E is the set of its edges such that $|E| = L$. Edges may simply denote a connection between two nodes or be labeled with a number indicating the weight assigned to them. In the latter case, the graph is called weighted. As we will see in more detail later on, there are many important properties through which a network can be described (Freeman 1978; Kolaczyk and Csárdi 2014), providing interesting insight into the phenomenon the network represents.
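The adjacency-matrix representation defined in this note can be illustrated with a minimal Python sketch. The function name `adjacency_matrix` and the toy node labels are hypothetical and chosen only for illustration; nothing here is taken from the CrimeMiner code base.

```python
def adjacency_matrix(nodes, edges, directed=True):
    """Build the N x N matrix A with A[i][j] = 1 if node i points to node j."""
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    A = [[0] * n for _ in range(n)]
    for src, dst in edges:
        A[index[src]][index[dst]] = 1
        if not directed:
            # An undirected link is stored in both directions.
            A[index[dst]][index[src]] = 1
    return A


# Toy directed network (hypothetical data): a -> b -> c
nodes = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]
A = adjacency_matrix(nodes, edges)
N = len(A)             # number of nodes, |V|
L = sum(map(sum, A))   # number of directed links, |E|
```

For a weighted graph, the same structure could store the edge weight instead of 1.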
CrimeMiner has been developed with a Java Spring backend and JavaScript libraries for visualization (e.g., D3.js). The platform handles data about social relations that are represented as a graph $G = (V,E)$, where $V$ is the set of individuals included in the case files and $E$ is the set of relations among them, such as telephone or environmental tappings. The architecture of the tool is described in detail in Appendix B. The tool is available at https://bit.ly/3xPqZp5.
See COM(2021) 206 final, Proposal for a regulation of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) and amending certain Union legislative acts: “Actions by law enforcement authorities involving certain uses of AI systems are characterised by a significant degree of power imbalance and may lead to surveillance, arrest or deprivation of a natural person’s liberty, as well as other adverse impacts on fundamental rights guaranteed in the Charter”.
2012/C 326/02.
It should be emphasized that, given the experimental nature of the project, we have taken into consideration only a part of the indexes currently provided for by the Criminal Code to assess criminal dangerousness. Certain categories of offenses, such as, to give just one example, the conspiracy provided for by art. 110 of the Italian Criminal Code, were not taken into consideration. Likewise, we set aside the psychological indexes of social dangerousness that are also considered by Italian criminal law (art. 203), for two reasons: (i) as highlighted in Sect. 2, they can only be assessed with the contribution of specific categories of domain experts like psychiatrists or psychologists; (ii) the social dangerousness that public prosecutors (PPs) deal with in fighting organized crime is usually unrelated to mental illness.
A parser developed in Perl extracts entities (e.g., names, surnames, telephone numbers, charges, records) from requests for provisional orders.
Individuals are represented as nodes in a graph, and the social activities (e.g. telephone calls) are represented as edges.
The Network Analysis component applies NA metrics (PageRank, centrality measures, community detection algorithms) to infer relevant properties of the criminal network and of the individuals therein.
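As a rough illustration of the kind of metric mentioned in this note, the following Python sketch computes in-degree and PageRank by power iteration on a toy directed call graph. It is a minimal sketch under stated assumptions: the function names, the damping factor of 0.85, the fixed number of iterations, and the node labels are all illustrative choices, not details of the Network Analysis component itself.

```python
def in_degree(adj):
    """Count incoming links for each node of a directed graph."""
    counts = {v: 0 for v in adj}
    for v in adj:
        for target in adj[v]:
            counts[target] += 1
    return counts


def pagerank(adj, damping=0.85, iterations=100):
    """PageRank by power iteration; adj maps each node to its out-neighbours."""
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            # A node without outgoing links spreads its rank over all nodes.
            targets = adj[v] or nodes
            share = damping * rank[v] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical call graph: an edge u -> v means "u called v".
calls = {"n1": ["n2", "n3"], "n2": ["n1"], "n3": ["n1"], "n4": ["n1"]}
degrees = in_degree(calls)
ranks = pagerank(calls)
most_central = max(ranks, key=ranks.get)
```

In this toy graph the node that receives calls from every other node ends up with the highest rank, which is the intuition behind using such metrics to single out prominent individuals.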
An anonymized excerpt of the original document is available at https://bit.ly/3NZBg7y.
The distinction between episodic and prolonged crimes becomes “computationally” relevant in our system only and exclusively to the extent it turns into different levels of severity of the legal sanctions provided by the Criminal Code and that, together with other variables, impacts the assessment of criminal dangerousness.
The concept of variable importance is an implicit feature selection performed by RF with a random subspace methodology, and it is assessed by the Gini impurity criterion index (Ceriani and Verme 2012). The Gini index is a measure of the prediction power of variables in regression or classification, based on the principle of impurity reduction (Strobl et al. 2007); it is non-parametric and therefore does not rely on data belonging to a particular type of distribution. For a binary split (dangerous and not dangerous), the Gini index of a node n is calculated as $Gini(n)=1-\sum _{j=1}^2(p_j)^2$, where $p_j$ is the relative frequency of class j in the node n. For splitting a binary node in the best way, the improvement in the Gini index should be maximized. In other words, a low Gini (i.e., a greater decrease in Gini) means that a particular predictor feature plays a greater role in partitioning the data into the two classes. Thus, the Gini index can be used to rank the importance of features for a classification problem.
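The Gini formula in this note, and the notion of a decrease in Gini produced by a split, can be sketched in a few lines of Python. The function names and the toy labels are hypothetical; this is only an illustration of the formula, not the Random Forest implementation used in the experiments.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 1 - sum_j p_j^2 over the class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(parent, left, right):
    """Impurity reduction obtained by splitting parent into left and right."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted


# Toy node with two classes: "d" (dangerous) and "n" (not dangerous).
parent = ["d"] * 4 + ["n"] * 4
perfect = gini_decrease(parent, ["d"] * 4, ["n"] * 4)
useless = gini_decrease(parent, ["d", "d", "n", "n"], ["d", "d", "n", "n"])
```

A feature whose splits yield a larger decrease, as in the `perfect` case, would be ranked as more important than one whose splits leave the class mix unchanged.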
Cross-validation is primarily used to estimate the skill of a machine learning model on unseen data. As clearly explained in James et al. (2013), “this approach involves randomly dividing the set of observations into k groups, or folds, of approximately equal size. The first fold is treated as a validation set, and the method is fit on the remaining k-1 folds”.
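The k-fold procedure quoted in this note can be sketched as an index generator in Python. The function name `k_fold_indices`, the fixed seed, and the sizes used in the example are assumptions made for illustration only.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)          # random division into folds
    folds = [indices[i::k] for i in range(k)]     # k folds of near-equal size
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# Example: 10 observations split into 5 folds.
splits = list(k_fold_indices(10, 5))
```

Each observation appears in exactly one validation fold, and the model would be fit on the remaining k-1 folds at every round.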
We first assess the normality of the data with the Shapiro-Wilk test (Shapiro and Wilk 1965) at a significance level of $\alpha = 0.05$, obtaining a p value of 0.33. We then use Student's t-test (Japkowicz and Shah 2011). We take as the null hypothesis that the difference between the groups is zero and check whether we can reject it at the 0.05 significance level.
We remark that the implementations of the classifiers RF, J48, MLP, Logistic, and NB compared in Sect. 5.5 did not provide an update functionality, hence they were not suitable for this task. SVM, instead, was discarded because of its lower accuracy (see Table 5).
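As an illustration of the second step in this note, the following Python sketch computes a Student t statistic for two samples. The note does not say which variant of the test was used, so the sketch assumes two independent samples with equal variances (pooled variance); the function name and the toy values are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample Student t statistic with pooled variance, plus its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance (assumes equal variances).
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
    return t, df


# Toy samples (hypothetical values).
t, df = pooled_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

The null hypothesis of zero difference would be rejected at the 0.05 level when the absolute value of `t` exceeds the two-sided critical value of the t distribution with `df` degrees of freedom.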
https://www.selenium.dev/documentation/webdriver/.
See the above-mentioned Proposal COM (2021) 206 final.
The reference is to the open letter Research priorities for robust and beneficial artificial intelligence published by the Future of Life Institute. The letter is available online at: https://futureoflife.org/ai-open-letter/.
https://linkurious.com/neo4j/.
https://www.highcharts.com.
https://datatables.net.
https://www.highcharts.com.
https://doc.linkurio.us/ogma/latest/.
https://d3js.org.

References

Akers RL (1973) Deviant behavior: A social learning approach
Aletras N, Tsarapatsanis D, Preoţiuc-Pietro D, Lampos V (2016) Predicting judicial decisions of the european court of human rights: A natural language processing perspective. PeerJ Computer Sci 2:e93
Alves LG, Ribeiro HV, Rodrigues FA (2018) Crime prediction through urban metrics and statistical learning. Phys A: Stat Mech Appl 505:435–443
André O, Peter F, Nellen S (2016) A Visual Approach to the History of Swiss Federal Law. In: DHd 2016: Modelling - Networking - Visualization
Asaro C, Biasiotti MA, Guidotti P, Papini M, Sagri MT, Tiscornia D, et al (2003) A domain ontology: Italian crime ontology. In: Proceedings of the ICAIL 2003 Workshop on Legal Ontologies & Web based legal information management, pp 1–7
Ashley KD (2017) Artificial intelligence and legal analytics: new tools for law practice in the digital age. Cambridge University Press, Cambridge
Berlusconi G, Calderoni F, Parolini N, Verani M, Piccardi C (2016) Link prediction in criminal networks: a tool for criminal intelligence analysis. PLOS One 11(4):1–21. https://doi.org/10.1371/journal.pone.0154244
Bhargava N, Sharma G, Bhargava R, Mathuria M (2013) Decision tree analysis on j48 algorithm for data mining. Proceedings of International Journal of Advanced Research in Computer Science and Software Engineering 3(6)
Boden MA (2016) AI: Its nature and future. Oxford University Press, Oxford
Bogomolov A, Lepri B, Staiano J, Oliver N, Pianesi F, Pentland A (2014) Once upon a crime: towards crime prediction from demographics and mobile data. In: Proceedings of the 16th international conference on multimodal interaction, pp. 427–434
Bostrom N (2017) Superintelligence. Dunod
Boulton G, Campbell P, Collins B, Elias P, Hall W, Laurie G, O’Neill O, Rawlins M, Thornton J, Vallance P, et al. (2012) Science as an open enterprise. The Royal Society
Branting K, Petersen S, Shin D, Finegan J, Balhana C, Lyte A, Pfeifer C (2019) Adept: Automated directive extraction from policy texts. In: Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law, pp 250–251
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Burgess RL, Akers RL (1966) A differential association-reinforcement theory of criminal behavior. Soc Probl 14(2):128–147
Calvó-Armengol A, Zenou Y (2004) Social networks and crime decisions: the role of social structure in facilitating delinquent behavior. Int Econ Rev 45(3):939–958
Carlson K, Dadgostari F, Livermore MA, Rockmore DN (2021) A multinetwork and machine learning examination of structure and content in the united states code. Front Phys 8:676
Carter S, Nielsen M (2017) Using artificial intelligence to augment human intelligence. Distill 2:12
Castano S, Falduti M, Ferrara A, Montanelli S (2022) A knowledge-centered framework for exploration and retrieval of legal documents. Inf Syst 106:101842
Castelfranchi C (2020) For a science-oriented, socially responsible, and self-aware ai: beyond ethical issues. In: 2020 IEEE International Conference on Human-Machine Systems (ICHMS), pp 1–4. IEEE
Ceriani L, Verme P (2012) The origins of the gini index: extracts from variabilità e mutabilità (1912) by corrado gini. J Econ Inequal 10(3):421–443
Chan JB (2001) The technological game: How information technology is transforming police practice. Crim Justice 1(2):139–159
Cioffi-Revilla C (2014) Introduction to computational social science. Springer, London
Clarke RVG (1997) Situational crime prevention. Criminal Justice Press Monsey, NY
Cleary JG, Trigg LE (1995) K*: An instance-based learner using an entropic distance measure. In: Machine Learning Proceedings 1995, pp 108–114. Elsevier
Cohen LE, Felson M (1979) On estimating the social costs of national economic policy: a critical examination of the brenner study. Soc Indicators Res 6(2):251–259
Cohen LE, Felson M (1979) Social change and crime rate trends: a routine activity approach. Am Sociol Rev 87:588–608
Cosimato A, Prisco RD, Guarino A, Malandrino D, Lettieri N, Sorrentino G, Zaccagnino R (2019) The conundrum of success in music: playing it or talking about it? IEEE Access 7:123289–123298
Coupette C, Beckedorf J, Hartung D, Bommarito M, Katz DM (2021) Measuring law over time: a network analytical framework with an application to statutes and regulations in the united states and germany. Front Phys 9:269
Cozza F, Guarino A, Isernia F, Malandrino D, Rapuano A, Schiavone R, Zaccagnino R (2020) Hybrid and lightweight detection of third party tracking: design, implementation, and evaluation. Computer Netw 167:106993
Davies T, Marchione E (2015) Event networks and the identification of crime pattern motifs. PLOS ONE 10(11):1–19
Delahoz-Dominguez EJ, Fontalvo-Herrera TJ, Mendoza-Mendoza AA (2020) Definición de perfiles geográficos de hurto de automóviles. caso aplicado en cartagena. Justicia 25(37):99–108
Easley D, Kleinberg J (2010) Networks, crowds, and markets: reasoning about a highly connected world. Cambridge University Press, Cambridge
Esquivel N, Nicolis O, Peralta B, Mateu J (2020) Spatio-temporal prediction of baltimore crime events using clstm neural networks. IEEE Access 8:209101–209112
Ferrara E, De Meo P, Catanese S, Fiumara G (2014) Detecting criminal organizations in mobile phone networks. Expert Syst Appl 41(13):5733–5750
Filtz E, Navas-Loro M, Santos C, Polleres A, Kirrane S (2020) Events matter: Extraction of events from court decisions. In: Legal Knowledge and Information Systems: JURIX 2020: The Thirty-third Annual Conference, Brno, Czech Republic, December 9-11, 2020, vol. 334, pp 33–42. IOS Press
Floud J (1982) Dangerousness and criminal justice. British J Criminol 22(3):213–228
Francesconi E, Passerini A (2007) Automatic classification of provisions in legislative texts. Artif Intell Law 15(1):1–17
Freeman LC (1978) Centrality in social networks conceptual clarification. Soc Netw 1(3):215–239
Freitag D (2000) Machine learning for information extraction in informal domains. Mach Learn 39(2–3):169–202
Gordon TF (2007) Visualizing carneades argument graphs. Law, Probability and Risk
Grohe M (2020) word2vec, node2vec, graph2vec, x2vec: Towards a theory of vector embeddings of structured data. In: Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pp 1–16
Guarino A, Lettieri N, Malandrino D, Russo P, Zaccagnino R (2019) Visual analytics to make sense of large-scale administrative and normative data. In: 2019 23rd International Conference Information Visualisation (IV), pp 133–138. IEEE
Guarino A, Lettieri N, Malandrino D, Zaccagnino R (2021) A machine learning-based approach to identify unlawful practices in online terms of service: analysis, implementation and evaluation. Neural Computing and Applications pp 1–19
Guarino A, Malandrino D, Zaccagnino R (2022) An automatic mechanism to provide privacy awareness and control over unwittingly dissemination of online private information. Computer Netw 202:108614
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The weka data mining software: an update. ACM SIGKDD Explor Newslett 11(1):10–18
Harcourt BE (2015) Exposed. Harvard University Press, Cambridge
Harvard Law Review Student Note (1982) Selective incapacitation: reducing crime through predictions of recidivism. Harvard Law Rev 96:511–533.
Hepler AB, Dawid AP, Leucari V (2007) Object-oriented graphical representations of complex patterns of evidence. Probabil Risk Law 25:87
Humphreys P (2004) Extending ourselves: computational science, empiricism, and scientific method. Oxford University Press, UK
Hvistendahl M (2016) Crime forecasters. Science 353(6307):1484–1487 https://doi.org/10.1126/science.353.6307.1484. https://science.sciencemag.org/content/353/6307/1484
James G, Witten D, Hastie T, Tibshirani R (2013) An introduction to statistical learning, vol 112. Springer, Cham
Japkowicz N, Shah M (2011) Evaluating learning algorithms: a classification perspective. Cambridge University Press, Cambridge
Jeh G, Widom J (2002) Simrank: a measure of structural-context similarity. In: Proc. of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pp 538–543
Katz D, Bommarito M (2014) Legal analytics. introduction to the course. https://bit.ly/3BXTlwX
Katz DM, Bommarito MJ (2014) Measuring the complexity of the law: the united states code. Artif Intell Law 22(4):337–374
Katz DM, Bommarito MJ, Blackman J (2017) A general approach for predicting the behavior of the supreme court of the united states. PloS One 12(4):e0174698
Katz DM, Gubler JR, Zelner J, Bommarito MJ (2011) Reproduction of hierarchy-a social network analysis of the American law professoriate. J Legal Educ 61:76
Kaufman KA, Michalski RS (2005) From data mining to knowledge mining. Handbook Stat 24:47–75
Kehl DL, Kessler SA (2017) Algorithms in the criminal justice system: Assessing the use of risk assessments in sentencing
Keim D, Kohlhammer J, Ellis G, Mansmann F (2010) Mastering the information age: solving problems with visual analytics
Kim S, Joshi P, Kalsi PS, Taheri P (2018) Crime analysis through machine learning. In: 2018 IEEE 9th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), pp 415–420. IEEE
Kolaczyk ED, Csárdi G (2014) Statistical analysis of network data with R, vol 65. Springer, Cham
Koschützki D, Lehmann KA, Peeters L, Richter S, Tenfelde-Podehl D, Zlotowski O (2005) Centrality indices. network analysis. Springer, Cham
Kuppevelt D, Dijck G (2017) Answering legal research questions about dutch case law with network analysis and visualization. In: Legal Knowledge and Information Systems: JURIX 2017: The Thirtieth Annual Conference, vol. 302, p 95. IOS Press
Larkin JH, Simon HA (1987) Why a diagram is (sometimes) worth ten thousand words. Cognit Sci 11(1):65–100
Laune FF (1936) Predicting criminality: Forecasting behavior on parole. 1. Northwestern university
Leitner E, Rehm G, Moreno-Schneider J (2019) Fine-grained named entity recognition in legal documents. In: International Conference on Semantic Systems, pp 272–287. Springer
Lettieri N (2020) Law in Turing’s cathedral notes on the algorithmic turn of the legal universe. In: Barfield W (ed) The Cambridge handbook of the law of algorithms. Cambridge University Press, Cambridge
Lettieri N (2020) Law, rights, and the fallacy of computation. on the hidden pitfalls of predictive analytics. Jura Gentium 17(2):72–87
Lettieri N, Altamura A, Faggiano A, Malandrino D (2016) A computational approach for the experimental study of eu case law: analysis and implementation. Soc Netw Anal Min 6(1):56
Lettieri N, Altamura A, Giugno R, Guarino A, Malandrino D, Pulvirenti A, Vicidomini F, Zaccagnino R (2018) Ex machina: analytical platforms, law and the challenges of computational legal science. Future Internet 10(5):37
Lettieri N, Altamura A, Malandrino D (2017) The legal macroscope: experimenting with visual legal analytics. Inf Visual 16(4):332–345
Lettieri N, Altamura A, Malandrino D, Punzo V (2017) Agents shaping networks shaping agents: Integrating social network analysis and agent-based modeling in computational crime research. In: EPIA Conference on Artificial Intelligence, pp 15–27. Springer
Lettieri N, Faro S, Malandrino D, Faggiano A, Vestoso M (2018) Network, visualization, analytics. a tool allowing legal scholars to experimentally investigate eu case law. In: U. Pagallo, M. Palmirani, P. Casanovas, G. Sartor, S. Villata (eds.) AI Approaches to the Complexity of Legal Systems, pp 543–555. Springer International Publishing, Cham
Lettieri N, Guarino A, Malandrino D (2018) E-science and the law. three experimental platforms for legal analytics. In: Legal Knowledge and Information Systems - JURIX 2018: The Thirty-first Annual Conference, Groningen, The Netherlands, 12-14 December 2018., pp 71–80
Lettieri N, Guarino A, Malandrino D, Zaccagnino R (2020) The affordance of law. sliding treemaps browsing hierarchically structured data on touch devices. In: 2020 24th International Conference Information Visualisation (IV), pp 16–21. IEEE
Lettieri N, Guarino A, Malandrino D, Zaccagnino R (2021) The sight of justice. visual knowledge mining, legal data and computational crime analysis. In: 2021 25th International Conference Information Visualisation (IV), pp 267–272. IEEE
Lettieri N, Malandrino D, Spinelli R, Rinaldi C (2013) Text and (social) network analysis as investigative tools: a case study. In: Law and Computational Social Science, pp 263–280. ESI
Lettieri N, Malandrino D, Vicidomini L (2017) By investigation, i mean computation. Trends Organized Crime 20(1–2):31–54
Licklider JC (1960) Man-computer symbiosis. IRE Trans Human Factors Electron 1:4–11
Lin YL, Chen TY, Yu LC (2017) Using machine learning to assist crime prevention. In: 2017 6th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), pp 1029–1030. IEEE
Lippi M, Pałka P, Contissa G, Lagioia F, Micklitz HW, Sartor G, Torroni P (2019) Claudette: an automated detector of potentially unfair clauses in online terms of service. Artif Intell Law 27(2):117–139
Long JB, Ehrenfeld JM (2020) The role of augmented intelligence (ai) in detecting and preventing the spread of novel coronavirus
Lui A, Lamb GW (2018) Artificial intelligence and augmented intelligence collaboration: regaining trust and confidence in the financial sector. Inf Commun Technol Law 27(3):267–283
Malcai O, Shur-Ofry M (2021) Using complexity to calibrate legal response to covid-19. Front Phys 9:164
Maron ME (1961) Automatic indexing: an experimental inquiry. J ACM (JACM) 8(3):404–417
Masías VH, Valle M, Morselli C, Crespo F, Vargas A, Laengle S (2016) Modeling verdict outcomes using social network measures: the watergate and caviar network cases. PloS one 11:1
McHugh ML (2012) Interrater reliability: the kappa statistic. Biochemia medica 22(3):276–282
Medvedeva M, Vols M, Wieling M (2020) Using machine learning to predict decisions of the European court of human rights. Artif Intell Law 28(2):237–266
Meneses-Escobar CA, Castillo-Rodríguez CM, Rodas-Vásquez A (2019) Análisis espacial y temporal del hurto de celulares, pereira, risaralda, año 2018. Revista Logos Ciencia & Tecnología 11(2):167–175
Mitchell TM (2005) Logistic regression. Mach Learn 10:701
Mochales R, Moens MF (2011) Argumentation mining. Artif Intell Law 19(1):1–22
Mohler G, Porter MD (2018) Rotational grid, pai-maximizing crime forecasts. Stat Anal Data Min: ASA Data Sci J 11(5):227–236
Moreno JL (1937) Sociometry in relation to other social sciences. Sociometry 1(1/2):206–219
Morselli C (2009) Inside criminal networks, vol 8. Springer, Cham
Nissan E (2009) Legal evidence, police intelligence, crime analysis or detection, forensic testing, and argumentation: an overview of computer tools or techniques. Int J Law Inf Technol 17(1):1–82
Noble SU (2018) Algorithms of oppression. New York University Press, New York
O’Neil C (2016) Weapons of math destruction: How big data increases inequality and threatens democracy. Crown
Ordoñez-Eraso HA, Pardo-Calvache CJ, Cobos-Lozada CA (2020) Detección de tendencias de homicidios en colombia usando machine learning. Revista Facultad de Ingeniería 29(54):e11740
Ormerod P, Wiltshire G (2009) ‘Binge’ drinking in the uk: a social network phenomenon. Mind & Soc 8(2):135
Ovádek M, Dyevre A, Wigard K (2021) Analysing eu treaty-making and litigation with network analysis and natural language processing. Front Phys 9:202
Pedraza-Fariña LG, Whalen R (2020) A network theory of patentability. Univ Chicago Law Rev 87(1):63–144
Powers DM (2011) Evaluation: from precision, recall and f-measure to roc, informedness, markedness and correlation
Prager J, Brown E, Coden A, Radev D (2000) Question-answering by predictive annotation. In: Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’00, pp 184–191. ACM, New York, NY, USA. https://doi.org/10.1145/345508.345574
Quinlan JR (2014) C4.5: programs for machine learning. Elsevier, Amsterdam
Richardson R, Schultz JM, Crawford K (2019) Dirty data, bad predictions: How civil rights violations impact police data, predictive policing systems, and justice. NYUL Rev 94:15
Rockmore DN, Carlson K, Dadgostari F, Livermore M (2020) A multinetwork and machine learning examination of structure and content in the united states code. Front Phys 8:676
Ruhl J, Katz DM, Bommarito MJ (2017) Harnessing legal complexity. Science 355(6332):1377–1378
Rumelhart DE, Hinton GE, Williams RJ et al (1988) Learning representations by back-propagating errors. Cognit Model 5(3):1
Rummens A, Hardyns W, Pauwels L (2017) The use of predictive analysis in spatiotemporal crime forecasting: building and testing a model in an urban context. Appl Geogr 86:255–261
Russell S (2019) Human compatible: Artificial intelligence and the problem of control. Penguin
Sarica A, Cerasa A, Quattrone A (2017) Random forest algorithm for the classification of neuroimaging data in alzheimer’s disease: a systematic review. Front Aging Neurosci 9:329
Schwartz MD (2021) Modern machine learning and particle physics. http://arxiv.org/abs/2103.12226
Shaheen Z, Wohlgenannt G, Filtz E (2020) Large scale legal text classification using transformer models. http://arxiv.org/abs/2010.12871
Shapiro SS, Wilk MB (1965) An analysis of variance test for normality (complete samples). Biometrika 52(3/4):591–611
Sharma M (2019) Augmented intelligence: a way for helping universities to make smarter decisions. emerging trends in expert applications and security. Springer, Cham
Short JF, Strodtbeck FL (1965) Group process and gang delinquency. University of Chicago Press, Chicago
Shulayeva O, Siddharthan A, Wyner A (2017) Recognizing cited facts and principles in legal judgements. Artif Intell Law 25(1):107–126
Smith TA (2007) The web of law. San Diego L Rev 44:309
Strobl C, Boulesteix AL, Augustin T (2007) Unbiased split selection for classification trees based on the gini index. Comput Stat Data Anal 52(1):483–501
Sutherland EH, Cressey DR, Luckenbill DF (1992) Principles of criminology. Altamira Press, UK
Taroni F, Biedermann A, Bozza S, Garbolino P, Aitken C (2014) Bayesian networks for probabilistic inference and decision analysis in forensic science. Wiley, New Jersey
Tashea J (2017) Calculating crime. ABAJ 103:54
Tillers P (2005) Picturing factual inference in legal settings
du Toit N (2019) Network visualisation as a citator user interface. J Open Access L 7:1
Verheij B (2007) Argumentation support software: boxes-and-arrows and beyond. Law, Probability and Risk
Wang L (2005) Support vector machines: theory and applications, vol 177. Springer Science & Business Media, Cham
Wang T, Rudin C, Wagner D, Sevieri R (2013) Learning to detect patterns of crime. In: Joint European conference on machine learning and knowledge discovery in databases, pp 515–530. Springer
Wang Z, Wei L, Peng S, Deng L, Niu B (2018) Child-trafficking networks of illegal adoption in china. Nature Sustain 1(5):254–260
Whalen R (2016) Legal networks: the promises and challenges of legal network analysis. Mich. St. L. Rev, p 539
Wheeler AP, Steenbeek W (2020) Mapping the risk terrain for crime using machine learning. J Quant Criminol 45:1–36
Wikström POH (2004) Crime as alternative: towards a cross-level situational action theory of crime causation. Beyond Empiricism: Instit Intentions Study Crime 13:1–37
Wikström POH (2006) Individuals, settings, and acts of crime: situational mechanisms and the explanation of crime. Explanation Crime: Context, Mech Develop 45:61–107
Yau KLA, Lee HJ, Chong YW, Ling MH, Syed AR, Wu C, Goh HG (2021) Augmented intelligence: surveys of literature and expert opinion to understand relations between human intelligence and artificial intelligence. IEEE Access 25:71
Yuan L, Wang J, Fan S, Bian Y, Yang B, Wang Y, Wang X (2019) Automatic legal judgment prediction via large amounts of criminal cases. In: 2019 IEEE 5th International Conference on Computer and Communications (ICCC), pp 2087–2091. IEEE
Zheng Nn, Liu Zy, Ren Pj, Ma Yq, Chen St, Yu Sy, Xue Jr, Chen Bd, Wang Fy (2017) Hybrid-augmented intelligence: collaboration and cognition. Front Inf Technol Electron Eng 18(2):153–179

Acknowledgements

The authors would like to thank Dr. Luigi Landolfi (deputy prosecutor of the Antimafia District Department of Naples) and Carlo Rinaldi (deputy prosecutor of the Criminal Court of Salerno) for their contributions and suggestions. The authors are deeply grateful to Margherita Vestoso and Ilaria Cecere for their insightful comments and the support provided in proofreading the work. This paper is dedicated to the memory of Domenico Parisi, a visionary researcher and a source of inspiration for us as for generations of scholars around the world.

Funding

No funds, grants, or other support was received.

Author information

Authors and Affiliations

National Institute for Public Policy Analysis (INAPP), Rome, Italy
Nicola Lettieri
University of Salerno, Fisciano, Italy
Alfonso Guarino, Delfina Malandrino & Rocco Zaccagnino

Contributions

The authorship of the work presented here can be attributed as follows. NL: CrimeMiner concept and functional design; legal and computational social science profiles of the research. DM, RZ, AG: CrimeMiner technical and architectural design; computer science profiles of the research. The case study is the result of a joint effort of the authors.

Corresponding author

Correspondence to Nicola Lettieri.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix: CrimeMiner data tier

The Data Tier represents the first tier of the knowledge mining architecture we conceived (see an overall sketch in Fig. 13).

In this tier, we extract entities and relations from trial documents. For the extraction we adopt a straightforward string-matching mechanism. For entities such as names, we employ dictionaries of Italian names. For locations, we employ an open maps dataset covering the Campania region. For relations such as tapped phone calls, we search for the main Italian expressions referring to calls (a manually crafted list), e.g., “ha chiamato”, “ha telefonato”, “chiama”, “telefona”, “parla al telefono”, “conversazione”, and so on (in English, “has called”, “has phoned”, “calls”, “phones”, “talks on the phone”, “conversation”). The result is an XML file that is then imported into the CrimeMiner database. Since this strategy can introduce errors during data entry, we introduced a document-enhancement functionality through which the user interacts with the trial documents and supports the extraction procedure. This solution, loosely inspired by the tagging mechanisms of social media, helps avoid irregularities such as the following: when the same person is reported as Giuseppe, Peppe, or Peppino (in English, roughly Joseph, Joe, or Jo), three entities are produced and, consequently, three different nodes appear in the final graph.

As shown in Fig. 14, CrimeMiner provides an advanced editor that allows the creation of a database containing all the information relevant to the investigation. Figure 14 gives an example of how the system highlights tagged people in the trial document (for example, those matched against the dictionaries). The system helps avoid data-entry errors by signalling that the name of the person the user is typing is already available in the database (in the example, the person, anonymized as “Node 72”, is highlighted in violet). In addition, to allow a check on the correctness of the system’s suggestion, CrimeMiner shows details about that individual (see Fig. 14) when the user clicks on the name.
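The dictionary-based matching described above can be illustrated with a minimal Python sketch. It is not CrimeMiner's actual code: the name dictionary, the alias table and all function names are hypothetical, and the rule that the first matched name is the caller is a simplifying assumption. Only the call expressions are taken from the text.

```python
import re

# Hypothetical, tiny stand-in for the dictionaries of Italian names.
NAME_DICTIONARY = {"Giuseppe", "Peppe", "Peppino", "Antonio"}
# Hypothetical alias table mapping name variants to one canonical form,
# so that variants do not become separate nodes in the graph.
ALIASES = {"Peppe": "Giuseppe", "Peppino": "Giuseppe"}
# Call-related expressions quoted in the text.
CALL_PATTERNS = ["ha chiamato", "ha telefonato", "chiama", "telefona",
                 "parla al telefono", "conversazione"]


def canonical(name):
    """Return the canonical form of a name (the name itself if no alias)."""
    return ALIASES.get(name, name)


def extract_people(sentence):
    """Return canonical names found in the sentence, in order, no repeats."""
    people = []
    for token in re.findall(r"[A-Za-zÀ-ÿ]+", sentence):
        if token in NAME_DICTIONARY:
            person = canonical(token)
            if person not in people:
                people.append(person)
    return people


def mentions_call(sentence):
    """True if the sentence contains one of the call expressions."""
    lowered = sentence.lower()
    return any(pattern in lowered for pattern in CALL_PATTERNS)


def extract_call(sentence):
    """Return (caller, callee) if a call between two people is mentioned.

    Assumption: the first matched name is the caller, the second the callee.
    """
    people = extract_people(sentence)
    if mentions_call(sentence) and len(people) >= 2:
        return (people[0], people[1])
    return None
```

Under these assumptions, `extract_call("Peppino ha chiamato Antonio ieri sera")` yields the pair `("Giuseppe", "Antonio")`, so the variant "Peppino" no longer produces a separate entity.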

Appendix: CrimeMiner’s architecture

CrimeMiner is built upon the Java EE Spring Data Neo4j framework; its architecture is structured in three layers, as shown in Fig. 15. We describe each of them in the following.

Storage: this layer stores all the data (including graphs) under examination. Managing data in this layer requires communication between Neo4j and Spring Data Neo4j, which is accomplished by a Neo4j HTTP driver (integrated into the system through a Maven dependency). The data stored include personal details of the investigated individuals, tappings (telephone and environmental) and, finally, the document created by the user by means of CrimeMiner;
Server: this layer is responsible for mapping Neo4j relations and entities to Java classes. It also processes the mapped data and exposes the developed services to the top layer (through a REST service returning JSON data). All SNA metrics are also defined in this layer. Here, the tool invokes the WEKA library to classify individuals under investigation and to update the KStar model underlying the “Dangerousness module”;
Client: this layer includes the user interface allowing PPs to interact with CrimeMiner’s features. Processed data are shown to the user by means of JavaScript libraries: graph visualizations via Linkurious^{Footnote 21}, 2D and 3D graphics with Highcharts^{Footnote 22} and, finally, tables with rich functionalities using Datatables.^{Footnote 23}

Appendix: CrimeMiner interaction tier

This appendix sketches how CrimeMiner’s users interact with the system. In particular, we detail how visual metaphors are used to enable the exploration of the features of the organization and its members (organization structure, roles and criminal profiles of single individuals) and their evolution over time. To this end, three visualizations are taken into account in the following:

Basic graph visualizations, like those in Sect. 5.1 and more (Appendix C.1);
Similarity visualizations (Appendix C.2);
Temporal graph visualization (Appendix C.3).

We remark that all visualizations are accessible from a left-side menu panel in CrimeMiner (see Fig. 16).

1.1 Basic graphs visualizations

CrimeMiner offers a graph visualization module where the user can select and visualize different types of graphs (e.g., those defined in Sect. 5.1). In Fig. 17 we show the individual-phone call multigraph, the bipartite individual-environment graph and its projection. The projection is a graph $G=(V,E)$ built from individual-environmental tapping data, where $V=\left\{ v_{1},v_{2},\ldots ,v_{n}: v_{i} \text{ is an individual}\right\}$ and $(v_{i},v_{j}) \in E \Leftrightarrow v_{i}$ and $v_{j}$ were involved in the same environmental tapping. When the user clicks on a node of interest, the tool shows its main information and highlights its edges. In addition, from this module it is possible to apply all the NA metrics (centrality measures) seen in Sect. 5.4. Applying a specific metric affects node size: the higher the metric value, the bigger the node.
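As an illustration of the projection just defined, the following Python sketch derives the individual-individual graph from a bipartite individual-environmental tapping structure. The input format (a dictionary from tapping identifiers to sets of individuals) and the function names are assumptions made for the example. Degree is used only as one simple instance of a centrality measure that could drive node size.

```python
from itertools import combinations


def project_on_individuals(participation):
    """Project a bipartite individual-tapping graph onto individuals.

    participation: dict mapping a tapping id to the set of individuals
    involved in that environmental tapping (hypothetical input format).
    Two individuals are linked iff they share at least one tapping.
    """
    nodes = set()
    edges = set()
    for individuals in participation.values():
        nodes.update(individuals)
        # Sorting gives each undirected edge one canonical orientation.
        for a, b in combinations(sorted(individuals), 2):
            edges.add((a, b))
    return nodes, edges


def degree(nodes, edges):
    """Degree of each node in the projection (a simple centrality)."""
    deg = {v: 0 for v in nodes}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg
```

For instance, with tapping `t1` involving A, B and C, and tapping `t2` involving B and D, the projection links A-B, A-C, B-C and B-D, and B has the highest degree.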

1.2 Similarities between criminals at a glance

The similarity, in terms of criminal profile, between two or more individuals is a piece of information of major interest at the investigative level.

CrimeMiner offers a Similarity module implementing SimRank (Jeh and Widom 2020), a similarity measure based on a graph-theoretical model. The module provides the user with information about similarities (in terms of individuals’ characteristics and social relationships) that may exist between individuals under investigation (Lettieri et al. 2021). To ease the understanding of the similarity measures applied to the various types of graphs, we create 2D and 3D graphics with the Highcharts^{Footnote 24} JavaScript library. In Fig. 18a, we show a 3D plot in which the x- and z-axes report individuals’ names and the y-axis shows the similarity percentage between each pair of individuals. Each pair is represented as a point in the Cartesian space, so that the highest points correspond to the most similar pairs. On mouse-over, the Similarity module shows the names and the similarity percentage of each pair of individuals. Using the “Data Settings” option, the user can modify the SimRank threshold to better visualize the levels of similarity he or she is interested in (default: $[30\%, 100\%]$) and, of course, it is possible to select just one individual to get his or her SimRank with all the others (see Fig. 18b). In this case, the Similarity module uses a bar chart in which the bars represent the similarity (the higher the bar, the more similar the pair) and the x-axis lists the individuals belonging to the network.
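To make the measure concrete, here is a minimal Python sketch of the standard iterative SimRank computation on an undirected graph, followed by a filter that mimics the percentage threshold. It is an illustration, not CrimeMiner's implementation. The decay factor, the number of iterations, the input format and the function names are all assumptions.

```python
def simrank(neighbors, decay=0.8, iterations=10):
    """Iterative SimRank on an undirected graph.

    neighbors: dict mapping each node to the set of its neighbors.
    decay and iterations are illustrative values, not taken from the paper.
    """
    nodes = list(neighbors)
    sim = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(iterations):
        new = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[a][b] = 1.0
                elif not neighbors[a] or not neighbors[b]:
                    new[a][b] = 0.0
                else:
                    # Average similarity over all pairs of neighbors.
                    total = sum(sim[i][j]
                                for i in neighbors[a]
                                for j in neighbors[b])
                    new[a][b] = decay * total / (
                        len(neighbors[a]) * len(neighbors[b]))
        sim = new
    return sim


def pairs_in_range(sim, low=30.0, high=100.0):
    """Return (a, b, percentage) for pairs whose similarity is in range."""
    result = []
    for a in sim:
        for b in sim[a]:
            pct = sim[a][b] * 100
            if a < b and low <= pct <= high:
                result.append((a, b, pct))
    return result
```

In a toy network where A and B are each connected only to C, the two are judged similar because they share their only contact, while neither reaches the threshold with C itself.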

1.3 About time: visualizing the evolution of criminal networks

Studying the diachronic evolution of the criminal network can better support the evaluation of dangerous individuals, highlighting suspicious social patterns. CrimeMiner offers a temporal graph-based visualization module that lets the user browse the social relationships between individuals in the criminal network over the period of investigation. This helps to grasp the evolution of social patterns and to spot suspicious ones. An abstracted version of the visualization is available in Fig. 19a–b.

The temporal graph-based visualization module works with every kind of social relation (and thus graph) that CrimeMiner handles; here we present the details of the visualization using phone calls as a testbed. Fig. 19 depicts the temporal multigraph $G=(V,E)$. Formally, let $V=\left\{ v_{1},v_{2},\ldots ,v_{n}\right\}$ be a set of individuals, and let $E$ be a set of tapped phone calls such that $(v_{i},v_{j})_t \in E$ if there exists a tapped phone call from $v_i$ to $v_j$ at time $t$, with $1 \le i,j \le n$ and $i \ne j$. We define $C=\{0, 1,\ldots , 360\}$ as the set of colors according to the HSL model (Hue, Saturation, Lightness), which goes from blue to red, and a color function $color: V \rightarrow C$, assigning to every node $v_i\in V$ a color $c\in C$. The color of $v_i$ depends on the sum of outgoing and ingoing edges of $v_i$. We define $\forall v_i\in V$, $E_{v_i}=\{(v_j,v_k)_t \mid v_j=v_i \vee v_k=v_i\}$, with $E_{v_i}\subseteq E$, as the set of outgoing and ingoing edges of $v_i$ at time $t$. Therefore, the shade of $v_i$ is computed as

$$\begin{aligned} color(v_i) = \Bigl \lfloor \frac{(bc_{v_i,G_t}-m_t)}{(M_t-m_t)}\times 360 \Bigr \rfloor \end{aligned}$$

where $bc_{v_i,G_t}$ denotes the centrality value of $v_i$ in the graph at time $t$, and $m_t$ and $M_t$ are the minimum and maximum centrality values at time $t$, respectively.
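The min-max normalization in the color formula can be sketched in Python as follows. The input is assumed to be a dictionary of per-node centrality values for one time snapshot; the function name and the handling of the degenerate case $M_t = m_t$ (which the formula leaves undefined) are assumptions made for the example.

```python
import math


def node_colors(centrality):
    """Map each node's centrality at time t to a hue in {0, ..., 360}.

    centrality: dict mapping a node to its centrality value at time t
    (hypothetical input format).
    """
    m_t = min(centrality.values())
    M_t = max(centrality.values())
    if M_t == m_t:
        # Assumption: when all values coincide the formula is undefined,
        # so every node gets hue 0.
        return {v: 0 for v in centrality}
    return {
        v: math.floor((c - m_t) / (M_t - m_t) * 360)
        for v, c in centrality.items()
    }
```

With centrality values 2, 5 and 8, the nodes receive hues 0, 180 and 360 respectively, so the least and most central nodes sit at the two ends of the scale.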

We define a thickness function $thick: E \rightarrow {\mathbb {N}}$ which assigns a thickness to the edges of the temporal graph, acting as a grouping function to reduce visual clutter. The thickness of a visualized edge at time $t$ is proportional to the number of edges between two nodes at time $t$. Formally, let $v_i,v_j\in V$ be two nodes; we compute the thickness as

$$\begin{aligned} thick((v_i,v_j)_t)=|E_{v_i}\cap E_{v_j}| \end{aligned}$$
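The grouping performed by the thickness function can be sketched as below. Since $E_{v_i}\cap E_{v_j}$ contains exactly the calls at time $t$ that involve both $v_i$ and $v_j$, the sketch counts calls between each unordered pair. The tuple layout `(caller, callee, time)` and the function name are assumptions.

```python
from collections import Counter


def edge_thickness(calls, t):
    """Count the calls between each unordered pair of nodes at time t.

    calls: list of (caller, callee, time) tuples with caller != callee
    (hypothetical input format). Calls in both directions are grouped
    into a single visualized edge.
    """
    thickness = Counter()
    for caller, callee, time in calls:
        if time == t:
            thickness[frozenset((caller, callee))] += 1
    return thickness
```

Two calls at the same time, one from A to B and one from B to A, are therefore drawn as a single edge of thickness 2.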

By double-clicking on a node of interest, the user highlights all its social relationships belonging to the first level (individuals who were called by, or who called, the node; informally, “friends”) and to the second level (individuals who were called by, or who called, individuals of the first level; informally, “friends of friends”). The timeline at the bottom spans the whole investigation period, in our case from October 2002 to 2006. By default, the temporal graph displayed covers the whole investigation period, but the timeline allows the user to display the criminal network at a time $t$ or in a time range $[t_s,t_e]$ (see Fig. 19c–d). The timeline is composed of two graphics sharing the x-axis as the time axis. The y-axis represents the number of tapped phone calls. Therefore, a point $P(t_p,pc_p)$ in the timeline represents the number of tapped phone calls $pc_p=|E|$ in the temporal graph at a specific time $t_p$. Furthermore, these points are colored according to the total duration of the tapped phone calls: the longer the duration, the darker the color.
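The interactions described above (timeline points, time-range selection, and first- and second-level contacts) can be sketched in Python as follows. The tuple layout `(caller, callee, time, duration)`, the use of sortable month strings such as `"2002-10"` as time values, and the function names are assumptions introduced for illustration.

```python
from collections import defaultdict


def timeline_points(calls):
    """Return {time: (number_of_calls, total_duration)} sorted by time."""
    points = defaultdict(lambda: [0, 0])
    for _, _, time, duration in calls:
        points[time][0] += 1
        points[time][1] += duration
    return {t: tuple(v) for t, v in sorted(points.items())}


def calls_in_range(calls, t_start, t_end):
    """Keep only the calls whose time lies in [t_start, t_end]."""
    return [c for c in calls if t_start <= c[2] <= t_end]


def contacts(calls, node):
    """Return (first_level, second_level) contacts of a node."""
    first = set()
    for caller, callee, _, _ in calls:
        if caller == node:
            first.add(callee)
        elif callee == node:
            first.add(caller)
    second = set()
    for caller, callee, _, _ in calls:
        # Contacts of first-level nodes, excluding the node itself
        # and nodes already in the first level.
        if caller in first and callee != node and callee not in first:
            second.add(callee)
        elif callee in first and caller != node and caller not in first:
            second.add(caller)
    return first, second
```

Each entry of `timeline_points` corresponds to one point $P(t_p,pc_p)$, with the total duration available to drive the darkness of its color.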

To implement this module, we adopted two JavaScript libraries, namely Ogma^{Footnote 25} and D3.js.^{Footnote 26}

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lettieri, N., Guarino, A., Malandrino, D. et al. Knowledge mining and social dangerousness assessment in criminal justice: metaheuristic integration of machine learning and graph-based inference. Artif Intell Law 31, 653–702 (2023). https://doi.org/10.1007/s10506-022-09334-7

Accepted: 22 September 2022
Published: 20 October 2022
Issue Date: December 2023
DOI: https://doi.org/10.1007/s10506-022-09334-7
