Abstract
When we talk about the coherence of a story, we seem to have in mind how well its individual pieces fit together. But how is this notion to be explicated formally? We develop a Bayesian-network-based coherence measure, implemented in R, which performs better than its purely probabilistic predecessors. The novelty is that, by paying attention to the network structure, we avoid simply taking mean confirmation scores over all possible pairs of subsets of a narration. Moreover, we assign special importance to the weakest links in a narration, which improves on the other measures' results for logically inconsistent scenarios. We illustrate and investigate the performance of the measures on a few philosophically motivated examples and, more extensively, on the real-life example of the Sally Clark case.
Notes
Another notable perspective is provided by the argumentative approaches, which focus on arguments based on evidence, meant to support or attack conclusions. The approach is inspired by Wigmore (2012), and heavily relies on diagrams of the structure of arguments (Anderson 2007; Bex et al. 2003), sometimes used in argument mapping software tools (Verheij 2007). There are some translations between arguments and Bayesian networks (Timmer et al. 2014), although no single commonly accepted normative framework for the argumentative approach exists, which limits the applicability of the approach and is in contrast with the widespread use of Bayesian networks. One can also find in the literature a range of mixed approaches based on the conviction that there is no real disagreement between the narrative, argumentative and probabilistic approaches (Spottswood 2013; Shen et al. 2006; Bex et al. 2010; Vlek et al. 2014; Verheij 2014; Verheij et al. 2016; Verheij 2017).
There is a related notion in the neighborhood on which an agent's degrees of belief are coherent just in case they are probabilistic. We will not use this notion in this paper.
This is, ultimately, with some bells and whistles, what Vlek et al. (2013) suggest. We postpone a more elaborate discussion till later.
The other aspect is coverage, which reflects the extent to which the story accounts for the evidence.
To be fair, they do say that one factor important for plausibility evaluation is the form of the arguments involved. They briefly refer to the classification of (around 40) inference forms by Collins et al. (1988). But, again, this classification is rather informal, and it is unclear how exactly it is to be justified, or used in the practice of narration evaluation.
Another problem is that the principles used are quite debatable. For example, Thagard requires that if propositions \(E_1, \dots , E_m\) explain H, the degree of coherence of \(E_1, \dots , E_m, H\) is inversely proportional to the number of propositions \(E_1, \dots , E_m\). This is a very crude simplicity criterion. There are multiple ways a description of events can be cut into propositions, and merely counting propositions does not provide an enlightening measure of the complexity of an explanation.
Source R code and a brief manual for the coherence calculating functions are available here: https://rflurbaniak.github.io/coherence/.
Another way present in the literature is to formulate abstract formal requirements for a coherence measure and to investigate whether a given coherence measure satisfies them. One paper where some such principles have been formulated is (Schippers 2014), but they do not seem intuitive to us and we think progress in this direction, while desirable, would require a more elaborate justification of the underlying principles. For this reason, we decided to focus on developing a notion that avoids at least the fairly obvious difficulties with the existing measures instead.
Yes, we are aware of argumentative frameworks to model evidential dependencies in an argument; we already mentioned them in footnote 1, but since they are not quantitative, we stick with Bayesian networks.
D is not displayed as uniform due to rounding and the requirement that the probabilities sum to 1.
Note that one might have a somewhat different view on what these should be. For one thing, the probability that the second child has been killed if the first died of SIDS is extremely low. For another, the probability of signs of bruising in case of murder increases from .01 to only .05. Moreover, the probabilities might look too specific to the reader. An analysis with a range of alternative CPTs might indeed be worthwhile.
Why would one be worried about Z? For one thing, Fitelson (2021) raises the following problem. Say two items of evidence E and \(E'\) are confirmationally independent regarding a hypothesis H just in case both \(c(H, E \vert E' ) = c(H, E )\) and \(c(H, E' \vert E ) = c(H, E')\), where c is a confirmation level. Say E and \(E'\) are conflicting evidence regarding H iff \(\mathsf {P}(H\vert E)> \mathsf {P}(H)\) while \(\mathsf {P}(H\vert E') < \mathsf {P}(H)\). It turns out that any measure ordinally equivalent with Z excludes the possibility of the existence of confirmationally independent and yet conflicting items of evidence, which Fitelson considers counter-intuitive.
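To see these definitions in action, here is a small numerical sketch (the model and all probabilities are assumed purely for illustration, and we use the Crupi–Tentori–Gonzalez formulation of Z): with a fifty-fifty prior on H and with E and \(E'\) conditionally independent given H, the two items of evidence conflict, and one can check that their Z-confirmation levels then fail to be confirmationally independent.

```python
def z(prior, posterior):
    # Z measure (Crupi-Tentori-Gonzalez formulation, assumed here):
    # normalized difference between posterior and prior probability of H.
    if posterior >= prior:
        return (posterior - prior) / (1 - prior)
    return (posterior - prior) / prior

# Illustrative model (assumed): P(H) = 0.5, with E and E' conditionally
# independent given H.
p_h = 0.5
p_e_given = {True: 0.8, False: 0.3}    # P(E | H), P(E | not-H)
p_ep_given = {True: 0.2, False: 0.6}   # P(E' | H), P(E' | not-H)

def p_h_given(e=False, ep=False):
    # Posterior of H after observing the indicated items of evidence.
    num, den = p_h, 1 - p_h
    if e:
        num, den = num * p_e_given[True], den * p_e_given[False]
    if ep:
        num, den = num * p_ep_given[True], den * p_ep_given[False]
    return num / (num + den)

ph_e = p_h_given(e=True)              # ~0.73: E confirms H
ph_ep = p_h_given(ep=True)            # 0.25: E' disconfirms H, so they conflict
ph_both = p_h_given(e=True, ep=True)  # ~0.47

# Z-confirmation of H by E, unconditionally and against background E':
z_uncond = z(p_h, ph_e)        # ~0.45
z_given_ep = z(ph_ep, ph_both) # ~0.29, so confirmational independence fails
```

Here E and \(E'\) conflict regarding H, yet Z assigns different confirmation levels to E unconditionally and conditional on \(E'\), in line with Fitelson's observation.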
The problem is a particular case of a common problem in statistics: how to represent a set of different values in a simple way without distorting the information too much? One easy and accurate solution is to plot all values. The problem is, it gives us no unambiguous way to compare different sets. For such tasks, a single score is desirable.
This might seem in line with the average mutual support measures. However, on our approach we only care about specific directions of support.
To take the simplest example, if two elements are logically inconsistent, the whole narration is incoherent, even if some of its other elements cohere to a large degree. Imagine two narrations. In the first, all parent-child links except one get the maximal positive score, while the remaining one gets a score of -1. We submit that the overall score should be -1. In the second narration, all the relations take values close to -1. We share the intuition that this narration should still have a higher overall score than -1. The presence of an element whose posterior equals 0 (which is needed for \(Z\) confirmation to be -1) means that the probability of the whole scenario itself is null, which is clearly lower than whatever low posterior the other scenario might have.
Again, there are other ways to mathematically capture the intuition that the lower the minimum, the more attention is to be paid to it, but we opted for the most straightforward way of doing so.
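Purely for illustration (this is one toy way to implement such a min-sensitive weighting, not the definition used in the paper), an aggregate over link scores in [-1, 1] could weight the minimum more heavily the lower it is:

```python
def aggregate(scores):
    # Toy min-sensitive aggregate of link scores in [-1, 1] (illustrative
    # only, not the paper's measure): the weight on the minimum grows
    # linearly as the minimum drops, so a single score of -1 (an
    # inconsistent link) drags the whole aggregate down to -1.
    m = min(scores)
    mean = sum(scores) / len(scores)
    w = (1 - m) / 2  # w = 1 when m = -1, w = 0 when m = 1
    return w * m + (1 - w) * mean

aggregate([1, 1, 1, -1])             # -1.0: one inconsistency sinks the narration
aggregate([-0.9, -0.9, -0.9, -0.9])  # -0.9: bad, but still above -1
```

This reproduces the two desiderata above: one fully inconsistent link forces the overall score to -1, while uniformly poor but consistent links stay strictly above it.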
Spearman correlation is simply Pearson correlation run on ranks instead of raw values. The dependence between variables is not linear, so Pearson correlation would be misleading in this context.
The Spearman correlation test p-values are fairly low in most of the cases. Here is a table of rounded p-values for Spearman correlation tests for the Sally Clark scenarios.
References
Akiba K (2000) Shogenji’s probabilistic measure of coherence is incoherent. Analysis 60(4):356–359
Allen R (2010) No plausible alternative to a plausible story of guilt as the rule of decision in criminal cases. In: Cruz J, Laudan L (eds) Proof and standards of proof in the law. Northwestern University School of Law, pp 10–27
Anderson TJ (2007) Visualization tools and argument schemes: a question of standpoint. Law Probab Risk 6:97
Bex F, Prakken H, Reed C, Walton D (2003) Towards a formal account of reasoning about evidence: argumentation schemes and generalisations. Artif Intell Law 11:125–165
Bex F, Van Koppen P, Prakken H, Verheij B (2010) A hybrid formal theory of arguments, stories and criminal evidence. Artif Intell Law 18:123–152
Bovens L, Hartmann S (2004) Bayesian epistemology. Oxford University Press, Oxford
Collins A, Burstein M, Baker M (1988) Human plausible reasoning. Technical report, BBN Labs Inc, Cambridge, MA
Crupi V, Tentori K, Gonzalez M (2007) On Bayesian measures of evidential support: theoretical and empirical Issues. Philos Sci 74(2):229–252
Di Bello M (2013) Statistics and probability in criminal trials. PhD thesis, Stanford University
Douven I, Meijs W (2007) Measuring coherence. Synthese 156(3):405–425
Fenton N, Neil M (2011) Avoiding probabilistic reasoning fallacies in legal practice using Bayesian networks. Austl J Leg Phil 36:114
Fenton N, Neil M (2012) On limiting the use of Bayes in presenting forensic evidence
Fenton N, Neil M (2018) Risk assessment and decision analysis with Bayesian networks. Chapman and Hall, Boca Raton
Fenton N, Neil M, Hsu A (2014) Calculating and understanding the value of any type of match evidence when there are potential testing errors. Artif Intell Law 22:1–28
Fenton N, Neil M, Lagnado D (2013) A general structure for legal arguments about evidence using Bayesian networks. Cogn Sci 37(1):61–102
Fitelson B (2003) A probabilistic theory of coherence. Analysis 63(3):194–199
Fitelson B (2021) A problem for confirmation measure \(Z\)
Gittelson S, Biedermann A, Bozza S, Taroni F (2013) Modeling the forensic two-trace problem with Bayesian networks. Artif Intell Law 21:221–252
Glass DH (2002) Coherence, explanation, and Bayesian networks. In: Goos G, Hartmanis J, van Leeuwen J, O'Neill M, Sutcliffe RFE, Ryan C, Eaton M, Griffith NJL (eds) Artificial intelligence and cognitive science, vol 2464. Springer, Berlin, Heidelberg, pp 177–182
Keppens J (2012) Argument diagram extraction from evidential Bayesian networks. Artif Intell Law 20(2):109–143
Koscholke J (2016) Evaluating test cases for probabilistic measures of coherence. Erkenntnis 81(1):155–181
Lagnado DA, Fenton N, Neil M (2013) Legal idioms: a framework for evidential reasoning. Argum Comput 4(1):46–63
Meijs W, Douven I (2007) On the alleged impossibility of coherence. Synthese 157(3):347–360
Merricks T (1995) Warrant entails truth. Philos Phenom Res. 55:841–855
Neil M, Fenton N, Lagnado D, Gill RD (2019) Modelling competing legal arguments using Bayesian model comparison and averaging. Artif Intell Law 27:403–430
Olmos P (ed) (2017) Narration as argument. Springer
Olsson EJ (2001) Why coherence is not truth-conducive. Analysis 61(3):236–241
Olsson EJ (2005) The impossibility of coherence. Erkenntnis 63(3):387–412
Pennington N, Hastie R (1991) A cognitive theory of juror decision making: the story model. Cardozo Law Rev. 13:519–557
Pennington N, Hastie R (1992) Explaining the evidence: tests of the story model for juror decision making. J Pers Soc Psychol 62(2):189–204
Pennington N, Hastie R (1993a) Reasoning in explanation-based decision making. Cognition 49:123–163
Pennington N, Hastie R (1993b) The story model for juror decision making. Cambridge University Press, Cambridge
Riesen M, Serpen G (2008) Validation of a Bayesian belief network representation for posterior probability calculations on national crime victimization survey. Artif Intell Law 16:245–276
Roche W (2013) Coherence and probability: a probabilistic account of coherence. In: Araszkiewicz M, Savelka J (eds) Coherence: insights from philosophy, jurisprudence and artificial intelligence. Springer, Dordrecht, pp 59–91
Schippers M (2014) Probabilistic measures of coherence: from adequacy constraints towards pluralism. Synthese 191(16):3821–3845
Schippers M, Koscholke J (2019) A general framework for probabilistic measures of coherence. Studia Logica
Shen Q, Keppens J, Aitken C, Schafer B, Lee M (2006) A scenario-driven decision support system for serious crime investigation. Law Probab Risk 5(2):87–117
Shogenji T (1999) Is coherence truth conducive? Analysis 59(4):338–345
Shogenji T (2001) Reply to akiba on the probabilistic measure of coherence. Analysis 61(2):147–150
Shogenji T (2006) Why does coherence appear truth-conducive? Synthese 157(3):361–372
Siebel M (2004) On Fitelson’s measure of coherence. Analysis 64:189–190
Siebel M (2006) Against probabilistic measures of coherence. In: Coherence, truth and testimony. Springer, pp 43–68
Spottswood M (2013) Bridging the gap between Bayesian and story-comparison models of juridical inference. Law Probab Risk, mgt010
Thagard P (1989) Explanatory coherence. Behav Brain Sci 12(3):435–467
Timmer ST, Meyer J-JC, Prakken H, Renooij S, Verheij B (2014) Extracting legal arguments from forensic Bayesian networks. In: Legal knowledge and information systems. IOS Press, pp 71–80
Urbaniak R (2018) Narration in judiciary fact-finding: a probabilistic explication. Artif Intell Law, pp 1–32. https://doi.org/10.1007/s10506-018-9219-z
Urbaniak R, Di Bello M (2021) Legal Probabilism. In: Zalta EN (ed) The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, Fall 2021 edition
Verheij B (2007) Argumentation support software: boxes-and-arrows and beyond. Law Probab Risk 6(1–4):187–208
Verheij B (2014) To catch a thief with and without numbers: arguments, scenarios and probabilities in evidential reasoning. Law Probab Risk 13(3–4):307–325
Verheij B (2017) Proof with and without probabilities. Correct evidential reasoning with presumptive arguments, coherent hypotheses and degrees of uncertainty. Artif Intell Law, pp. 1–28
Verheij B, Bex F, Timmer ST, Meyer J, Renooij S, Prakken H et al (2016) Arguments, scenarios and probabilities: connections between three normative frameworks for evidential reasoning. Law Probab Risk 15:35–70
Vlek C (2016) When stories and numbers meet in court: constructing and explaining Bayesian networks for criminal cases with scenarios. Rijksuniversiteit Groningen
Vlek C, Prakken H, Renooij S, Verheij B (2013) Modeling crime scenarios in a Bayesian network. In: Proceedings of the fourteenth international conference on artificial intelligence and law. ACM, pp 150–159
Vlek C, Prakken H, Renooij S, Verheij B (2014) Building Bayesian networks for legal evidence with narratives: a case study evaluation. Artif Intell Law 22:375–421
Vlek C, Prakken H, Renooij S, Verheij B (2015) Representing the quality of crime scenarios in a Bayesian network. In: Rotolo A (ed) Legal knowledge and information systems. IOS Press, Amsterdam, pp 133–140
Vlek C, Prakken H, Renooij S, Verheij B (2016) A method for explaining Bayesian networks for legal evidence with scenarios. Artif Intell Law 24:285–324
Wigmore JH (2012) Principles of judicial proof. JSTOR
Funding
The work is supported by Narodowe Centrum Nauki Grant No. 2016/22/E/HS1/00304.
Cite this article
Kowalewska, A., Urbaniak, R. Measuring coherence with Bayesian networks. Artif Intell Law 31, 369–395 (2023). https://doi.org/10.1007/s10506-022-09316-9