
A non-probabilist principle of higher-order reasoning


Abstract

The author uses a series of examples to illustrate two versions of a new, nonprobabilist principle of epistemic rationality, the special and general versions of the metacognitive, expected relative frequency (MERF) principle. These are used to explain the rationality of revisions to an agent’s degrees of confidence in propositions based on evidence of the reliability or unreliability of the cognitive processes responsible for them—especially reductions in confidence assignments to propositions antecedently regarded as certain—including certainty-reductions to instances of the law of excluded middle or the law of noncontradiction in logic or certainty-reductions to the certainties of probabilist epistemology. The author proposes special and general versions of the MERF principle and uses them to explain the examples, including the reasoning that would lead to thoroughgoing fallibilism—that is, to a state of being certain of nothing (not even the MERF principle itself). The author responds to the main defenses of probabilism: Dutch Book arguments, Joyce’s potential accuracy defense, and the potential calibration defenses of Shimony and van Fraassen by showing that, even though they do not satisfy the probability axioms, degrees of belief that satisfy the MERF principle minimize expected inaccuracy in Joyce’s sense; they can be externally calibrated in Shimony and van Fraassen’s sense; and they can serve as a basis for rational betting, unlike probabilist degrees of belief, which, in many cases, human beings have no rational way of ascertaining. The author also uses the MERF principle to subsume the various epistemic akrasia principles in the literature. Finally, the author responds to Titelbaum’s argument that epistemic akrasia principles require that we be certain of some epistemological beliefs, if we are rational.


Notes

  1. Jeffrey conditionalization is the rule: Rational changes in degrees of confidence from \(\mathrm{prob}_{1}\) (at time \(t_{1}\)) to \(\mathrm{prob}_{2}\) (at a later time \(t_{2}\)) require that, where one’s total change in evidence between \(t_{1}\) and \(t_{2}\) is given by a statement of the form \(\mathrm{prob}_{2}(E) = x\): For any proposition P, \(\mathrm{prob}_{2}(P) = [\mathrm{prob}_{1}(P/E) \times \mathrm{prob}_{2}(E)] + [\mathrm{prob}_{1}(P/{-}E) \times \mathrm{prob}_{2}({-}E)]\) (cf., e.g., Howson and Urbach 1989, p. 286).
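A minimal numerical sketch of this update rule in Python; the function name and the input values are hypothetical illustrations, not anything drawn from the paper.

```python
# A minimal sketch of Jeffrey conditionalization; names and numbers are
# hypothetical illustrations only.
def jeffrey_update(prob1_P_given_E, prob1_P_given_notE, prob2_E):
    # prob2(P) = prob1(P/E) * prob2(E) + prob1(P/-E) * prob2(-E)
    return prob1_P_given_E * prob2_E + prob1_P_given_notE * (1.0 - prob2_E)

# Example: the total change in evidence between t1 and t2 is prob2(E) = 0.8.
print(jeffrey_update(0.9, 0.2, 0.8))  # 0.9*0.8 + 0.2*0.2 = 0.76
```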

  2. The technical explanation of this result is that, for Jeffrey, rational degrees of confidence are sets of confidence assignments, each element of which satisfies the standard probability axioms (1983, pp. 143, 145–146). That Jeffrey is committed to probabilistic closure is undeniable, because he uses it in one of his proofs (1983, p. 152). The problem that Jeffrey’s proposal was meant to address, the problem of old evidence (e.g., Glymour 1980), is different from the current example, because the problem of old evidence does not involve a violation of probabilistic closure. (At all times in the process by which a theory comes to acquire confirmation by old evidence, the old evidence is more probable than the theory.)

  3. I emphasize that the certainty-reducing reasoning in this example (and in all the others that I discuss) is empirical reasoning based solely on empirical evidence, to forestall the potential objection that Bayesianism does not apply to a priori reasoning.

  4. See the Appendix for a proof.

  5. Though, as I explain in the Appendix, the reliability of metacognitive processes can be evaluated in the same way.

  6. There is also a ground-level expected relative frequency (GERF) principle, but I ignore it here. It is clearly beyond the scope of this paper to consider the principles that govern ground-level cognitive processing.

  7. Thus, there is no analog of reliabilism’s generality problem here (e.g., Feldman 1985), because the MCS’s determinations of the relevant expected relative frequencies of truth are always relative to the information available to the MCS, which is assumed to be finite.

  8. The special MERF Principle is a refinement of an earlier proposal (Talbott 1990, Text 110–112). I should mention here that, to avoid unnecessary complications, the version of the MERF principle stated in the text assumes that all of the MCS’s judgments of relative frequency of truth are well-behaved, in the following sense: When an MCS evaluates the reliability of the agent’s assignment of confidence of c to proposition P, if the MCS believes there to be reliability-relevant categorizations of cognitive processes \(CP_{1}\) and \(CP_{2}\) that both include the cognitive processes responsible for the agent’s assignment of confidence of c to P, neither of which is a proper sub-class of the other; and if the MCS believes that \(ERF(T_{r}/\mathrm{conf}=c, CP_{1}) = d\) and that \(ERF(T_{r}/\mathrm{conf}=c, CP_{2}) = e\) and it is not the case that \(d \approx e\), then the MCS also has a belief about the value of \(ERF(T_{r}/\mathrm{conf}=c, CP_{1}\ \&\ CP_{2})\). This requirement can be illustrated by a simple example. Suppose that an agent assigns confidence of c to a proposition P (the result of an arithmetical calculation) on the basis of having performed two different algorithms (\(CP_{1}\) and \(CP_{2}\)), which both agreed on the solution. If the agent’s MCS has an opinion about the expected relative frequency of truth when each of these algorithms is applied individually—that is, an opinion about \(ERF(T_{r}/\mathrm{conf}=c, CP_{1})\) and \(ERF(T_{r}/\mathrm{conf}=c, CP_{2})\)—then the MCS also has an opinion about the expected relative frequency of truth when the combination of algorithms (\(CP_{1}\ \&\ CP_{2}\)) is applied (i.e., \(ERF(T_{r}/\mathrm{conf}=c, CP_{1}\ \&\ CP_{2})\)). I discuss this assumption of well-behavedness more fully below.
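As a rough illustration of this well-behavedness requirement (a sketch under my own assumptions, not the author’s formalism), the check can be pictured as follows, with the MCS’s ERF opinions stored in a dictionary keyed by reliability-relevant categorizations; all names, values, and the tolerance are hypothetical.

```python
# A rough sketch of the well-behavedness requirement of footnote 8; the keys,
# ERF values, and tolerance are hypothetical illustrations.
def well_behaved(erf, cp1, cp2, tol=0.01):
    """If the MCS assigns ERF(T_r/conf=c, CP1) = d and ERF(T_r/conf=c, CP2) = e
    with d and e not approximately equal, it must also have an opinion about
    ERF(T_r/conf=c, CP1 & CP2)."""
    d, e = erf.get(cp1), erf.get(cp2)
    if d is None or e is None or abs(d - e) <= tol:
        return True                      # no conflicting opinions, nothing more is demanded
    return frozenset([cp1, cp2]) in erf  # the conjunction must also be rated

# Two arithmetic algorithms that agree on the solution at confidence c:
erf = {"algorithm_1": 0.95, "algorithm_2": 0.90,
       frozenset(["algorithm_1", "algorithm_2"]): 0.99}
print(well_behaved(erf, "algorithm_1", "algorithm_2"))  # True
```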

  9. For example, a reliability-relevant categorization of David’s belief about the amount to be tipped is that it is the result of a manual calculation of a simple arithmetical product. A categorization that is not relevant to a determination of its reliability would be that it is a correct manual calculation, even though David’s GCS may believe that it is a correct manual calculation. For more on the distinction between those kinds that are reliability-relevant and those that are not, see Talbott (1990) (Preface 21–30 and Text 81–96).

  10. An interesting question, which I can only mention here, is this: Is the logic of my epistemological theory classical? Because the concept of truth plays a crucial role in my epistemological theory and because I am persuaded that an adequate theory of truth requires a nonclassical logic, I believe that my epistemological theory requires a nonclassical logic. But classical logic is true enough for the purposes of this paper.

  11. Here I only insist that it could be rational to be certain of nothing. For more considerations that could support this conclusion, see Christensen (2007). To argue that it is rational to be certain of nothing, I would have to consider more examples—for example, the Cartesian examples: I am thinking or I exist.

  12. The Dutch Book arguments for probabilism are synchronic. Diachronic Dutch Book arguments have been given for both simple conditionalization and Jeffrey conditionalization. See Skyrms (1990, chap. 5).

  13. Joyce (1998) allows for a family of functions as measures of inaccuracy, the most familiar of which is the Brier score. I explain the Brier score and illustrate its use as a measure of inaccuracy in the Appendix.

  14. For a proof, see the Appendix.

  15. Van Fraassen acknowledges the possibility of examples of this kind and then introduces a requirement that rules them out, because they are “irrational” (1983, p. 303). Why are they irrational? Van Fraassen’s answer depends on their logically implying, by the laws of classical logic, an extension that is not well-calibrated; so the truth of the laws of classical logic is an assumption of his argument. Shimony rules out such “possibilities” from the outset, by simply assuming that the laws of classical logic determine what is possible (1988, p. 81).

  16. Though Christensen (2010b) does not fully endorse his principle, because of puzzles he discusses.

  17. I set aside here the question of whether we have any beliefs that are purely a priori. I have my doubts about this, but I will try to show that even if there are purely a priori beliefs of the kind that Titelbaum employs in his account, his argument fails.

  18. This requirement places some limits on the content of \(\hbox {P}_{2}\). For example, I doubt that, even in theory, there could be a drug that would make it possible to coherently believe that our current evidence rationally requires belief in any alternative to the belief “I exist.” I think we have enough evidence of powerful cognitive illusions in human beings to be quite confident that there are some values of \(\hbox {P}_{2}\) for which, in theory, there could be drugs with the effects I describe in the text. If this is correct, then perhaps some, but not all, rational a priori situational epistemic judgments are infallible. I can allow for this possibility and still be a thoroughgoing fallibilist even about rational a priori situational epistemic beliefs. Even if some sub-class of those beliefs is infallible, it would not be rational for us to be certain of any one of them, if, as I believe, there is no infallible way of drawing the line between those that are infallible and those that are not.

  19. It might seem that Titelbaum could save his rational infallibility claim by holding that, after the debriefing, rationality does not permit Mike to believe anything about the situation in which an agent has relevant background beliefs BB2 and total evidence E2; not even that very statement. I think this is a mistaken conclusion to draw about the example described in the text, as I explain in the next paragraph. But even if rationality did not permit Mike to believe anything about his situation after the debriefing, the example would still be an example of empirical defeat of an a priori situational epistemic belief. Titelbaum does consider the possibility that his opponent might be stuck in a kind of rational dilemma (290), though he himself does not endorse this position as rational. I discuss what Titelbaum takes to be the rational dilemma shortly.

  20. The MERF principle explanation has other explanatory advantages over Titelbaum’s account, but it is not necessary to address them here.

  21. The “almost all” qualification is meant to leave it open that there might be some very narrow kinds of reasoning that are not subject to defeat—for example, Descartes’ inferences from propositions about what he is thinking to the propositions that he exists. I think that even this kind of reasoning may be subject to rational defeat, but it is beyond the scope of this paper to discuss such issues. My claims here are limited to inferential principles for reasoning about testimony and perception, and for logical reasoning. Also, for ease of exposition, I follow Titelbaum in considering principles of reasoning that apply to beliefs, even though I believe that the most general principles of reasoning apply to degrees of belief.

  22. We have very little understanding of the standards of coherence. My earlier discussion of fallibility about the laws of logic implies that not even the laws of logic are standards of rational coherence. The kind of coherence involved is explanatory coherence, but no one has a very good understanding of what that is.

  23. I should mention that there are other alternatives that could be more rational than to accept the conclusion of logical reasoning: to give up the belief that the premises deductively imply the conclusion, either because one made a mistake about which rules are deductively valid or because one made a mistake in thinking that the inference in question was an instance of those rules.

  24. I should add that it is possible to believe that all reasoning is a kind of coherence reasoning without being a coherence theorist of rationality, because one can allow for rational input, so long as the input beliefs themselves are thought of as rationally defeasible.

  25. My account relies on there being some sort of conceptual connection between the reliability-relevant property of my being something that I am rationally required not to believe and the property of having a low expected relative frequency of truth. If we translate Pollock’s use of being reliable into my terms as having a reliability-relevant classification with a high estimated relative frequency of truth and equate Pollock’s use of being a justified belief with being an (epistemically) rational belief, then Pollock (1984) offers an insightful discussion of this conceptual connection. See also Talbott (1990). Of course, when I assert that something is a conceptual truth, I do not mean to imply that it is rational to be certain of it. Given the history of mistakes in what were thought to be conceptual truths, the MERF principle can easily explain why it is not rational for us to be certain of propositions that we take to be conceptual truths. It is just such considerations that led BonJour to become a fallibilist about all a priori justification (BonJour 1998, chap. 4).

  26. Thanks to an anonymous referee for making a version of this argument.

  27. For a more thorough and insightful discussion of the role of models in epistemology, see Titelbaum (2013, chaps. 2 and 3).

  28. In data from Southern California, the life expectancy of left-handed men was 11 years less than the life expectancy of right-handed men (Coren and Halpern 1991).

  29. For a discussion of why it is a problem for all accounts of probability, see Hajek (2007).

  30. The general MERF Principle is a refinement of an earlier proposal (Talbott 1990, Preface 31–35). As before, I assume that the agent’s confidence assignments to the relevant expected relative frequencies are well-behaved, in the sense explained in footnote 8, when suitably generalized by replacing references to beliefs about the relevant expected relative frequencies with references to confidence assignments to the relevant expected relative frequencies.

  31. Sobel (1987) uses Dutch Book arguments to defend this kind of immodesty as a virtue of probabilist theories. I discussed Dutch Book arguments above. For another kind of Bayesian immodesty, see Belot (2013).

  32. The only novel part of the proof is to interpret the expected relative frequencies of truth as probabilities by reference to which expected inaccuracy can be defined. The rest of the proof simply follows de Finetti’s [1940] proof that the Brier score is a proper scoring rule.

References

  • Belot, G. (2013). Bayesian orgulity. Philosophy of Science, 80, 483–503.

  • BonJour, L. (1998). In defense of pure reason. Cambridge: Cambridge University Press.

  • Cartwright, N. (1983). How the laws of physics lie. Oxford: Clarendon Press.

  • Christensen, D. (2007). Does Murphy’s law apply in epistemology? Self-doubt and rational ideals. Oxford Studies in Epistemology, 2, 3–31.

  • Christensen, D. (2010a). Higher-order evidence. Philosophy and Phenomenological Research, 81, 185–215.

  • Christensen, D. (2010b). Rational reflection. Philosophical Perspectives, 24, 121–140.

  • Christensen, D. (2011). Disagreement, question-begging, and epistemic self-criticism. Philosophers’ Imprint, 11, 1–21.

  • Coren, S., & Halpern, D. F. (1991). Left-handedness: A marker for decreased survival fitness. Psychological Bulletin, 109, 90–106.

  • Crick, F. (1988). What mad pursuit. New York: Basic Books.

  • de Finetti, B. [1937] (1980). La Prévision: Ses Lois Logiques, Ses Sources Subjectives. (Annales de l’Institut Henri Poincaré, 7, 1–68). Translated into English and reprinted in Kyburg and Smokler, Studies in Subjective Probability. Huntington: Krieger.

  • de Finetti, B. [1940] (2008). Decisions and proper scoring rules. In A. Mura (Ed.), Philosophical lectures on probability (H. Hosni, Trans.) (pp. 15–26). Dordrecht: Springer.

  • Feldman, R. (1985). Reliability and justification. Monist, 68, 159–174.

  • Feldman, R. (2005). Respecting the evidence. Philosophical Perspectives, 19, 95–119.

  • Field, H. (1996). The A prioricity of logic. Proceedings of the Aristotelian Society, 96, 359–379.

  • Field, H. (2008). Saving truth from paradox. Oxford: Oxford University Press.

  • Garber, D. (1983). Old evidence and logical omniscience in Bayesian confirmation theory. In J. Earman (Ed.), Testing scientific theories, Minnesota studies in the philosophy of science (Vol. 10, pp. 99–131). Minneapolis: University of Minnesota Press.

  • Glymour, C. (1980). Theory and evidence. Princeton: Princeton University Press.

  • Hajek, A. (2007). The reference class problem is your problem too. Synthese, 156, 563–585.

  • Hoefer, C. (2007). The third way on probability: A Sceptic’s guide to chance. Mind, 116, 549–596.

  • Horowitz, S. (2014). Epistemic akrasia. Noûs, 48, 718–744. doi:10.1111/nous.12026.

  • Howson, C., & Urbach, P. (1989). Scientific reasoning: The Bayesian approach. Chicago: Open Court.

  • Jeffrey, R. (1983). Bayesianism with a human face. In J. Earman (Ed.), Testing scientific theories, Minnesota studies in the philosophy of science (Vol. 10, pp. 133–156). Minneapolis: University of Minnesota Press.

  • Jeffrey, R. (1986). Probabilism and induction. Topoi, 5, 51–58.

  • Jeffrey, R. (1992). Probability and the art of judgment. Cambridge: Cambridge University Press.

  • Joyce, J. M. (1998). A nonpragmatic vindication of probabilism. Philosophy of Science, 65, 575–603.

  • Kelly, T. (2010). Peer disagreement and higher-order evidence. In R. Feldman & T. A. Warfield (Eds.), Disagreement (pp. 111–174). Oxford: Oxford University Press.

  • Keynes, J. M. (1952). A treatise on probability. London: Macmillan.

  • Kripke, S. (1975). Outline of a theory of truth. Journal of Philosophy, 72, 690–716.

  • Kyburg, H. E, Jr. (2003). Are there degrees of belief? Journal of Applied Logic, 1, 139–149.

  • Laudan, L. (1981). A confutation of convergent realism. Philosophy of Science, 48, 19–48.

  • Levi, I. (1980). The enterprise of knowledge. Cambridge: MIT Press.

  • Levi, I. (1991). The fixation of belief and its undoing. Cambridge: Cambridge University Press.

  • Lewis, D. (1980). A subjectivist’s guide to chance. In R. C. Jeffrey (Ed.), Studies in inductive logic and probability (Vol. 2, pp. 263–293). Berkeley: University of California Press.

  • Millgram, E. (2009). Hard truths. Chichester: Wiley-Blackwell.

  • Paris, J. B. (2001). A note on Dutch book method. Proceedings of the Second International Symposium on Imprecise Probabilities and Their Applications (pp. 301–306). Ithaca: Shaker.

  • Pollock, J. L. (1984). Reliability and justified belief. Canadian Journal of Philosophy, 14, 103–114.

  • Priest, G. (2002). Paraconsistent logic. In D. Gabbay & F. Guenthner (Eds.), Handbook of philosophical logic (2nd ed., Vol. 6, pp. 287–393). Dordrecht: Kluwer Academic Publishers.

  • Quine, W. V. O. (1961). Two dogmas of empiricism. In W. V. O. Quine (Ed.), From a logical point of view (pp. 20–46). New York: Harper & Row.

  • Seidenfeld, T., Schervish, M. J., & Kadane, J. B. (2012). What kind of uncertainty is that? Using personal probability for expressing one’s thinking about logical and mathematical propositions. Journal of Philosophy, 109, 516–533.

  • Shimony, A. (1988). An Adamite derivation of the principles of the calculus of probability. In J. H. Fetzer (Ed.), Probability and causality (pp. 79–89). Dordrecht: Kluwer.

  • Skyrms, B. (1980). Higher order degrees of belief. In D. H. Mellor (Ed.), Prospects for pragmatism (pp. 109–137). Cambridge: Cambridge University Press.

  • Skyrms, B. (1990). The dynamics of rational deliberation. Cambridge: Harvard University Press.

  • Sobel, J. H. (1987). Self-doubts and Dutch strategies. Australasian Journal of Philosophy, 65, 56–81.

  • Talbott, W. J. [1990] (2015). The reliability of the cognitive mechanism: A mechanist account of empirical justification. New York: Routledge.

  • Tarski, A. (1944). The semantic conception of truth. Philosophy and Phenomenological Research, 4, 341–376.

  • Titelbaum, M. (2013). Quitting certainties: A Bayesian framework modeling degrees of belief. Oxford: Oxford University Press.

  • Titelbaum, M. (2015). Rationality’s fixed point (or: In defense of right reason). In T. S. Gendler & J. Hawthorne (Eds.), Oxford studies in epistemology (Vol. 5, pp. 253–294). Oxford: Oxford University Press.

  • Van Fraassen, B. C. (1983). Calibration: A frequency justification for personal probability. In R. Cohen & L. Laudan (Eds.), Physics, philosophy, and psychoanalysis (pp. 295–319). Dordrecht: D. Reidel.

  • Williams, J. R. G. (2012). Gradational accuracy and non-classical semantics. Review of Symbolic Logic, 5, 513–537.

Acknowledgments

This paper has benefited from discussions with many people. I received helpful comments on earlier versions from Arthur Fine, Tyler Hildebrand, John Manchak, and Conor Mayo-Wilson, as well as from several anonymous referees. I am grateful to have had the opportunity to work on this paper while in residence as a fellow at the Helen Riaboff Whiteley Center.

Author information

Correspondence to William J. Talbott.

Appendix

1.1 The general MERF principle

To state the general version of the MERF principle I need to introduce two complications to the special MERF principle in the text. The special version of the principle presupposed that, at the meta-level, the agent’s MCS has beliefs about the relevant factors. This simplified the exposition. The general version of the MERF principle replaces meta-level beliefs with meta-level confidence assignments.

Also, I may have given the impression that there is a division in the cognitive self—that there are two different selves involved in two different levels of cognitive processing. This is only a useful fiction. There is only one self. That single, unified self can evaluate not only the reliability of its ground-level processes; it can evaluate the reliability of its meta-level processes, though of course, it can’t evaluate them all at once. The difference in levels is just a heuristic to remind us that whenever the self evaluates its cognitive processes, it steps back and brackets their outputs, rather than just reasserting them.

Here then is the general principle:

  • (General metacognitive expected relative frequency (MERF) Principle) An agent’s confidence assignment of c to a proposition P is in disequilibrium if: the agent assigns higher-order confidence to propositions of the following form: there exists a reliability-relevant category of cognitive processes \(CP_{1}\) that includes the cognitive processes responsible for the agent’s confidence assignment of c to P such that \(ERF(T_{r}/\mathrm{conf}=c, CP_{1}) = x\); and the weighted average of these higher-order confidence assignments (the sum of her confidence in each of them weighted by x) is equal to d; and it is not the case that \(d \approx c\); UNLESS [Narrower Reference Class Exception] the agent assigns higher-order confidence to propositions asserting that there is a reliability-relevant categorization \(CP_{2}\) of the causal processes responsible for her confidence assignment to P, \(CP_{2} \le CP_{1}\), such that \(ERF(T_{r}/\mathrm{conf}=c, CP_{2}) = y\), and the weighted average of these higher-order confidence assignments (the sum of her confidence in each of them weighted by y) is approximately equal to c. Footnote 30

This general version of the MERF principle makes it possible to explain how it could be rational for an agent, Van, to be certain about nothing, for it makes it possible for Van’s MCS to reduce all of his confidence assignments of 1.0, both ground-level and meta-level, to a value between 0 and 1.0 (and, similarly, to increase all of his confidence assignments of 0 to a value between 0 and 1.0). Here is a simple example meant only to illustrate the main idea: Let \(\Phi\) be the set of all propositions to which Van assigns confidence of 1.0. I explain how the general MERF principle could lead him to adopt a confidence assignment in which every member of \(\Phi\) is assigned confidence of .998, with the result that no proposition is assigned confidence of 1.0 (or 0). For simplicity, I assume that there is no relevant narrower subclass of \(\Phi\) for which Van’s MCS projects a different expected relative frequency of truth than for \(\Phi\). In the cases of interest, Van will not assign confidence of 1.0 to the proposition that the expected relative frequency of truth of the members of \(\Phi\) is .998. Van will divide his confidence among various alternative values for that expected relative frequency of truth. For example: Van assigns confidence of .5 to each of two possibilities: that the relevant relative frequency of truth is .999 and that the relevant relative frequency of truth is .997. His confidence assignment of 1.0 to a proposition P (in \(\Phi\)) will be in disequilibrium when, as in this case, the weighted average of his various estimates of the expected relative frequency of truth in the members of \(\Phi\) (in this case, .998) is not equal to his confidence in the individual members of \(\Phi\) (in this case, 1.0). In the simplest case, the general MERF principle will require him to reduce his confidence assignment to each of the members of \(\Phi\) to .998.
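To make the disequilibrium test concrete, here is a minimal Python sketch of the weighted-average calculation applied to the Van example just described; the function name and the tolerance are my own hypothetical choices, not part of the principle.

```python
# A minimal sketch of the general MERF disequilibrium test, using the Van
# example; the function name and tolerance are hypothetical illustrations.
def merf_disequilibrium(c, higher_order_conf, tol=1e-9):
    """higher_order_conf maps each candidate ERF value x to the agent's
    confidence that ERF(T_r/conf=c, CP1) = x.  The assignment of c to P is in
    disequilibrium when the confidence-weighted average d of the x's is not
    approximately equal to c."""
    d = sum(conf * x for x, conf in higher_order_conf.items())
    return abs(d - c) > tol, d

# Van: confidence .5 that the relevant ERF is .999 and .5 that it is .997.
print(merf_disequilibrium(1.0, {0.999: 0.5, 0.997: 0.5}))    # (True, ~0.998)
print(merf_disequilibrium(0.998, {0.999: 0.5, 0.997: 0.5}))  # (False, ~0.998)
```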

1.2 A proof that probabilist epistemologies are immodest

To show that all probabilist epistemologies imply that we are rationally required to assign confidence of 1.0 to at least some substantive epistemological claims, I carry out the argument for strict classical probabilism. It is easily modified to apply to non-strict classical probabilism, and even to non-classical probabilism, simply by making suitable substitutions for the variable ‘L,’ because all of these views require certainty in some propositions, and those requirements imply that some substantive epistemological claims must be certain. Here is the argument for strict, classical probabilism:

  • Let RC(P) = x be the relation: Rationality requires assigning confidence of x to proposition P. The central principle of any form of probabilism is:

    (1) [prob(P) = x] iff [RC(P) = x].

    Consider L, a truth of classical logic. Strict classical probabilism requires:

    (2) prob(L) = 1.

    From (1):

    (3) [prob(L) = 1] iff [RC(L) = 1].

    From (2) and (3):

    (4) RC(L) = 1. [Rationality requires assigning confidence of 1.0 to L.]

    What is the probability of (4)? To answer this question, we need to determine the probability of (2)—that is, the probability of a probability, which is a higher-order probability. In a formalism rich enough to coherently model higher-order probabilities (e.g., Skyrms 1980), from (2) it follows that:

    (5) prob[prob(L) = 1] = 1.

    We also need one more probability theorem:

    (6) [P iff Q] \(\rightarrow\) [prob(P) = prob(Q)].

    Then from (3), (5), and (6):

    (7) prob[RC(L) = 1] = 1.

    And from (1) and (7):

    (8) RC([RC(L) = 1]) = 1.

So strict classical probabilism requires an assignment of confidence of 1.0 to its own non-trivial epistemological claim [RC(L) = 1]—that is, it requires a confidence assignment of 1.0 to the claim that we are rationally required to assign confidence of 1.0 to L. We can continue the construction to generate a potentially infinite list of non-trivial claims of strict classical probabilist epistemology to which strict classical probabilism requires us to assign confidence of 1.0. This shows that strict classical probabilism is an immodest epistemology.Footnote 31 Parallel arguments show that any non-strict classical probabilist epistemology, such as Garber (1983), and even any non-classical probabilist epistemology must be immodest. So all probabilist theories, which include all Bayesian theories, are immodest. They all require rational degrees of confidence of 1.0 in at least some of their own substantive (i.e., non-trivial) epistemological claims.
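For concreteness, the potentially infinite list generated by continuing the construction has the following pattern (a schematic restatement of steps (4) and (8), not an additional premise):

```latex
\[
RC(L)=1, \qquad RC\bigl([RC(L)=1]\bigr)=1, \qquad
RC\Bigl(\bigl[RC\bigl([RC(L)=1]\bigr)=1\bigr]\Bigr)=1, \qquad \ldots
\]
```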

1.3 A proof that departures from MERF-defined equilibria increase expected inaccuracy

In the text, I assert that when an agent S’s confidence assignment satisfies the special or general MERF Principle and S’s MCS has opinions about the relevant relative frequencies of truth, changes to S’s confidence assignment increase its expected inaccuracy (and thus decrease its expected accuracy). Here I prove this result for the most commonly used measure of inaccuracy, the Brier score, according to which the inaccuracy of a confidence assignment of x to P equals the square of its distance from the truth value of P (i.e., 1.0 if P is true; 0 if P is not true).

The key step in the proof is to define the probabilities to be used in the definition of expected inaccuracy. Joyce correctly points out that expected inaccuracy cannot be consistently defined using non-probabilist confidence assignments in the role of probabilities (1998, pp. 589–590). I do not do so. The probabilities I use are the probabilities defined by the MCS’s expected relative frequency of truth for the narrowest relevant reference class that includes the proposition of interest. The MCS must have opinions about those relative frequencies for these probabilities to exist. So there can be no determination of expected inaccuracy without them.

I use prob to refer to the MCS’s relevant estimates of the expected relative frequencies of truth. Then the expected inaccuracy (EI) of an agent S’s confidence assignment of z to proposition P can be defined as the weighted sum of its inaccuracy if P is true, \((1-z)^{2}\), weighted by the probability that P is true, prob(P), and its inaccuracy if P is not true, \(z^{2}\), weighted by the probability that P is not true, \(\mathrm{prob}({-}P) = 1-\mathrm{prob}(P)\).Footnote 32
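Written as a single display (a restatement of the definition just given, with prob the MCS’s expected relative frequency of truth for the narrowest relevant reference class):

```latex
\[
EI\bigl(\mathrm{conf}(P)=z\bigr) \;=\; \mathrm{prob}(P)\,(1-z)^{2} \;+\; \bigl(1-\mathrm{prob}(P)\bigr)\,z^{2}
\]
```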

In the case in which S’s confidence assignment of x to P is in equilibrium, prob(P) = conf(P) = x. Therefore, the expected inaccuracy of the confidence assignment of x to P is:

    (1) EI(conf(P) = x) = \(x(1-x)^{2} + (1-x)x^{2} = x - x^{2}\)

I compare (1) with what the expected inaccuracy of S’s confidence assignment to P would be if S’s confidence in P were increased from x to [x + y] (where \(y > 0\) and \(0 \le [x+y] \le 1\)). (The proof of the case in which S’s confidence assignment to P is decreased is exactly parallel.) Intuitively, increasing S’s confidence assignment to P will decrease the inaccuracy of the assignment if P is true and increase the inaccuracy of the assignment if P is not true. The expected inaccuracy of S’s confidence assignment of [x + y] to P is again a weighted sum of two components: the inaccuracy of the assignment of [x + y] to P if P is true, \((1-[x+y])^{2}\), weighted by the probability that P is true (in this case, x), and the inaccuracy of the assignment of [x + y] to P if P is not true, \(([x+y])^{2}\), weighted by the probability that P is not true \((1-x)\). Thus:

    (2) EI(conf(P) = [x + y]) = \(x(1-[x+y])^{2} + (1-x)[x+y]^{2} = x - x^{2} + y^{2}\)

The expected inaccuracy of the confidence assignment of \([x+y]\) to P is greater than the expected inaccuracy of the confidence assignment of x to P by the amount \(y^{2}\). So the confidence assignment of x to P minimizes expected inaccuracy, and the farther S’s confidence in P departs from x (i.e., the greater \(|y|\) is), the greater its expected inaccuracy.
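A quick numeric check of this result (my own sketch, not part of the paper’s proof): with the Brier-based definition above, shifting the confidence assignment away from the ERF-defined probability p by an amount y raises expected inaccuracy by exactly y^2, so the minimum is at z = p.

```python
# A numeric check (not from the paper) that Brier-based expected inaccuracy is
# minimized at z = p and that a shift of y away from p adds exactly y**2.
def expected_inaccuracy(p, z):
    return p * (1 - z) ** 2 + (1 - p) * z ** 2

p = 0.7                                    # hypothetical ERF-defined probability
base = expected_inaccuracy(p, p)           # equilibrium assignment: conf = p
for y in (0.1, 0.2, -0.15):
    print(round(expected_inaccuracy(p, p + y) - base, 12))  # 0.01, 0.04, 0.0225
```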

Cite this article

Talbott, W.J. A non-probabilist principle of higher-order reasoning. Synthese 193, 3099–3145 (2016). https://doi.org/10.1007/s11229-015-0922-y
