Bridging uncertain and ambiguous knowledge with imprecise probabilities
Introduction
Nature’s complexity and stochastic behavior imply that models of environmental systems are always approximations of reality and lead to uncertain predictions. Sources of uncertainty include: (i) non-deterministic, potentially stochastic behavior of the true system – referred to as aleatory uncertainty and (ii) lack of knowledge about the true system, its mathematical representation, and specific parameter values – referred to as epistemic uncertainty (Parry, 1996, Walker et al., 2003, Refsgaard et al., 2007). In environmental modeling, epistemic uncertainty is often dominant (Ayyub and Klir, 2006). Recognizing and quantifying both types of uncertainty is important because it allows modelers to effectively allocate their resources toward model improvement and allows decision-makers to assess the degree of confidence they can have in model predictions (Warmink et al., 2010).
Probability theory has long been the well-accepted mathematical framework for describing aleatory uncertainty. However, Keynes (1921), de Finetti (1931), Ramsey (1931), Cox (1946) and others have shown by the so-called Dutch Book argument that probability theory is also appropriate for describing epistemic uncertainty: when an individual’s state of knowledge is quantified using stated preferences between lotteries (with the requirement that such preferences be consistent in the sense of avoiding sure loss), then the resulting knowledge quantifications adhere to the laws of probability. Additionally, since aleatory uncertainty becomes epistemic uncertainty once a random event has taken place but its outcome has not yet been observed, describing both kinds of uncertainty within the same mathematical framework avoids inconsistencies between mathematical formalisms. This is consistent with the viewpoint that Bayesian statistics is the logical framework for inference and prediction (de Finetti, 1974, Howson and Urbach, 1989, Seidenfeld et al., 1995, Kadane and O’Hagan, 1995, Kadane et al., 1996).
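The sure-loss requirement can be illustrated with a small Python sketch. It is an illustration added here under simplifying assumptions, not a construction taken from the cited works: an agent states prices for unit bets on an event A and on its complement and is willing to either buy or sell at those prices. The function names and the example prices are hypothetical.

```python
def agent_payoffs(price_a, price_not_a, stake=1.0):
    """Net payoff to an agent who buys one bet on A and one on not-A.

    Each bet costs price * stake and pays `stake` if it wins. Exactly one
    of the two bets wins, so the payoff is the same whichever outcome occurs.
    """
    cost = (price_a + price_not_a) * stake
    payoff_if_a = stake - cost
    payoff_if_not_a = stake - cost
    return payoff_if_a, payoff_if_not_a


def has_sure_loss(price_a, price_not_a, tol=1e-12):
    """True if the agent loses in every outcome as buyer or as seller."""
    as_buyer = agent_payoffs(price_a, price_not_a)
    # Selling both bets simply reverses the sign of every payoff.
    as_seller = tuple(-p for p in as_buyer)
    return max(as_buyer) < -tol or max(as_seller) < -tol


# Prices that sum to more than one expose the agent to a sure loss;
# prices that sum to one (a valid probability assignment) do not.
print(has_sure_loss(0.6, 0.6))
print(has_sure_loss(0.25, 0.75))
```

In this two-event setting, sure loss is avoided only when the two prices sum to one, which is the additivity law of probability for complementary events.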
Of course, in most cases, a modeler will not be entirely familiar with the current state of knowledge or opinion regarding the relevant uncertainties and so may seek outside expertise (Pollino et al., 2007, Reichert et al., 2007). The formal approach to obtaining expertise about an uncertain quantity within probability theory is referred to as probability elicitation, and a variety of pertinent approaches, guidelines, and cautions have been published (e.g., Meyer and Booker, 1991, O’Hagan et al., 2006, James et al., 2010; see also Section 2 for further references). In the case of models being used to inform public decisions, a modeler might be interested in representing intersubjective knowledge, rather than the beliefs of individual experts. Intersubjective knowledge in such a context represents the current state of knowledge of the scientific community about an environmental system, its mathematical description, or specific parameter values. Arguments in favor of a mathematical formalism, such as probability theory, to represent both aleatory and epistemic uncertainty are even further strengthened in the case of intersubjective knowledge representation because of the need to maintain consistency and transparency. Note that the importance of an intersubjective interpretation of probabilities to describe scientific reasoning has already been discussed by Gillies (1991).
As outlined in the previous paragraph, there are convincing arguments for formulating epistemic, subjective and, especially, intersubjective knowledge by probabilities. However, inaccuracies in elicitation procedures, misrepresentation of elicitation results, problems in expressing an individual’s beliefs quantitatively, differing perceptions of information by different individuals, or disagreement between experts can all lead to uncertainty about the probabilistic quantification of knowledge (O’Hagan and Oakley, 2004). This type of uncertainty has been referred to as ambiguity (Ellsberg, 1961, Frisch and Baron, 1988). In particular, it has been discussed in decision sciences where ambiguity aversion (aversion to unknown probabilities) is distinguished from risk aversion (aversion to uncertainty that can be quantified probabilistically) (Einhorn and Hogarth, 1985, Camerer and Weber, 1992). As ambiguity is a different aspect of uncertainty than probabilistically quantified uncertainty, we are interested in identifying, describing, and seeking to reduce this particular form of uncertainty regardless of how much additional uncertainty may be embedded in the elicited probabilities themselves.
One method for separating ambiguity in the choice of a probability specification from the uncertainty contained within the specification itself is to replace the standard single probability distribution with a set of distributions. This is an extension of conventional probability theory and the literature broadly refers to it as imprecise probability theory (Walley, 1991, Caselton and Luo, 1992; http://www.sipta.org). In the context of imprecise probability theory, conventional Bayesian statistics extend to what is called robust Bayesian statistics (Ríos Insua and Ruggeri, 2000, Berger, 1994). Depending on the degree of ambiguity, a set of probability distributions can contain a large variety of shapes or can simply contain those shapes in a small neighborhood of a particular distribution. Multiple approaches, or classes, have been proposed to define membership in such sets (see references in Section 2), and we believe it would be useful to have some standard metrics for describing the relative ambiguity contained in any particular set, independent of the approach taken to set specification.
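As a rough illustration of how a set of distributions separates ambiguity from ordinary probabilistic uncertainty, the following Python sketch computes the lowest and highest probability that a finite set of candidate distributions assigns to one event. The two sets of normal distributions and the function name are hypothetical, and the gap between the bounds is only one simple indicator of ambiguity, not one of the metrics proposed in this paper.

```python
from statistics import NormalDist


def probability_bounds(distributions, threshold):
    """Lower and upper probability of the event {X <= threshold}
    over a finite set of candidate distributions."""
    probs = [d.cdf(threshold) for d in distributions]
    return min(probs), max(probs)


# A set confined to a small neighborhood of one distribution (little ambiguity).
narrow = [NormalDist(mu, 1.0) for mu in (-0.1, 0.0, 0.1)]

# A set containing a large variety of locations and spreads (much ambiguity).
wide = [NormalDist(mu, sigma)
        for mu in (-1.0, 0.0, 1.0)
        for sigma in (0.5, 1.0, 2.0)]

lo_n, hi_n = probability_bounds(narrow, 0.0)
lo_w, hi_w = probability_bounds(wide, 0.0)

print(f"narrow set: [{lo_n:.3f}, {hi_n:.3f}]")
print(f"wide set:   [{lo_w:.3f}, {hi_w:.3f}]")
```

Each member of either set still carries its own uncertainty about X; the width of the interval between the lower and upper probability reflects only the ambiguity about which member applies.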
In this paper, we propose metrics to describe the relative ambiguity contained in a set of distributions defined according to imprecise probability theory, and we apply these metrics to demonstrate the wide variety of ambiguity present across different application cases. The paper is structured as follows. In Section 2, we briefly discuss probability elicitation and the relative merits of various classes of imprecise probabilities. In Section 3, we introduce some general metrics to quantify the degree of relative ambiguity in any such class. In Section 4, we implement these metrics for a particular class that we have found most useful, the Density Ratio Class. In Section 5, we demonstrate the use of our metrics using elicitation data from the literature. We present three cases of differing degree of ambiguity in order to show the wide range present in actual elicitation results. In Section 6, we discuss (1) our metrics of ambiguity relative to others, (2) the merits of using imprecise probabilities, relative to second-order probabilities and (3) some implications of using imprecise probabilities for environmental decision support. Finally, in Section 7 we draw our conclusions. In the section Software Availability, we present our elicitation software written in R that is applicable to the Density Ratio Class.
Elicitation
A common technique for eliciting a probability distribution from an expert for a continuous quantity is to employ the quantile method. According to this method, the analyst provides a number of cumulative probabilities (e.g., 0.05, 0.25, 0.5, 0.75, 0.95) and the expert then estimates the corresponding quantiles of the uncertain quantity. This procedure, first suggested by Winkler (1967), minimizes anchoring and other biases that may be inherent in the “cumulative probability method”, in which
Metrics of imprecision
There seem to be at least three important attributes of probability distributions for which we would be interested to quantify the ambiguity or imprecision: (i) the width of the distribution, (ii) the shape of the distribution within its range, and (iii) the position of the mode. We need metrics of ambiguity or imprecision about these attributes that are independent of the particular definition used to define the imprecise probability class. The focus on imprecision in specific attributes
Implementation for the Density Ratio Class
As already mentioned, a variety of imprecise probability classes have been proposed. Rinderknecht et al. (2011) discuss the relative merits of the different classes and conclude that the Density Ratio Class has clear conceptual and practical advantages. In particular, the Density Ratio Class’s invariance under Bayesian updating and marginalization (Wasserman, 1992) makes it the unique class that allows for simultaneously describing a consistent sequential Bayesian learning process and
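The invariance under Bayesian updating can be illustrated numerically. The sketch below assumes the usual definition of the Density Ratio Class: all normalized densities proportional to some function lying between an unnormalized lower density l and an unnormalized upper density u. Under that definition, multiplying both bounds by the likelihood gives the bounds of the posterior class. The specific bounds, the observation, the likelihood, and all names are hypothetical choices for illustration, and the integrals are crude Riemann sums on a grid.

```python
from statistics import NormalDist

STEP = 0.01
GRID = [-10 + STEP * i for i in range(2001)]  # grid from -10 to 10


def integrate(values, mask):
    """Riemann sum of `values` over the grid points selected by `mask`."""
    return sum(v for v, m in zip(values, mask) if m) * STEP


def probability_bounds(lower, upper, in_event):
    """Lower and upper probability of an event over the class.

    The lower bound puts the least mass on the event and the most on its
    complement; the upper bound does the opposite.
    """
    outside = [not m for m in in_event]
    low_a, up_a = integrate(lower, in_event), integrate(upper, in_event)
    low_c, up_c = integrate(lower, outside), integrate(upper, outside)
    return low_a / (low_a + up_c), up_a / (up_a + low_c)


# Hypothetical unnormalized bounds with lower <= upper everywhere.
lower = [0.5 * NormalDist(0, 1).pdf(x) for x in GRID]
upper = [2.0 * NormalDist(0, 1.5).pdf(x) for x in GRID]
event = [x > 1.0 for x in GRID]

prior_bounds = probability_bounds(lower, upper, event)

# Bayesian updating: multiply both bounds by the likelihood of a
# hypothetical observation y = 0.8 with unit-variance normal error.
likelihood = [NormalDist(x, 1.0).pdf(0.8) for x in GRID]
post_lower = [l * k for l, k in zip(lower, likelihood)]
post_upper = [u * k for u, k in zip(upper, likelihood)]
post_bounds = probability_bounds(post_lower, post_upper, event)

print("prior bounds:    ", prior_bounds)
print("posterior bounds:", post_bounds)
```

Because the updated bounds again define a class of the same form, the same bound computation applies before and after updating, which is what makes sequential learning straightforward to describe.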
Large uncertainty: date of maximum periphyton biomass in a riverine ecosystem model
Schweizer (2007) developed a deterministic model with a stochastic error for river periphyton biomass recovery after a flood. In reduced form, the deterministically modeled biomass of periphyton is the product of a Monod function, limiting terms, and a seasonality term. Here, we focus on the model parameter describing the Julian day within the year at which the potential
Discussion
We see three issues that require further discussion: (1) a comparison of our metrics of ambiguity relative to others that have been proposed in the literature, (2) the merits of using imprecise probabilities to describe ambiguity in elicitation results, relative to (precise) first and second-order probabilities or fuzzy distributions, and (3) the implications of using imprecise probabilities for environmental decision support. The third point will exemplify the bridging function we see
Conclusion
Imprecise probabilities allow us to characterize the degree of ambiguity in probability distributions elicited from subject matter experts. Because of the variety of approaches taken to specifying imprecise probabilities, some generic metrics applicable to all approaches are required. Besides overall measures of imprecision of a class of distributions, we are interested in the imprecision of specific attributes of the class. In particular, important attributes are the width, shape and position
Software availability
The example results in Section 5 were generated using our recently implemented software package for R (Ihaka and Gentleman, 1996) that is able to calculate the Density Ratio Class for given quantile intervals according to the method described in Section 4.2 (Rinderknecht et al., 2011). Possible lower and upper densities are the Gaussian, Student-t, Logistic, Gamma, Weibull, F, Beta, Uniform, Log-Normal, Log-Student-t and the Log-Logistic. Additionally, two transformations for the variable θ are
Acknowledgments
The authors would like to thank all anonymous reviewers for their useful comments and gratefully acknowledge support from the Swiss National Science Foundation. M.E.B. was partially supported by the US EPA through grant #RD-83366601 from the STAR program. This work has not been subjected to the Agency’s required peer and policy review and therefore does not necessarily reflect the views of the Agency. No official endorsement should be inferred.
References (57)
Robust Bayesian analysis: sensitivity to the prior. Journal of Statistical Planning and Inference (1990).
Arithmetic with uncertain numbers: rigorous and (often) best possible answers. Reliability Engineering & System Safety (2004).
Elicitator: an expert elicitation tool for regression in ecology. Environmental Modelling & Software (2010).
Probability is perfect, but we can’t elicit it perfectly. Reliability Engineering & System Safety (2004).
The characterization of uncertainty in Probabilistic Risk Assessments of complex systems. Reliability Engineering & System Safety (1996).
Parameterisation and evaluation of a Bayesian network for use in an ecological risk assessment. Environmental Modelling & Software (2007).
Uncertainty in the environmental modelling process – a framework and guidance. Environmental Modelling & Software (2007).
Concepts of decision support for river rehabilitation. Environmental Modelling & Software (2007).
Eliciting density ratio classes. International Journal of Approximate Reasoning (2011).
Identification and classification of uncertainties in the application of environmental models. Environmental Modelling & Software (2010).
Fuzzy sets. Information and Control.
Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems.
Uncertainty Modeling and Analysis in Engineering and the Sciences.
An overview of robust Bayesian analysis. Test.
Hydraulic habitat suitability for periphyton in rivers. Regulated Rivers-Research & Management.
Uncertainty, imprecision, and the precautionary principle in climate change assessment. Water Science and Technology.
A survival model of the effects of bottom-water hypoxia on the population density of an estuarine clam (Macoma balthica). Canadian Journal of Fisheries and Aquatic Sciences.
Recent developments in modeling preferences: uncertainty and ambiguity. Journal of Risk and Uncertainty.
Decision-making with imprecise probabilities: Dempster–Shafer theory and application. Water Resources Research.
Probability, frequency and reasonable expectation. American Journal of Physics.
Sul significato soggettivo della probabilità. Fundamenta Mathematicae.
Theory of probability.
Bayesian inference using intervals of measures. The Annals of Statistics.
Unifying practical uncertainty representations: I. generalized p-boxes – II. clouds. International Journal of Approximate Reasoning.
Assessment and propagation of model uncertainty. Journal of the Royal Statistical Society. Series B (Methodological).
Ambiguity and uncertainty in probabilistic inference. Psychological Review.
Risk, ambiguity, and the Savage axioms. The Quarterly Journal of Economics.
Ambiguity and rationality. Journal of Behavioral Decision Making.