Bridging uncertain and ambiguous knowledge with imprecise probabilities

doi:10.1016/j.envsoft.2011.07.022

Environmental Modelling & Software

Volume 36, October 2012, Pages 122-130

https://doi.org/10.1016/j.envsoft.2011.07.022

Abstract

Model-based environmental decision support requires that uncertainty be rigorously evaluated. Whether uncertainty is aleatory or epistemic, we argue that probability is the natural mathematical construct for describing uncertainty in predictions used for decision-making. If expert knowledge is elicited using stated preferences between lotteries, and the experts are rational in the sense of avoiding sure loss, then the resulting knowledge quantifications will be consistent with the axiomatic foundation of probability theory. This idea can be extended to the description of intersubjective knowledge when the intent is to characterize the state of knowledge of the scientific community. Many methods for probability elicitation have been reported, but there is nearly always some degree of ambiguity in translating elicited quantities into a probabilistic description. Sources of this ambiguity include any lack of fit of a particular distributional form to elicited data, incertitude in the elicited data themselves, and disagreement in the elicited data across multiple experts. By replacing a precise probability distribution with a set of distributions, the mathematical concept of imprecise probabilities provides a means for representing this ambiguity. In this way, imprecise probabilities can form a bridge between total ignorance and precisely characterized risk by allowing for a continuous degree of imprecision to represent ambiguity. We introduce three metrics to describe the relative ambiguity of important attributes of probability distributions, namely their width, shape, and mode. These metrics are applicable to sets of distributions characterized by using any available method, and we derive the specific forms of these metrics for the Density Ratio Class, which we have found to have many desirable properties. Based on these metrics and on elicitation data from the literature, we use three examples to demonstrate the wide variety of ambiguity that can be present in elicited knowledge. Imprecise probabilities allow us to quantify this ambiguity and consider it in environmental decision-making. Our examples were implemented using a package we recently developed and made freely available for the R statistical programming environment.

Introduction

Nature’s complexity and stochastic behavior imply that models of environmental systems are always approximations of reality and lead to uncertain predictions. Sources of uncertainty include: (i) non-deterministic, potentially stochastic behavior of the true system – referred to as aleatory uncertainty and (ii) lack of knowledge about the true system, its mathematical representation, and specific parameter values – referred to as epistemic uncertainty (Parry, 1996, Walker et al., 2003, Refsgaard et al., 2007). In environmental modeling, epistemic uncertainty is often dominant (Ayyub and Klir, 2006). Recognizing and quantifying both types of uncertainty is important because it allows modelers to effectively allocate their resources toward model improvement and allows decision-makers to assess the degree of confidence they can have in model predictions (Warmink et al., 2010).

Probability theory has long been the well-accepted mathematical framework for describing aleatory uncertainty. However, Keynes (1921), de Finetti (1931), Ramsey (1931), Cox (1946) and others have shown by the so-called Dutch Book argument that probability theory is also appropriate for describing epistemic uncertainty: when an individual’s state of knowledge is quantified using stated preferences between lotteries (with the requirement that such preferences be consistent in the sense of avoiding sure loss), then the resulting knowledge quantifications adhere to the laws of probability. Additionally, since aleatory uncertainty becomes epistemic uncertainty once a random event has taken place but its outcome has not yet been observed, describing both kinds of uncertainty within the same mathematical framework avoids problems of inconsistency between mathematical formalisms. This is consistent with the viewpoint that Bayesian statistics is the logical framework for inference and prediction (de Finetti, 1974, Howson and Urbach, 1989, Seidenfeld et al., 1995, Kadane and O’Hagan, 1995, Kadane et al., 1996).
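The sure-loss idea behind the Dutch Book argument can be illustrated with a small numerical sketch. It assumes an agent who states a fair price for a ticket paying 1 if an event A occurs and another for a ticket paying 1 if A does not occur, and who is willing to buy or sell either ticket at the stated price. The function name and the prices are hypothetical and serve only as an illustration.

```python
def guaranteed_profit_against(price_a, price_not_a):
    """Profit a bookmaker can lock in against an agent, whatever the outcome.

    price_a: agent's fair price for a ticket paying 1 if A occurs.
    price_not_a: agent's fair price for a ticket paying 1 if A does not occur.
    Exactly one of the two tickets pays out, so the pair is worth exactly 1.
    """
    total = price_a + price_not_a
    if total > 1:
        # Sell both tickets to the agent: collect `total`, pay out 1.
        return total - 1
    if total < 1:
        # Buy both tickets from the agent: pay `total`, receive 1.
        return 1 - total
    # Prices sum to 1: no combination of these bets yields a sure loss.
    return 0.0


# Hypothetical incoherent prices (sum exceeds 1) and coherent prices.
print(guaranteed_profit_against(0.6, 0.6))
print(guaranteed_profit_against(0.25, 0.75))
```

Only prices that sum to one, as probabilities of complementary events must, leave no guaranteed profit for the bookmaker.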

Of course, in most cases, a modeler will not be entirely familiar with the current state of knowledge or opinion regarding the relevant uncertainties and so may seek outside expertise (Pollino et al., 2007, Reichert et al., 2007). The formal approach to obtaining expertise about an uncertain quantity within probability theory is referred to as probability elicitation, and a variety of pertinent approaches, guidelines, and cautions have been published (e.g., Meyer and Booker, 1991, O’Hagan et al., 2006, James et al., 2010; see also Section 2 for further references). In the case of models being used to inform public decisions, a modeler might be interested in representing intersubjective knowledge, rather than the beliefs of individual experts. Intersubjective knowledge in such a context represents the current state of knowledge of the scientific community about an environmental system, its mathematical description, or specific parameter values. Arguments in favor of a mathematical formalism, such as probability theory, to represent both aleatory and epistemic uncertainty are even further strengthened in the case of intersubjective knowledge representation because of the need to maintain consistency and transparency. Note that the importance of an intersubjective interpretation of probabilities to describe scientific reasoning has already been discussed by Gillies (1991).

As outlined in the previous paragraph, there are convincing arguments for formulating epistemic, subjective and, especially, intersubjective knowledge by probabilities. However, inaccuracies in elicitation procedures, misrepresentation of elicitation results, problems in expressing an individual’s beliefs quantitatively, differing perceptions of information by different individuals, or disagreement between experts can all lead to uncertainty about the probabilistic quantification of knowledge (O’Hagan and Oakley, 2004). This type of uncertainty has been referred to as ambiguity (Ellsberg, 1961, Frisch and Baron, 1988). In particular, it has been discussed in decision sciences where ambiguity aversion (aversion to unknown probabilities) is distinguished from risk aversion (aversion to uncertainty that can be quantified probabilistically) (Einhorn and Hogarth, 1985, Camerer and Weber, 1992). As ambiguity is a different aspect of uncertainty than probabilistically quantified uncertainty, we are interested in identifying, describing, and seeking to reduce this particular form of uncertainty regardless of how much additional uncertainty may be embedded in the elicited probabilities themselves.

One method for separating ambiguity in the choice of a probability specification from the uncertainty contained within the specification itself is to replace the standard single probability distribution with a set of distributions. This is an extension of conventional probability theory and the literature broadly refers to it as imprecise probability theory (Walley, 1991, Caselton and Luo, 1992; http://www.sipta.org). In the context of imprecise probability theory, conventional Bayesian statistics extend to what is called robust Bayesian statistics (Ríos Insua and Ruggeri, 2000, Berger, 1994). Depending on the degree of ambiguity, a set of probability distributions can contain a large variety of shapes or can simply contain those shapes in a small neighborhood of a particular distribution. Multiple approaches, or classes, have been proposed to define membership in such sets (see references in Section 2), and we believe it would be useful to have some standard metrics for describing the relative ambiguity contained in any particular set, independent of the approach taken to set specification.

In this paper, we propose metrics to describe the relative ambiguity contained in a set of distributions defined according to imprecise probability theory, and we apply these metrics to demonstrate the wide variety of ambiguity present across different application cases. The paper is structured as follows. In Section 2, we briefly discuss probability elicitation and the relative merits of various classes of imprecise probabilities. In Section 3, we introduce some general metrics to quantify the degree of relative ambiguity in any such class. In Section 4, we implement these metrics for a particular class that we have found most useful, the Density Ratio Class. In Section 5, we demonstrate the use of our metrics using elicitation data from the literature. We present three cases with differing degrees of ambiguity in order to show the wide range present in actual elicitation results. In Section 6, we discuss (1) our metrics of ambiguity relative to others, (2) the merits of using imprecise probabilities relative to second-order probabilities, and (3) some implications of using imprecise probabilities for environmental decision support. Finally, in Section 7 we draw our conclusions. In the section Software Availability, we present our elicitation software written in R that is applicable to the Density Ratio Class.

Elicitation

A common technique for eliciting a probability distribution from an expert for a continuous quantity is to employ the quantile method. According to this method, the analyst provides a number of cumulative probabilities (e.g., 0.05, 0.25, 0.5, 0.75, 0.95) and the expert then estimates the corresponding quantiles of the uncertain quantity. This procedure, first suggested by Winkler (1967), minimizes anchoring and other biases that may be inherent in the “cumulative probability method”, in which
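As a rough illustration of how elicited quantiles can be translated into a distribution, the sketch below fits a normal distribution to quantiles by least squares on the standard normal scores. This is an assumed, simplified fitting procedure and not the method of the paper or of its R package; the function name and the elicited values are hypothetical. The residuals expose the lack of fit of the chosen distributional form, one of the sources of ambiguity named in the abstract.

```python
from statistics import NormalDist


def fit_normal_to_quantiles(probs, quantiles):
    """Least-squares fit of a normal distribution to elicited quantiles.

    A normal quantile satisfies q = mu + sigma * z, where z is the standard
    normal quantile of the cumulative probability, so mu and sigma follow
    from a straight-line regression of q on z.
    """
    z = [NormalDist().inv_cdf(p) for p in probs]
    n = len(z)
    z_mean = sum(z) / n
    q_mean = sum(quantiles) / n
    sxx = sum((zi - z_mean) ** 2 for zi in z)
    sxy = sum((zi - z_mean) * (qi - q_mean) for zi, qi in zip(z, quantiles))
    sigma = sxy / sxx
    mu = q_mean - sigma * z_mean
    # Residuals: elicited quantile minus fitted quantile, one per probability.
    residuals = [qi - (mu + sigma * zi) for zi, qi in zip(z, quantiles)]
    return mu, sigma, residuals


# Cumulative probabilities supplied by the analyst (as in the text above).
probs = [0.05, 0.25, 0.5, 0.75, 0.95]
# Hypothetical quantiles stated by an expert (illustrative values only).
elicited = [110.0, 150.0, 180.0, 200.0, 230.0]

mu, sigma, residuals = fit_normal_to_quantiles(probs, elicited)
print(mu, sigma)
print(residuals)
```

Non-zero residuals indicate that no single normal distribution reproduces all five stated quantiles, which is the kind of mismatch a set of distributions can absorb.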

Metrics of imprecision

There seem to be at least three important attributes of probability distributions for which we would like to quantify the ambiguity or imprecision: (i) the width of the distribution, (ii) the shape of the distribution within its range, and (iii) the position of the mode. We need metrics of ambiguity or imprecision about these attributes that are independent of the particular definition used to define the imprecise probability class. The focus on imprecision in specific attributes

Implementation for the Density Ratio Class

As already mentioned, a variety of imprecise probability classes have been proposed. Rinderknecht et al. (2011) discuss the relative merits of the different classes and conclude that the Density Ratio Class has clear conceptual and practical advantages. In particular, the Density Ratio Class’s invariance under Bayesian updating and marginalization (Wasserman, 1992) makes it the unique class that allows for simultaneously describing a consistent sequential Bayesian learning process and
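A minimal numerical sketch may help make the class concrete. It assumes the usual definition of the Density Ratio Class (DeRobertis and Hartigan, 1981): given unnormalized bounding functions l ≤ u, the class contains every probability density that is proportional to some function lying between l and u. Under that assumption, the upper probability of an event is obtained by using u inside the event and l outside it, and the lower probability by the reverse choice. Bayesian updating multiplies both bounds by the likelihood, so the posterior set is again a Density Ratio Class. The grid, bounds, event, and observation below are hypothetical, and the code is not the authors' R implementation.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def probability_bounds(lower, upper, in_event):
    """Lower and upper probability of an event under a Density Ratio Class.

    lower, upper: unnormalized density values on a uniform grid,
                  with lower[i] <= upper[i] for every cell.
    in_event: booleans marking the grid cells that belong to the event.
    """
    l_in = sum(l for l, a in zip(lower, in_event) if a)
    l_out = sum(l for l, a in zip(lower, in_event) if not a)
    u_in = sum(u for u, a in zip(upper, in_event) if a)
    u_out = sum(u for u, a in zip(upper, in_event) if not a)
    # The grid spacing cancels in both ratios, so plain sums suffice.
    return l_in / (l_in + u_out), u_in / (u_in + l_out)


def bayes_update(lower, upper, likelihood):
    """Multiply both bounding functions pointwise by the likelihood."""
    return ([l * k for l, k in zip(lower, likelihood)],
            [u * k for u, k in zip(upper, likelihood)])


# Cell midpoints of a uniform grid on [-8, 8] (hypothetical parameter range).
grid = [-8.0 + (i + 0.5) * 0.1 for i in range(160)]
lower = [normal_pdf(x, 0.0, 1.0) for x in grid]
upper = [2.0 * v for v in lower]   # upper bound is twice the lower bound
event = [x < 0.0 for x in grid]    # event: parameter below zero

prior_lo, prior_hi = probability_bounds(lower, upper, event)

# Hypothetical observation y = 1 with a unit-variance normal likelihood.
likelihood = [normal_pdf(1.0, x, 1.0) for x in grid]
post_lower, post_upper = bayes_update(lower, upper, likelihood)
post_lo, post_hi = probability_bounds(post_lower, post_upper, event)

print(prior_lo, prior_hi)
print(post_lo, post_hi)
```

With an upper bound equal to twice the lower bound and an event of prior probability one half under the lower density, the prior bounds come out at approximately 1/3 and 2/3. When lower and upper bounds coincide, the interval collapses to a single precise probability.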

Largely uncertainty: date of maximum periphyton biomass in a riverine ecosystem model

Schweizer (2007) developed a deterministic model with a stochastic error for river periphyton biomass recovery after a flood. In reduced form, this model can be expressed as $B_{\Delta t_{\mathrm{flood}}}(x, \theta) = b_{\Delta t_{\mathrm{flood}}}(x, \theta) + Z(\theta)$, where $b_{\Delta t_{\mathrm{flood}}}(x, \theta)$ is the deterministically modeled biomass of periphyton, consisting multiplicatively of a Monod function, $m(x, \theta)$, limiting terms, $l(x, \theta)$, and a seasonality term, $s(x, \theta)$. Here, we focus on the model parameter describing the Julian day within the year at which the potential

Discussion

We see three issues that require further discussion: (1) a comparison of our metrics of ambiguity relative to others that have been proposed in the literature, (2) the merits of using imprecise probabilities to describe ambiguity in elicitation results, relative to (precise) first and second-order probabilities or fuzzy distributions, and (3) the implications of using imprecise probabilities for environmental decision support. The third point will exemplify the bridging function we see

Conclusion

Imprecise probabilities allow us to characterize the degree of ambiguity in probability distributions elicited from subject matter experts. Because of the variety of approaches taken to specifying imprecise probabilities, some generic metrics applicable to all approaches are required. Besides overall measures of imprecision of a class of distributions, we are interested in the imprecision of specific attributes of the class. In particular, important attributes are the width, shape and position

Software availability

The example results in Section 5 were generated using our recently implemented software package for R (Ihaka and Gentleman, 1996) that is able to calculate the Density Ratio Class for given quantile intervals according to the method described in Section 4.2 (Rinderknecht et al., 2011). Possible lower and upper densities are the Gaussian, Student-t, Logistic, Gamma, Weibull, F, Beta, Uniform, Log-Normal, Log-Student-t and the Log-Logistic. Additionally, two transformations for the variable θ are

Acknowledgments

The authors would like to thank all anonymous reviewers for their useful comments and gratefully acknowledge support from the Swiss National Science Foundation. M.E.B. was partially supported by the US EPA through grant #RD-83366601 from the STAR program. This work has not been subjected to the Agency’s required peer and policy review and therefore does not necessarily reflect the views of the Agency. No official endorsement should be inferred.

References (57)

J.O. Berger (1990). Robust Bayesian analysis: sensitivity to the prior. Journal of Statistical Planning and Inference.
S. Ferson et al. (2004). Arithmetic with uncertain numbers: rigorous and (often) best possible answers. Reliability Engineering & System Safety.
A. James et al. (2010). Elicitator: an expert elicitation tool for regression in ecology. Environmental Modelling & Software.
A. O’Hagan et al. (2004). Probability is perfect, but we can’t elicit it perfectly. Reliability Engineering & System Safety.
G.W. Parry (1996). The characterization of uncertainty in Probabilistic Risk Assessments of complex systems. Reliability Engineering & System Safety.
C.A. Pollino et al. (2007). Parameterisation and evaluation of a Bayesian network for use in an ecological risk assessment. Environmental Modelling & Software.
J.C. Refsgaard et al. (2007). Uncertainty in the environmental modelling process – a framework and guidance. Environmental Modelling & Software.
P. Reichert et al. (2007). Concepts of decision support for river rehabilitation. Environmental Modelling & Software.
S.L. Rinderknecht et al. (2011). Eliciting density ratio classes. International Journal of Approximate Reasoning.
J. Warmink et al. (2010). Identification and classification of uncertainties in the application of environmental models. Environmental Modelling & Software.
L. Zadeh (1965). Fuzzy sets. Information and Control.
L.A. Zadeh (1978). Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems.
B.M. Ayyub et al. (2006). Uncertainty Modeling and Analysis in Engineering and the Sciences.
J.O. Berger (1994). An overview of robust Bayesian analysis. Test.
B.J.F. Biggs et al. (1996). Hydraulic habitat suitability for periphyton in rivers. Regulated Rivers-Research & Management.
M.E. Borsuk et al. (2005). Uncertainty, imprecision, and the precautionary principle in climate change assessment. Water Science and Technology.
M.E. Borsuk et al. (2002). A survival model of the effects of bottom-water hypoxia on the population density of an estuarine clam (Macoma balthica). Canadian Journal of Fisheries and Aquatic Sciences.
C. Camerer et al. (1992). Recent developments in modeling preferences: uncertainty and ambiguity. Journal of Risk and Uncertainty.
W.F. Caselton et al. (1992). Decision-making with imprecise probabilities: Dempster–Shafer theory and application. Water Resources Research.
R. Cox (1946). Probability, frequency and reasonable expectation. American Journal of Physics.
B. de Finetti (1931). Sul significato soggettivo della probabilità. Fundamenta Mathematicae.
B. de Finetti (1974). Theory of probability.
L. DeRobertis et al. (1981). Bayesian inference using intervals of measures. The Annals of Statistics.
S. Destercke et al. (2008). Unifying practical uncertainty representations: I. generalized p-boxes – II. clouds. International Journal of Approximate Reasoning.
D. Draper (1995). Assessment and propagation of model uncertainty. Journal of the Royal Statistical Society. Series B (Methodological).
H.J. Einhorn et al. (1985). Ambiguity and uncertainty in probabilistic inference. Psychological Review.
D. Ellsberg (1961). Risk, ambiguity, and the Savage axioms. The Quarterly Journal of Economics.
D. Frisch et al. (1988). Ambiguity and rationality. Journal of Behavioral Decision Making.
