Abstract
We give an example from the theory of Markov decision processes which shows that the “optimism in the face of uncertainty” heuristic may fail to make any progress. This is due to the impossibility of falsifying the belief that a (transition) probability is larger than 0. Our example demonstrates the utility of Popper’s demand for the falsifiability of hypotheses in the area of artificial intelligence.



Notes
The transition matrix P is the matrix of transition probabilities with rows and columns indexed by the states in S, so that the entry of P in row s and column s′ is the transition probability p(s, s′) from s to s′.
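For concreteness, the following minimal Python sketch builds such a transition matrix for a hypothetical three-state MDP (the state set and the probability values are illustrative, not taken from the paper):

```python
import numpy as np

# Hypothetical state set S = {0, 1, 2}; entry P[s, t] is the
# transition probability p(s, t) from state s to state t.
P = np.array([
    [0.5, 0.5, 0.0],
    [0.0, 1.0, 0.0],
    [0.2, 0.3, 0.5],
])

# Each row of P is a probability distribution over successor states,
# so every row must sum to 1.
assert np.allclose(P.sum(axis=1), 1.0)
```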
Note that the claim that the outcome of a random experiment has positive probability is basically an existence claim over an infinite number of trials (and hence cannot be refuted by any finite number of observations).
Incidentally, this can also be used as a criticism of Pascal’s wager. Although it is not clear whether it is appropriate to represent Pascal’s wager as an MDP similar to that in Example 1, Pascal’s argument is based on a non-refutable belief, as the assumption that there is a positive transition probability to heaven is not falsifiable.
That is, for an optimistic estimate of the transition probability p in question, one computes the expected reward which may be gained when insisting on the transition. This can be compared to the reward to be expected when ignoring the transition (i.e. setting p = 0 in the agent’s model). If the latter value is larger, the agent refutes the hypothesis that p > 0.
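A minimal Python sketch of this comparison is given below. All names, the two-state model, and the geometric waiting-time assumption are illustrative choices, not the paper’s formulation: the agent either keeps trying a transition it optimistically believes succeeds with probability p, or ignores it and collects an alternative per-step reward.

```python
def expected_reward_insisting(p, r_goal, r_stay, horizon):
    """Expected total reward over `horizon` steps when the agent keeps
    attempting a transition it believes succeeds with probability p,
    earning r_stay per step while still waiting and r_goal per step
    once the transition has occurred (hypothetical two-state model)."""
    total = 0.0
    prob_not_yet = 1.0  # probability the transition has not occurred so far
    for _ in range(horizon):
        total += prob_not_yet * r_stay + (1.0 - prob_not_yet) * r_goal
        prob_not_yet *= (1.0 - p)
    return total


def refutes_positive_p(p_optimistic, r_goal, r_stay, r_alternative, horizon):
    """Compare insisting on the transition (optimistic estimate p_optimistic)
    with ignoring it (p = 0), where ignoring yields r_alternative per step.
    Returns True if the hypothesis p > 0 is (pragmatically) refuted."""
    value_insisting = expected_reward_insisting(p_optimistic, r_goal, r_stay, horizon)
    value_ignoring = horizon * r_alternative
    # If ignoring the transition already promises more reward than insisting
    # on it under the optimistic estimate, refute the hypothesis p > 0.
    return value_ignoring > value_insisting


# Example: an optimistic estimate p = 0.01 of reaching a state paying 1 per
# step, while waiting pays 0 and the best alternative pays 0.6 per step.
print(refutes_positive_p(0.01, r_goal=1.0, r_stay=0.0, r_alternative=0.6, horizon=100))
```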
It is worth noting that although the UCRL algorithm assumes that the underlying MDP is ergodic or communicating, the optimistic model of the MDP that it assumes is, in general, neither ergodic nor communicating.
References
Auer, P., & Ortner, R. (2006). Logarithmic online regret bounds for reinforcement learning. In B. Schölkopf, J. C. Platt, & T. Hofmann (Eds.), Advances in Neural Information Processing Systems (Vol. 19, pp. 49–56). Cambridge, MA: MIT Press.
Auer, P., Jaksch, T., & Ortner, R. (2008). Near-optimal regret bounds for reinforcement learning (submitted).
Brafman, R. I., & Tennenholtz, M. (2002). R-max – a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3, 213–231.
Kemeny, J. G., Snell, J. L., & Knapp, A. W. (1976). Denumerable Markov chains. New York: Springer.
Popper, K. R. (1969). Logik der Forschung (3rd ed.). Tübingen: Mohr.
Puterman, M. L. (1994). Markov decision processes: Discrete stochastic dynamic programming. New York: Wiley.
Strehl, A. L., & Littman, M. L. (2004). An empirical evaluation of interval estimation for Markov decision processes. In 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2004) (pp. 128–135). IEEE Computer Society.
Strehl, A. L., & Littman, M. L. (2005). A theoretical analysis of model-based interval estimation. In L. De Raedt & S. Wrobel (Eds.), Machine learning, Proceedings of the Twenty-Second International Conference (ICML 2005) (pp. 857–864). ACM.
Acknowledgments
The author would like to thank Georg Dorn for comments on a prior version of this paper. This work was supported in part by the Austrian Science Fund FWF (S9104-N13 SP4) and the IST Programme of the European Community, under the PASCAL Network of Excellence, IST-2002-506778. This publication only reflects the author’s views.