Article

A theoretical analysis of Model-Based Interval Estimation

Authors:
Alexander L. Strehl, Rutgers University, Piscataway, NJ
Michael L. Littman, Rutgers University, Piscataway, NJ

ICML '05: Proceedings of the 22nd international conference on Machine learning, August 2005, Pages 856–863
https://doi.org/10.1145/1102351.1102459

Published: 07 August 2005

ABSTRACT

Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Interval Estimation (MBIE) learns efficiently in practice, effectively balancing exploration and exploitation. This paper presents the first theoretical analysis of MBIE, proving its efficiency even under worst-case conditions. The paper also introduces a new performance metric, average loss, and relates it to its less "online" cousins from the literature.

Published in
ICML '05: Proceedings of the 22nd international conference on Machine learning
August 2005
1113 pages
ISBN: 1595931805
DOI: 10.1145/1102351
General Chair: Saso Dzeroski (Jozef Stefan Institute, Slovenia)
Program Chairs: Luc De Raedt, Stefan Wrobel
Copyright © 2005 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Acceptance Rates
Overall Acceptance Rate: 140 of 548 submissions, 26%