Model-Based Reinforcement Learning

  • Reference work entry
Encyclopedia of Machine Learning

Recommended Reading

  • Abbeel, P., Coates, A., Quigley, M., & Ng, A. Y. (2007). An application of reinforcement learning to aerobatic helicopter flight. In Advances in neural information processing systems (Vol. 19, pp. 1–8). Cambridge, MA: MIT Press.

  • Abbeel, P., Quigley, M., & Ng, A. Y. (2006). Using inaccurate models in reinforcement learning. In Proceedings of the 23rd international conference on machine learning (pp. 1–8). ACM Press, New York, USA.

  • Atkeson, C. G., & Santamaria, J. C. (1997). A comparison of direct and model-based reinforcement learning. In Proceedings of the international conference on robotics and automation (pp. 20–25). IEEE Press.

  • Atkeson, C. G., & Schaal, S. (1997). Robot learning from demonstration. In Proceedings of the fourteenth international conference on machine learning (Vol. 4, pp. 12–20). San Francisco: Morgan Kaufmann.

  • Barto, A. G., Bradtke, S. J., & Singh, S. P. (1995). Learning to act using real-time dynamic programming. Artificial Intelligence, 72(1), 81–138.

  • Baxter, J., Tridgell, A., & Weaver, L. (1998). TDLeaf(λ): Combining temporal difference learning with game-tree search. In Proceedings of the ninth Australian conference on neural networks (ACNN’98) (pp. 168–172).

  • Brafman, R. I., & Tennenholtz, M. (2002). R-MAX – a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 2, 213–231.

  • Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237–285.

  • Kearns, M., & Singh, S. (2002). Near-optimal reinforcement learning in polynomial time. Machine Learning, 49(2/3), 209–232.

  • Moore, A. W., & Atkeson, C. G. (1993). Prioritized sweeping: Reinforcement learning with less data and less real time. Machine Learning, 13, 103–130.

  • Peng, J., & Williams, R. J. (1993). Efficient learning and planning within the Dyna framework. Adaptive Behavior, 1(4), 437–454.

  • Puterman, M. L. (1994). Markov decision processes: Discrete stochastic dynamic programming. New York: Wiley.

  • Schaal, S., & Atkeson, C. G. (1994). Robot juggling: Implementation of memory-based learning. IEEE Control Systems Magazine, 14(1), 57–71.

  • Singh, S., Kearns, M., Litman, D., & Walker, M. (1999). Reinforcement learning for spoken dialogue systems. In Advances in neural information processing systems (Vol. 11, pp. 956–962). MIT Press.

  • Sutton, R. S. (1990). Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the seventh international conference on machine learning (pp. 216–224). San Francisco: Morgan Kaufmann.

  • Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.

  • Tadepalli, P., & Ok, D. (1998). Model-based average-reward reinforcement learning. Artificial Intelligence, 100, 177–224.

  • Tesauro, G. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM, 38(3), 58–68.

  • Wang, X., & Dietterich, T. G. (2003). Model-based policy gradient reinforcement learning. In Proceedings of the 20th international conference on machine learning (pp. 776–783). AAAI Press.

  • Wilson, A., Fern, A., Ray, S., & Tadepalli, P. (2007). Multi-task reinforcement learning: A hierarchical Bayesian approach. In Proceedings of the 24th international conference on machine learning (pp. 1015–1022). Madison, WI: Omnipress.

  • Zhang, W., & Dietterich, T. G. (1995). A reinforcement learning approach to job-shop scheduling. In Proceedings of the international joint conference on artificial intelligence (pp. 1114–1120). Morgan Kaufmann.

Copyright information

© 2011 Springer Science+Business Media, LLC

Cite this entry

Ray, S., Tadepalli, P. (2011). Model-Based Reinforcement Learning. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30164-8_556
