Basis Function Discovery Using Spectral Clustering and Bisimulation Metrics

  • Conference paper
Adaptive and Learning Agents (ALA 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7113)

Abstract

We study the problem of automatically generating features for function approximation in reinforcement learning. We build on the work of Mahadevan and his colleagues, who pioneered the use of spectral clustering methods for basis function construction. Their methods work on top of a graph that captures state adjacency. Instead, we use bisimulation metrics in order to provide state distances for spectral clustering. The advantage of these metrics is that they incorporate reward information in a natural way, in addition to the state transition information. We provide bisimulation metric bounds for general feature maps. This result suggests a new way of generating features, with strong theoretical guarantees on the quality of the obtained approximation. We also demonstrate empirically that the approximation quality improves when bisimulation metrics are used in the basis function construction process.
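The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a small MDP with deterministic transitions, in which case the Kantorovich term of the bisimulation metric reduces to the distance between successor states. The weights `c_r` and `c_t`, the Gaussian kernel with width `sigma`, the use of the unnormalized graph Laplacian, and the function names are all illustrative assumptions.

```python
import numpy as np


def bisimulation_metric(rewards, next_state, c_r=1.0, c_t=0.9,
                        tol=1e-8, max_iter=10_000):
    """Fixed-point iteration for a bisimulation metric (deterministic MDP only).

    rewards:    (n_states, n_actions) float array of immediate rewards.
    next_state: (n_states, n_actions) int array of successor state indices.
    c_r, c_t:   assumed weights on the reward and transition terms (c_t < 1).
    """
    n_states, n_actions = rewards.shape
    d = np.zeros((n_states, n_states))
    for _ in range(max_iter):
        new_d = np.zeros_like(d)
        for a in range(n_actions):
            # |r(s, a) - r(t, a)| for every pair of states (s, t)
            reward_gap = np.abs(rewards[:, None, a] - rewards[None, :, a])
            # With deterministic transitions the Kantorovich distance between
            # next-state distributions is just d(next(s, a), next(t, a)).
            succ = next_state[:, a]
            successor_gap = d[np.ix_(succ, succ)]
            new_d = np.maximum(new_d, c_r * reward_gap + c_t * successor_gap)
        converged = np.max(np.abs(new_d - d)) < tol
        d = new_d
        if converged:
            break
    return d


def spectral_basis(distances, n_features, sigma=1.0):
    """Basis functions from the smoothest eigenvectors of a graph Laplacian.

    The similarity graph is built from state distances with a Gaussian
    kernel (an assumption of this sketch, not taken from the paper).
    """
    weights = np.exp(-(distances ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)  # no self-loops
    laplacian = np.diag(weights.sum(axis=1)) - weights
    # eigh returns eigenvalues in ascending order with orthonormal eigenvectors
    _, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvectors[:, :n_features]


# Toy 4-state, 2-action MDP: states 0 and 1 behave identically.
rewards = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
next_state = np.array([[2, 3], [2, 3], [2, 2], [3, 3]])

d = bisimulation_metric(rewards, next_state)
phi = spectral_basis(d, n_features=2)  # one row of features per state
```

In this toy example states 0 and 1 have identical rewards and successors, so their distance is zero and they are strongly connected in the similarity graph. A graph built from state adjacency alone would not see the reward difference between states 2 and 3.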

References

  1. Bertsekas, D.P., Tsitsiklis, J.N.: Neuro-Dynamic Programming. Athena Scientific, Belmont (1996)

  2. Chung, F.: Spectral Graph Theory. CBMS Regional Conference Series in Mathematics (1997)

  3. Ferns, N., Panangaden, P., Precup, D.: Metrics for Finite Markov Decision Processes. In: Conference on Uncertainty in Artificial Intelligence (2004)

  4. Ferns, N., Panangaden, P., Precup, D.: Metrics for Markov Decision Processes with Infinite State Spaces. In: Conference on Uncertainty in Artificial Intelligence (2005)

  5. Keller, P.W., Mannor, S., Precup, D.: Automatic Basis Function Construction for Approximate Dynamic Programming and Reinforcement Learning. In: International Conference on Machine Learning, pp. 449–456. ACM Press, New York (2006)

  6. Mahadevan, S.: Proto-Value Functions: Developmental Reinforcement Learning. In: International Conference on Machine Learning, pp. 553–560 (2005)

  7. Mahadevan, S., Maggioni, M.: Proto-Value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes. Journal of Machine Learning Research 8, 2169–2231 (2007)

  8. Parr, R., Painter-Wakefield, H., Li, L., Littman, M.L.: Analyzing Feature Generation for Value Function Approximation. In: International Conference on Machine Learning, pp. 737–744 (2008)

  9. Petrik, M.: An Analysis of Laplacian Methods for Value Function Approximation in MDPs. In: International Joint Conference on Artificial Intelligence, pp. 2574–2579 (2007)

  10. Puterman, M.L.: Markov Decision Processes: Discrete and Stochastic Dynamic Programming. Wiley (1994)

  11. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press (1998)

  12. Tsitsiklis, J.N., Van Roy, B.: An Analysis of Temporal-Difference Learning with Function Approximation. IEEE Transactions on Automatic Control 42(5), 674–690 (1997)

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Comanici, G., Precup, D. (2012). Basis Function Discovery Using Spectral Clustering and Bisimulation Metrics. In: Vrancx, P., Knudson, M., Grześ, M. (eds) Adaptive and Learning Agents. ALA 2011. Lecture Notes in Computer Science, vol 7113. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28499-1_6

  • DOI: https://doi.org/10.1007/978-3-642-28499-1_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28498-4

  • Online ISBN: 978-3-642-28499-1

  • eBook Packages: Computer Science, Computer Science (R0)
