RL-Based Memory Controller for Scalable Autonomous Systems

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5864)

Abstract

This paper presents the design of an autonomous system that uses a self-optimizing memory controller for non-Markovian reinforcement learning tasks. Instead of searching the whole memory contents holistically, the controller applies associated feature analysis to retrieve the action most likely to be relevant from previous experiences. Actor-Critic (AC) learning is used to adaptively tune the control parameters, while an online variant of the Random Forest (RF) learner serves as a memory-capable approximator of the Actor's policy and the Critic's value function. Learning capability is evaluated experimentally on non-Markovian cart-pole balancing tasks. The results show that the proposed controller acquires complex behaviors, such as balancing two poles simultaneously, and displays long-term planning.
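
For orientation, the sketch below illustrates the generic one-step Actor-Critic update that the abstract builds on. It is a minimal sketch under stated assumptions, not the authors' implementation: a linear softmax model stands in for the paper's online Random Forest approximator (no mainstream library ships an online RF), the feature vector phi is assumed to already contain whatever history features the memory controller retrieves, and all names and hyperparameters (alpha, beta, gamma) are illustrative.

```python
import numpy as np

class ActorCritic:
    """Minimal one-step Actor-Critic with linear function approximation.

    Hypothetical sketch: the paper approximates both the Actor's policy
    and the Critic's value function with an online Random Forest; a
    linear model stands in here. `phi` is assumed to encode the memory
    controller's retrieved history features, which restore enough state
    information for TD learning on a non-Markovian task.
    """

    def __init__(self, n_features, n_actions, alpha=0.01, beta=0.05, gamma=0.99):
        self.theta = np.zeros((n_actions, n_features))  # actor weights
        self.w = np.zeros(n_features)                   # critic weights
        self.alpha, self.beta, self.gamma = alpha, beta, gamma

    def policy(self, phi):
        # Softmax over linear action preferences.
        prefs = self.theta @ phi
        prefs -= prefs.max()              # numerical stability
        expd = np.exp(prefs)
        return expd / expd.sum()

    def act(self, phi, rng):
        return rng.choice(self.theta.shape[0], p=self.policy(phi))

    def update(self, phi, action, reward, phi_next, done):
        # Critic: one-step TD error on the value estimates.
        v = self.w @ phi
        v_next = 0.0 if done else self.w @ phi_next
        delta = reward + self.gamma * v_next - v
        self.w += self.alpha * delta * phi

        # Actor: policy-gradient step scaled by the TD error,
        # using grad log pi(a|s) = (1[b=a] - pi(b|s)) * phi per row b.
        grad_log_pi = -np.outer(self.policy(phi), phi)
        grad_log_pi[action] += phi
        self.theta += self.beta * delta * grad_log_pi
        return delta

if __name__ == "__main__":
    # Toy usage with random features standing in for cart-pole history.
    rng = np.random.default_rng(0)
    agent = ActorCritic(n_features=8, n_actions=2)
    phi = rng.normal(size=8)
    a = agent.act(phi, rng)
    agent.update(phi, a, reward=1.0, phi_next=rng.normal(size=8), done=False)
```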

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Elgawi, O.H. (2009). RL-Based Memory Controller for Scalable Autonomous Systems. In: Leung, C.S., Lee, M., Chan, J.H. (eds) Neural Information Processing. ICONIP 2009. Lecture Notes in Computer Science, vol 5864. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10684-2_10

  • DOI: https://doi.org/10.1007/978-3-642-10684-2_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10682-8

  • Online ISBN: 978-3-642-10684-2

  • eBook Packages: Computer Science, Computer Science (R0)
