RL-Based Memory Controller for Scalable Autonomous Systems

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5864)

Abstract

This paper presents the design of an autonomous system that uses a self-optimizing memory controller for non-Markovian reinforcement learning tasks. Instead of searching the whole memory contents holistically, the controller applies associated feature analysis to retrieve the action most likely to be relevant from previous experiences. Actor-Critic (AC) learning is used to adaptively tune the control parameters, while an online variant of the Random Forest (RF) learner serves as a memory-capable approximator of the Actor's policy and the Critic's value function. Learning capability is evaluated experimentally on non-Markovian cart-pole balancing tasks. The results show that the proposed controller acquires complex behaviors, such as balancing two poles simultaneously, and displays long-term planning.
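
For orientation, the sketch below illustrates the generic one-step Actor-Critic update that the abstract builds on. It is a minimal sketch under stated assumptions, not the authors' implementation: a linear softmax model stands in for the paper's online Random Forest approximator (no mainstream library ships an online RF), the feature vector phi is assumed to already contain whatever history features the memory controller retrieves, and all names and hyperparameters (alpha, beta, gamma) are illustrative.

```python
import numpy as np

class ActorCritic:
    """Minimal one-step Actor-Critic with linear function approximation.

    Hypothetical sketch: the paper approximates both the Actor's policy
    and the Critic's value function with an online Random Forest; a
    linear model stands in here. `phi` is assumed to encode the memory
    controller's retrieved history features, which restore enough state
    information for TD learning on a non-Markovian task.
    """

    def __init__(self, n_features, n_actions, alpha=0.01, beta=0.05, gamma=0.99):
        self.theta = np.zeros((n_actions, n_features))  # actor weights
        self.w = np.zeros(n_features)                   # critic weights
        self.alpha, self.beta, self.gamma = alpha, beta, gamma

    def policy(self, phi):
        # Softmax over linear action preferences.
        prefs = self.theta @ phi
        prefs -= prefs.max()              # numerical stability
        expd = np.exp(prefs)
        return expd / expd.sum()

    def act(self, phi, rng):
        return rng.choice(self.theta.shape[0], p=self.policy(phi))

    def update(self, phi, action, reward, phi_next, done):
        # Critic: one-step TD error on the value estimates.
        v = self.w @ phi
        v_next = 0.0 if done else self.w @ phi_next
        delta = reward + self.gamma * v_next - v
        self.w += self.alpha * delta * phi

        # Actor: policy-gradient step scaled by the TD error,
        # using grad log pi(a|s) = (1[b=a] - pi(b|s)) * phi per row b.
        grad_log_pi = -np.outer(self.policy(phi), phi)
        grad_log_pi[action] += phi
        self.theta += self.beta * delta * grad_log_pi
        return delta

if __name__ == "__main__":
    # Toy usage with random features standing in for cart-pole history.
    rng = np.random.default_rng(0)
    agent = ActorCritic(n_features=8, n_actions=2)
    phi = rng.normal(size=8)
    a = agent.act(phi, rng)
    agent.update(phi, a, reward=1.0, phi_next=rng.normal(size=8), done=False)
```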

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Elgawi, O.H. (2009). RL-Based Memory Controller for Scalable Autonomous Systems. In: Leung, C.S., Lee, M., Chan, J.H. (eds) Neural Information Processing. ICONIP 2009. Lecture Notes in Computer Science, vol 5864. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10684-2_10

  • DOI: https://doi.org/10.1007/978-3-642-10684-2_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10682-8

  • Online ISBN: 978-3-642-10684-2

  • eBook Packages: Computer Science, Computer Science (R0)
