Abstract
This paper presents the design of an autonomous system built around a self-optimizing memory controller for non-Markovian reinforcement learning tasks. Instead of exhaustively searching the entire memory contents, the controller uses associative feature analysis to retrieve the most likely relevant action from previous experiences. Actor-Critic (AC) learning adaptively tunes the control parameters, while an online variant of the Random Forest (RF) learner serves as a memory-capable approximator for the actor's policy and the critic's value function. Learning capability is examined experimentally on a non-Markovian cart-pole balancing task. The results show that the proposed controller acquires complex behaviors, such as balancing two poles simultaneously, and displays long-term planning.
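The actor-critic scheme described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: it substitutes a linear approximator over a fixed-length observation history for the paper's online Random Forest memory controller, and the `PoleEnv` environment, feature layout, and learning rates are assumptions introduced for the sketch. The key non-Markovian ingredient is preserved: the agent observes only the pole angle, not its velocity, so it must rely on a short memory of past observations.

```python
import math
import random

random.seed(0)


class PoleEnv:
    """Toy balancing task (a stand-in, not the paper's simulator).

    The agent observes only the pole angle; the angular velocity is
    hidden state, so a single observation is non-Markovian.
    """
    DT = 0.02

    def reset(self):
        self.theta = random.uniform(-0.05, 0.05)
        self.theta_dot = 0.0
        return self.theta

    def step(self, action):
        # Bang-bang force: action 0 pushes left, action 1 pushes right.
        force = -10.0 if action == 0 else 10.0
        self.theta_dot += self.DT * (9.8 * math.sin(self.theta) + force)
        self.theta += self.DT * self.theta_dot
        done = abs(self.theta) > 0.5  # pole fell over
        return self.theta, 1.0, done  # +1 reward per step survived


K = 3  # history length: a crude external memory of past angles


def features(history):
    """Last K observations, zero-padded, as the agent's memory."""
    pad = [0.0] * (K - len(history))
    return pad + list(history[-K:])


def train(episodes=300, alpha_v=0.1, alpha_p=0.05, gamma=0.99):
    env = PoleEnv()
    w = [0.0] * K        # critic: linear state-value weights
    theta_p = [0.0] * K  # actor: logistic policy parameters
    lengths = []
    for _ in range(episodes):
        hist = [env.reset()]
        steps, done = 0, False
        while not done and steps < 200:
            x = features(hist)
            # Actor: probability of pushing right (clamped for stability).
            z = max(-20.0, min(20.0, sum(p * xi for p, xi in zip(theta_p, x))))
            p_right = 1.0 / (1.0 + math.exp(-z))
            action = 1 if random.random() < p_right else 0
            obs, reward, done = env.step(action)
            hist.append(obs)
            x2 = features(hist)
            v = sum(wi * xi for wi, xi in zip(w, x))
            v2 = 0.0 if done else sum(wi * xi for wi, xi in zip(w, x2))
            delta = reward + gamma * v2 - v  # TD error drives both updates
            # Critic update: move value estimate toward the TD target.
            w = [wi + alpha_v * delta * xi for wi, xi in zip(w, x)]
            # Actor update: policy gradient of log pi, scaled by TD error.
            grad = action - p_right
            theta_p = [ti + alpha_p * delta * grad * xi
                       for ti, xi in zip(theta_p, x)]
            steps += 1
        lengths.append(steps)
    return lengths, w, theta_p
```

In the paper, the linear approximator above is replaced by an online Random Forest, which gives the controller its memory capability; the TD-error-driven actor and critic updates are the common structural core.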
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Elgawi, O.H. (2009). RL-Based Memory Controller for Scalable Autonomous Systems. In: Leung, C.S., Lee, M., Chan, J.H. (eds) Neural Information Processing. ICONIP 2009. Lecture Notes in Computer Science, vol 5864. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10684-2_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10682-8
Online ISBN: 978-3-642-10684-2