Abstract
This paper presents work on using hierarchical long-term memory to reduce the memory requirements of nearest sequence memory (NSM) learning, a previously published, instance-based reinforcement learning algorithm. A hierarchical memory representation reduces memory requirements by allowing traces to share common sub-sequences. We present moderated mechanisms for estimating discounted future rewards and for dealing with hidden state using hierarchical memory. We also present an experimental analysis of how sub-sequence length affects the memory compression achieved, and show that the reduced memory requirements do not affect the speed of learning. Finally, we analyse and discuss the persistence of the sub-sequences independent of specific trace instances.
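To make the compression idea concrete, here is a minimal illustrative sketch (not the paper's implementation — the class names, step representation, and prefix-based sharing are assumptions for illustration; NSM proper matches suffixes of traces) of how traces stored in a shared tree reuse nodes for common sub-sequences instead of duplicating them:

```python
# Hypothetical sketch: experience traces stored in a prefix tree so that
# traces sharing common sub-sequences reuse nodes rather than storing
# each step separately, reducing total memory.
class TraceNode:
    def __init__(self, step):
        self.step = step        # an (observation, action) pair
        self.children = {}      # next step -> TraceNode


class TraceTrie:
    def __init__(self):
        self.roots = {}         # first step -> TraceNode
        self.node_count = 0     # nodes actually allocated

    def add_trace(self, trace):
        """Insert a trace (a list of (obs, action) steps), sharing any
        sub-sequence prefix already present in the tree."""
        level = self.roots
        for step in trace:
            if step not in level:
                level[step] = TraceNode(step)
                self.node_count += 1
            level = level[step].children


trie = TraceTrie()
traces = [
    [("o1", "a1"), ("o2", "a1"), ("o3", "a2")],
    [("o1", "a1"), ("o2", "a1"), ("o4", "a1")],  # shares a 2-step prefix
    [("o1", "a1"), ("o2", "a1"), ("o3", "a2")],  # fully shared
]
flat_steps = 0
for t in traces:
    flat_steps += len(t)
    trie.add_trace(t)

print(flat_steps)       # 9 steps if each trace were stored flat
print(trie.node_count)  # 4 nodes with sub-sequence sharing
```

The compression grows with the amount of overlap between traces, which is why the paper's analysis of sub-sequence length matters: longer shared sub-sequences mean fewer duplicated nodes.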
© 2011 Springer-Verlag London Limited
Dahl, T.S. (2011). Hierarchical Traces for Reduced NSM Memory Requirements. In: Bramer, M., Petridis, M., Hopgood, A. (eds) Research and Development in Intelligent Systems XXVII. SGAI 2010. Springer, London. https://doi.org/10.1007/978-0-85729-130-1_12
Print ISBN: 978-0-85729-129-5
Online ISBN: 978-0-85729-130-1