
Hierarchical Traces for Reduced NSM Memory Requirements

  • Conference paper
Research and Development in Intelligent Systems XXVII (SGAI 2010)

Abstract

This paper presents work on using hierarchical long-term memory to reduce the memory requirements of nearest sequence memory (NSM) learning, a previously published, instance-based reinforcement learning algorithm. A hierarchical memory representation reduces the memory requirements by allowing traces to share common sub-sequences. We present modified mechanisms for estimating discounted future rewards and for dealing with hidden state using hierarchical memory. We also present an experimental analysis of how the sub-sequence length affects the memory compression achieved and show that the reduced memory requirements do not affect the speed of learning. Finally, we analyse and discuss the persistence of the sub-sequences independent of specific trace instances.
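The abstract's central idea, letting traces share common sub-sequences, can be illustrated with a short sketch. The Python fragment below is a minimal illustration under stated assumptions, not the paper's implementation: it assumes fixed-length sub-sequences (the `sub_len` parameter), and the names `Step`, `HierarchicalTraceMemory`, and `compression_ratio` are hypothetical. It shows how storing each trace as a list of IDs into a shared sub-sequence pool lets overlapping histories reuse memory.

```python
# Minimal, hypothetical sketch of hierarchical trace storage (not the paper's code).
# Traces are split into fixed-length sub-sequences; identical sub-sequences are
# stored once in a shared pool, so traces with common histories reuse memory.

from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Step:
    observation: str  # observation label at this time step
    action: str       # action taken
    reward: float     # immediate reward received


class HierarchicalTraceMemory:
    def __init__(self, sub_len: int = 4):
        self.sub_len = sub_len                        # assumed fixed sub-sequence length
        self.pool: Dict[Tuple[Step, ...], int] = {}   # sub-sequence -> block ID
        self.blocks: List[Tuple[Step, ...]] = []      # block ID -> sub-sequence
        self.traces: List[List[int]] = []             # each trace is a list of block IDs

    def _block_id(self, block: Tuple[Step, ...]) -> int:
        # Reuse an existing block when possible; otherwise register a new one.
        if block not in self.pool:
            self.pool[block] = len(self.blocks)
            self.blocks.append(block)
        return self.pool[block]

    def add_trace(self, steps: List[Step]) -> None:
        # Split the flat trace into blocks and keep only the block IDs.
        ids = [self._block_id(tuple(steps[i:i + self.sub_len]))
               for i in range(0, len(steps), self.sub_len)]
        self.traces.append(ids)

    def expand(self, index: int) -> List[Step]:
        # Reconstruct the original flat trace from its shared blocks.
        return [s for bid in self.traces[index] for s in self.blocks[bid]]

    def compression_ratio(self) -> float:
        # Steps actually stored (shared pool) vs. steps a flat memory would store.
        flat = sum(len(self.blocks[bid]) for t in self.traces for bid in t)
        stored = sum(len(b) for b in self.blocks)
        return stored / flat if flat else 1.0
```

In this toy representation, two traces that begin with the same observations, actions and rewards map to identical leading block IDs and are stored only once; the trade-off between the chosen sub-sequence length and the compression achieved is the effect the paper analyses experimentally.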



Author information


Corresponding author

Correspondence to Torbjørn S. Dahl.



Copyright information

© 2011 Springer-Verlag London Limited

About this paper

Cite this paper

Dahl, T.S. (2011). Hierarchical Traces for Reduced NSM Memory Requirements. In: Bramer, M., Petridis, M., Hopgood, A. (eds) Research and Development in Intelligent Systems XXVII. SGAI 2010. Springer, London. https://doi.org/10.1007/978-0-85729-130-1_12


  • DOI: https://doi.org/10.1007/978-0-85729-130-1_12


  • Publisher Name: Springer, London

  • Print ISBN: 978-0-85729-129-5

  • Online ISBN: 978-0-85729-130-1

  • eBook Packages: Computer Science, Computer Science (R0)
