
Risk Sensitive Reinforcement Learning Scheme Is Suitable for Learning on a Budget

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9949)

Abstract

Risk-sensitive reinforcement learning (risk-sensitive RL) has been studied by many researchers. These methods are based on prospect theory, which models the value function of a human. Although they are mainly intended to imitate human behavior, the engineering significance of risk sensitivity has received little discussion. In this paper, we show that risk-sensitive RL is useful for online learning machines whose resources are limited. In such learners, part of the stored memory must be removed to create space for recording a new, important instance. The experimental results show that risk-sensitive RL is superior to normal RL in this setting. This may suggest that, because the human brain is likewise built from a limited number of neurons, humans employ a risk-sensitive value function for learning.
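The two ideas in the abstract can be made concrete with a short sketch. The first function below applies a prospect-theory-like asymmetry to a tabular Q-learning update by weighting negative TD errors more heavily than positive ones; the class after it is a fixed-capacity memory that evicts its least useful stored instance to make room for a more important one. The weights, scores, and names here are illustrative assumptions, not the paper's exact formulation.

    import numpy as np

    # Sketch of the two ideas above, under stated assumptions:
    # (1) risk-sensitive Q-learning weights negative TD errors (losses) more
    #     heavily than positive ones (gains), a prospect-theory-like asymmetry;
    # (2) a budget-limited learner evicts its least useful stored instance
    #     when memory is full, to make room for a more important one.

    ALPHA, GAMMA = 0.1, 0.95          # learning rate and discount factor
    KAPPA_POS, KAPPA_NEG = 0.5, 1.5   # illustrative asymmetric TD-error weights

    def risk_sensitive_update(Q, s, a, r, s_next):
        """One tabular Q-learning step with an asymmetrically scaled TD error."""
        delta = r + GAMMA * np.max(Q[s_next]) - Q[s, a]
        kappa = KAPPA_POS if delta >= 0.0 else KAPPA_NEG  # losses loom larger
        Q[s, a] += ALPHA * kappa * delta

    class BudgetMemory:
        """Fixed-capacity instance store: evict the lowest-scored entry when full."""
        def __init__(self, capacity):
            self.capacity = capacity
            self.items = []  # (importance score, instance) pairs

        def insert(self, score, instance):
            if len(self.items) >= self.capacity:
                worst = min(range(len(self.items)), key=lambda i: self.items[i][0])
                if self.items[worst][0] >= score:
                    return               # new instance is no more important; skip
                del self.items[worst]    # remove a learned memory to create space
            self.items.append((score, instance))

For instance, with Q = np.zeros((n_states, n_actions)) and BudgetMemory(100), the learner's value estimates become loss-averse while its instance store never grows past 100 entries.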



Acknowledgement

This research has been supported by Grant-in-Aid for Scientific Research (C) 12008012.

Author information

Correspondence to Kazuyoshi Kato.



Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Kato, K., Yamauchi, K. (2016). Risk Sensitive Reinforcement Learning Scheme Is Suitable for Learning on a Budget. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol. 9949. Springer, Cham. https://doi.org/10.1007/978-3-319-46675-0_23


  • DOI: https://doi.org/10.1007/978-3-319-46675-0_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46674-3

  • Online ISBN: 978-3-319-46675-0

  • eBook Packages: Computer Science, Computer Science (R0)
