
Heuristically Accelerated Reinforcement Learning by Means of Case-Based Reasoning and Transfer Learning

Journal of Intelligent & Robotic Systems

Abstract

Reinforcement Learning (RL) is a well-known technique for learning solutions to control problems from an agent's interactions with its domain. However, RL is known to be inefficient in real-world problems, where the state space and the set of actions grow quickly. Recently, heuristics, case-based reasoning (CBR) and transfer learning have been used as tools to accelerate the RL process. This paper investigates a class of algorithms, called Transfer Learning Heuristically Accelerated Reinforcement Learning (TLHARL), that uses CBR as heuristics within a transfer learning setting to accelerate RL. The main contributions of this work are the proposal of a new TLHARL algorithm based on the traditional RL algorithm Q(λ) and the application of TLHARL to two distinct real-robot domains: robot soccer with small-scale robots and humanoid-robot stability learning. Experimental results show that our proposed method leads to a significant improvement of the learning rate in both domains.
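To make the mechanism concrete, the sketch below (an illustration, not the authors' code) combines the two ingredients the abstract names: an action-selection rule that biases a tabular Q-function with a heuristic H(s, a), assumed here to be filled in from cases retrieved from a previously solved source task, and a Watkins-style Q(λ) update with eligibility traces. All names (ACTIONS, xi, select_action, q_lambda_update) and parameter values are illustrative assumptions; the case retrieval and adaptation steps of TLHARL are omitted.

```python
import random
from collections import defaultdict

# Illustrative HARL-style agent: the heuristic H (assumed to come from
# cases retrieved in a source task) only biases action selection; the
# Q(lambda) update itself is unchanged, so the heuristic can speed up
# learning without altering what is ultimately learned.

ACTIONS = [0, 1, 2, 3]        # hypothetical discrete action set
alpha, gamma, lam = 0.1, 0.99, 0.9
epsilon, xi = 0.1, 1.0        # exploration rate, heuristic weight

Q = defaultdict(float)        # tabular action values, keyed by (state, action)
H = defaultdict(float)        # heuristic bonus derived from retrieved cases
E = defaultdict(float)        # eligibility traces

def select_action(state):
    """Epsilon-greedy over the heuristically biased value Q + xi * H."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)] + xi * H[(state, a)])

def q_lambda_update(s, a, r, s_next, a_next):
    """One backup of Watkins's Q(lambda); traces are cut whenever the
    action actually taken next (a_next) is exploratory (non-greedy)."""
    a_star = max(ACTIONS, key=lambda b: Q[(s_next, b)])
    delta = r + gamma * Q[(s_next, a_star)] - Q[(s, a)]
    E[(s, a)] += 1.0                                  # accumulating trace
    for key in list(E):
        Q[key] += alpha * delta * E[key]
        E[key] = gamma * lam * E[key] if a_next == a_star else 0.0
```

Because H enters only the policy and not the value update, this style of acceleration preserves the usual convergence behavior of the underlying Q-learning rule, which is the key design choice behind heuristically accelerated RL methods.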



Acknowledgements

Reinaldo Bianchi acknowledges support from FAPESP (2016/21047-3), Paulo E. Santos acknowledges support from FAPESP-IBM (2016/18792-9) and CNPq (307093/2014-0), Isaac J. da Silva acknowledges support from CAPES, and Ramon Lopez de Mantaras acknowledges support from Generalitat de Catalunya Research Grant 2014 SGR 118 and CSIC Project 201550E022.

Author information

Correspondence to Reinaldo A. C. Bianchi.


About this article


Cite this article

Bianchi, R.A.C., Santos, P.E., da Silva, I.J. et al. Heuristically Accelerated Reinforcement Learning by Means of Case-Based Reasoning and Transfer Learning. J Intell Robot Syst 91, 301–312 (2018). https://doi.org/10.1007/s10846-017-0731-2
