Learning to Transform Service Instructions into Actions with Reinforcement Learning and Knowledge Base

  • Research Article
  • Published in International Journal of Automation and Computing

Abstract

In order to improve the learning ability of robots, we present a reinforcement learning approach with a knowledge base for mapping natural language instructions to executable action sequences. A simulated platform with a physics engine is built as the interactive environment. Based on the knowledge base, a reward function combining immediate rewards and delayed rewards is designed to handle the sparse reward problem. In addition, a list of object states retrieved from the knowledge base serves as the standard for judging the quality of generated action sequences. Experimental results demonstrate that our approach achieves high accuracy in producing action sequences.
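
To make the reward design concrete, below is a minimal sketch of how immediate and delayed rewards might be combined with a knowledge-base lookup of goal object states. The KnowledgeBase class, the reward constants, and the state dictionaries are all hypothetical illustrations under assumed values, not the paper's actual implementation.

```python
# Minimal sketch of a KB-driven shaped reward. All names and constants
# here are hypothetical illustrations, not the paper's implementation.
from typing import Dict

IMMEDIATE_REWARD = 0.1   # small per-step reward for progress (assumed value)
DELAYED_REWARD = 1.0     # large terminal reward (assumed value)
STEP_PENALTY = -0.02     # discourages needlessly long action sequences


class KnowledgeBase:
    """Hypothetical store mapping an instruction to its goal object states."""

    def __init__(self, facts: Dict[str, Dict[str, str]]):
        # e.g. {"boil water": {"kettle": "on", "water": "boiling"}}
        self.facts = facts

    def goal_states(self, instruction: str) -> Dict[str, str]:
        return self.facts.get(instruction, {})


def reward(kb: KnowledgeBase, instruction: str,
           prev_states: Dict[str, str], cur_states: Dict[str, str],
           done: bool) -> float:
    """Immediate reward whenever an object newly reaches its goal state;
    delayed reward only when every goal state holds at episode end."""
    goal = kb.goal_states(instruction)
    r = STEP_PENALTY
    for obj, state in goal.items():
        if cur_states.get(obj) == state and prev_states.get(obj) != state:
            r += IMMEDIATE_REWARD  # immediate reward: measurable progress
    if done and all(cur_states.get(o) == s for o, s in goal.items()):
        r += DELAYED_REWARD        # delayed reward: full goal satisfied
    return r


# Example: turning the kettle on earns an immediate reward mid-episode.
kb = KnowledgeBase({"boil water": {"kettle": "on", "water": "boiling"}})
print(reward(kb, "boil water", {"kettle": "off"}, {"kettle": "on"}, done=False))
```

Because the agent receives feedback after every action that moves an object toward a knowledge-base goal state, rather than only at the end of an episode, a shaped signal of this kind is a standard remedy for the sparse reward problem the abstract refers to.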


Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61773239) and the Shenzhen Future Industry Special Fund (No. JCYJ20160331174814755).

Author information

Corresponding author

Correspondence to Guo-Hui Tian.

Additional information

Recommended by Guest Editor Jun-Zhi Yu

Meng-Yang Zhang received the B.Sc. and M.Sc. degrees in automation from Qingdao University of Technology, China in 2012 and 2014, respectively. He is currently a Ph.D. candidate in control theory and control engineering at Shandong University, China.

His research interests include intelligent space technology and service robots, reinforcement learning, and ontology-based knowledge construction.

Guo-Hui Tian received the B.Sc. degree from the Department of Mathematics, Shandong University, China in 1990, the M.Sc. degree in automation from the Department of Automation, Shandong University of Technology, China in 1993, and the Ph.D. degree in automatic control theory and application from the School of Automation, Northeastern University, China in 1997. He was a post-doctoral researcher in the School of Mechanical Engineering, Shandong University from 1999 to 2001, and a visiting professor in the Graduate School of Engineering, University of Tokyo, Japan from 2003 to 2005. At Shandong University, he was a lecturer from 1997 to 1998 and an associate professor from 1998 to 2002, and he is currently a professor in the School of Control Science and Engineering. He is also the vice director of the Intelligent Robot Specialized Committee of the Chinese Association for Artificial Intelligence, the vice director of the Intelligent Manufacturing System Specialized Committee of the Chinese Association of Automation, and a member of the IEEE Robotics and Automation Society.

His research interests include service robots, intelligent space, cloud robotics and brain-inspired intelligent robotics.

Ci-Ci Li received the B.Sc. degree in automation from Northeastern University, China in 2014. She is currently a Ph.D. candidate in control science and engineering at Shandong University, China.

Her research interests include home service robots and object cognition.

Jing Gong received the B.Sc. degree in automation from Zhengzhou University, China in 2015. He is currently a master student in control science and engineering at Shandong University, China.

His research interests include home service robots, natural language processing and cloud robot systems.


About this article


Cite this article

Zhang, MY., Tian, GH., Li, CC. et al. Learning to Transform Service Instructions into Actions with Reinforcement Learning and Knowledge Base. Int. J. Autom. Comput. 15, 582–592 (2018). https://doi.org/10.1007/s11633-018-1128-9
