
GRU: optimization of NPI performance

The Journal of Supercomputing

Abstract

Artificial intelligence is currently applied to automatic programming by producing snippets of code. The neural programmer-interpreter (NPI) is the most widely used machine-learning approach to automatic programming. This paper aims to improve the performance of the traditional NPI and to accelerate its training without loss of precision. To this end, we changed the core structure of the NPI by replacing its LSTM (long short-term memory) with a GRU (gated recurrent unit). The GRU uses gating units that regulate the flow of information within the hidden unit without a separate memory cell. Numerical results demonstrate the performance of the proposed approach: the GRU-based NPI improves the training performance of the original LSTM-based NPI by nearly 33% while maintaining equal accuracy.
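To make the core change concrete, the sketch below illustrates, in PyTorch-style code grounded only in the abstract, how an NPI-style recurrent core could be switched from an LSTM to a GRU. It is a minimal sketch, not the authors' implementation; the module name NPICore, the dimensions, and the output heads are illustrative assumptions.

    # Minimal sketch (not the authors' code): an NPI-style core where the recurrent
    # module can be switched between LSTM and GRU. NPICore, state_dim, prog_dim and
    # the output heads are hypothetical names, not taken from the paper.
    import torch
    import torch.nn as nn

    class NPICore(nn.Module):
        def __init__(self, state_dim=128, prog_dim=32, hidden_dim=256, cell="gru"):
            super().__init__()
            in_dim = state_dim + prog_dim  # fused environment state + program embedding
            # The paper's modification amounts to swapping the recurrent core:
            if cell == "gru":
                self.rnn = nn.GRU(in_dim, hidden_dim, num_layers=2, batch_first=True)
            else:
                self.rnn = nn.LSTM(in_dim, hidden_dim, num_layers=2, batch_first=True)
            self.end_head = nn.Linear(hidden_dim, 1)          # probability of terminating
            self.prog_head = nn.Linear(hidden_dim, prog_dim)  # embedding of the next program

        def forward(self, state_prog_seq, hidden=None):
            out, hidden = self.rnn(state_prog_seq, hidden)
            return torch.sigmoid(self.end_head(out)), self.prog_head(out), hidden

    # Usage: a batch of 4 sequences, 10 time steps each.
    core = NPICore(cell="gru")
    x = torch.randn(4, 10, 128 + 32)
    end_prob, next_prog, h = core(x)
    print(end_prob.shape, next_prog.shape)  # torch.Size([4, 10, 1]) torch.Size([4, 10, 32])

Because the GRU keeps no separate memory cell, the swap reduces the number of gates and parameters in the core, which is the source of the training speedup reported in the abstract.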


Abbreviations

NPI: Neural programmer-interpreter

LSTM: Long short-term memory

GRU: Gated recurrent unit

NTM: Neural Turing Machine


Acknowledgements

The research presented in this paper was supported by the Ministry of Science and Technology of the People's Republic of China and the National Natural Science Foundation of China.

Availability of data and materials

The simulation code can be obtained by contacting the corresponding author by email.

Funding

This research was partially supported by the National Key Research and Development Plan of China under Grant Nos. 2016YFB1100501, 2017YFB1103603, and 2017YFB-1103000; the National Natural Science Foundation of China under Grant Nos. 61772365, 41772123, 61602343, 51607122, 51575158, and 51378350; and Tianjin Province Science and Technology Projects under Grant Nos. 16JCYBJC18400, 16ZLZDZF-00150, 17ZLZXZF00310, 17JCQNJC04500, and 17JCYBJC15100.

Author information


Contributions

Wei Liu is the main author of this paper. She proposed the main idea, completed the experiments, and analyzed the results. The other authors gave important suggestions for the experiments. All the authors read and approved the final manuscript.

Corresponding author

Correspondence to Hanning Chen.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Appendix 1: NPI flowchart based on GRU


About this article


Cite this article

Liu, W., Wang, Q., Zhu, Y. et al. GRU: optimization of NPI performance. J Supercomput 76, 3542–3554 (2020). https://doi.org/10.1007/s11227-018-2634-9


