Generative API usage code recommendation with parameter concretization

Chen, Chi; Peng, Xin; Sun, Jun; Xing, Zhenchang; Wang, Xin; Zhao, Yifan; Zhang, Hairui; Zhao, Wenyun

doi:10.1007/s11432-018-9821-9

Research Paper
Published: 30 July 2019

Science China Information Sciences, Volume 62, article number 192103 (2019)

Chi Chen^1,2,3, Xin Peng^1,2,3, Jun Sun^4, Zhenchang Xing^5, Xin Wang^1,2,3, Yifan Zhao^1,2,3, Hairui Zhang^1,2,3 & Wenyun Zhao^1,2,3

Abstract

Many programming languages and development frameworks have extensive libraries (e.g., JDK and Android libraries) that ease the task of software engineering if used effectively. With numerous library classes and sometimes intricate API (application programming interface) usage constraints, programmers often have difficulty remembering the library APIs and/or using them correctly. This study addresses this problem by developing an engine called DeepAPIRec, which automatically recommends the API usage code. Compared to the existing proposals, our approach distinguishes itself in two ways. First, it is based on a tree-based long short-term memory (LSTM) neural network inspired by recent developments in the machine-learning community. A tree-based LSTM neural network allows us to model and reason about variable-length, preceding and succeeding code contexts, and to make precise predictions. Second, we apply data-flow analysis to generate concrete parameters for the API usage code, which not only allows us to generate complete code recommendations but also improves the accuracy of the learning results according to the tree-based LSTM neural network. Our approach has been implemented for supporting Java programs. Our experimental studies on the JDK library show that at statement-level recommendations, DeepAPIRec can achieve a top-1 accuracy of about 37% and a top-5 accuracy of about 64%, which are significantly better than the existing approaches. Our user study further confirms that DeepAPIRec can help developers to complete a segment of code faster and more accurately as compared to IntelliJ IDEA.
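
The top-1 and top-5 figures quoted above are top-k accuracies: the fraction of recommendation queries for which the expected statement appears among the first k ranked candidates. The following is a minimal sketch of that metric only, not the authors' evaluation code; the class name `TopKAccuracy`, the method signature, and the sample statements are hypothetical and chosen purely for illustration.

```java
import java.util.List;

// Illustrative only: the names and sample data here are assumptions, not taken from the paper.
final class TopKAccuracy {

    // Returns the fraction of queries whose expected answer is among the first k ranked candidates.
    static double topKAccuracy(List<List<String>> rankedCandidates, List<String> expected, int k) {
        if (rankedCandidates.size() != expected.size()) {
            throw new IllegalArgumentException("one expected answer per query is required");
        }
        if (rankedCandidates.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (int i = 0; i < expected.size(); i++) {
            List<String> ranked = rankedCandidates.get(i);
            // Only the first k entries count; shorter lists are used in full.
            List<String> topK = ranked.subList(0, Math.min(k, ranked.size()));
            if (topK.contains(expected.get(i))) {
                hits++;
            }
        }
        return (double) hits / expected.size();
    }

    public static void main(String[] args) {
        // Four hypothetical queries, each with a ranked list of recommended statements.
        List<List<String>> ranked = List.of(
            List.of("reader.readLine()", "reader.read()", "reader.close()"),
            List.of("list.add(item)", "list.size()", "list.clear()"),
            List.of("map.get(key)", "map.put(key, value)", "map.containsKey(key)"),
            List.of("sb.toString()", "sb.append(text)", "sb.length()"));
        List<String> expected = List.of(
            "reader.readLine()", "list.size()", "map.containsKey(key)", "file.exists()");

        System.out.println(topKAccuracy(ranked, expected, 1)); // prints 0.25
        System.out.println(topKAccuracy(ranked, expected, 3)); // prints 0.75
    }
}
```

Under this definition, top-k accuracy can only stay the same or increase as k grows, which is consistent with the reported top-5 figure being higher than the top-1 figure.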

Acknowledgements

This work was supported by the National Key Research and Development Program of China (Grant No. 2016YFB1000801) and the Shanghai Science and Technology Development Funds (Grant No. 16JC1400801).

Author information

Authors and Affiliations

School of Computer Science, Fudan University, Shanghai, 201203, China
Chi Chen, Xin Peng, Xin Wang, Yifan Zhao, Hairui Zhang & Wenyun Zhao
Shanghai Key Laboratory of Data Science, Fudan University, Shanghai, 201203, China
Chi Chen, Xin Peng, Xin Wang, Yifan Zhao, Hairui Zhang & Wenyun Zhao
Shanghai Institute of Intelligent Electronics & Systems, Shanghai, 200433, China
Chi Chen, Xin Peng, Xin Wang, Yifan Zhao, Hairui Zhang & Wenyun Zhao
Pillar of Information System Technology and Design, Singapore University of Technology and Design, Singapore, 487372, Singapore
Jun Sun
Research School of Computer Science, Australian National University, Acton, ACT, 2601, Australia
Zhenchang Xing

Corresponding author

Correspondence to Xin Peng.

About this article

Cite this article

Chen, C., Peng, X., Sun, J. et al. Generative API usage code recommendation with parameter concretization. Sci. China Inf. Sci. 62, 192103 (2019). https://doi.org/10.1007/s11432-018-9821-9

Received: 10 July 2018
Revised: 20 October 2018
Accepted: 27 February 2019
Published: 30 July 2019
DOI: https://doi.org/10.1007/s11432-018-9821-9
