DOI: 10.1145/3386164.3389083
research-article

Deep Learning Based Code Completion Models for Programming Codes

Published: 06 June 2020

ABSTRACT

With the rapid development of information technology, software and mobile applications are used worldwide and play important roles in daily life. Writing program code has therefore become important work in many fields. However, it is a difficult and time-consuming task that places a heavy workload on programmers. To make programmers' work easier, intelligent code completion models have become a popular research topic in recent years. This paper designs deep learning based models that automatically complete program code. The models are LSTM-based neural networks combined with several techniques, such as word embedding models from NLP (Natural Language Processing) and the Multihead Attention mechanism. Moreover, this paper proposes a new algorithm, named the RZT (Reverse Zig-zag Traverse) algorithm, that generates from a partial AST (Abstract Syntax Tree) the input sequences most relevant to the nodes to be predicted, and it is the first work to apply a Multihead Attention block to this task. This paper examines code in several different programming languages, and the models it presents show good accuracy compared with state-of-the-art models.
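To make the task setup concrete, the following is a minimal sketch of how a partial AST can be turned into an input sequence and a prediction target. It uses plain depth-first pre-order traversal over Python's standard `ast` module. It is not the RZT algorithm, whose details the abstract does not give, and the helper names `flatten_ast` and `make_example` are hypothetical.

```python
import ast
from typing import List, Tuple


def flatten_ast(source: str) -> List[str]:
    """Return the AST node type names of `source` in depth-first pre-order.

    Assumption: plain pre-order traversal stands in for the paper's RZT
    traversal, which is not specified in the abstract.
    """
    tree = ast.parse(source)
    sequence: List[str] = []

    def visit(node: ast.AST) -> None:
        sequence.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(tree)
    return sequence


def make_example(sequence: List[str], cut: int) -> Tuple[List[str], str]:
    """Split a flattened AST into (context, next-node target) at index `cut`."""
    if not 0 < cut < len(sequence):
        raise ValueError("cut must leave a non-empty context and a target")
    return sequence[:cut], sequence[cut]


# The first `cut` nodes act as the partial AST; the next node is the label
# a completion model would be trained to predict.
nodes = flatten_ast("x = 1")
context, target = make_example(nodes, 3)
```

In a full model, each node name in `context` would be mapped to an embedding vector and fed to the sequence network, with `target` as the training label.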


    • Published in

      ACM Other conferences
      ISCSIC 2019: Proceedings of the 2019 3rd International Symposium on Computer Science and Intelligent Control
      September 2019
      397 pages
      ISBN: 9781450376617
      DOI: 10.1145/3386164

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      ISCSIC 2019 Paper Acceptance Rate: 77 of 152 submissions, 51%
      Overall Acceptance Rate: 192 of 401 submissions, 48%
