ABSTRACT
Human-written programs exhibit useful local regularities, so the ability to adapt to unseen, local context is an important challenge that successful models of source code must overcome. Current source code models, however, mostly learn common code patterns from large-scale open-source codebases and can exploit neither this localness nor individual developers' preferences. Consequently, learning quickly from a developer's limited code and adapting to unseen code patterns can offer new insights into code completion. In this work, we train a base code model designed to learn semantic and structural information from context, improving predictions of unseen local tokens, and we propose an adaptive code model that leverages meta-learning techniques. Experiments on a large-scale Java GitHub corpus show substantially improved performance over the baselines.
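The adaptive model described above builds on meta-learning in the style of MAML (Finn et al., 2017): a meta-learned initialization is fine-tuned with a few gradient steps on a developer's local code before predicting. As a minimal, purely illustrative sketch (not the paper's actual model), the inner-loop adaptation step can be shown on a toy categorical next-token distribution; the vocabulary, learning rate, and "support" counts here are invented for the example.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def adapt(theta, support_counts, lr=0.5, steps=5):
    """MAML-style inner loop: a few gradient steps on the developer's
    local 'support' tokens, starting from the meta-learned init theta."""
    theta = theta.copy()
    target = support_counts / support_counts.sum()
    for _ in range(steps):
        p = softmax(theta)
        grad = p - target          # gradient of the NLL of a categorical model
        theta -= lr * grad
    return theta

# Toy setup: a 4-token vocabulary; the meta-learned init is uniform.
theta0 = np.zeros(4)
# The developer's local file uses token 2 heavily
# (e.g. a project-specific identifier unseen during pre-training).
support = np.array([1.0, 0.0, 8.0, 1.0])

theta_adapted = adapt(theta0, support)
p_before = softmax(theta0)[2]          # 0.25 under the uniform init
p_after = softmax(theta_adapted)[2]    # noticeably higher after adaptation
```

In full MAML the outer loop would then back-propagate through these inner steps to improve the initialization `theta0` itself, so that a handful of local tokens is enough to shift predictions toward the developer's conventions.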
Index Terms
- Adaptive Code Completion with Meta-learning