DOI: 10.1145/3568199.3568228

A Practical Three-phase Approach To Fully Automated Programming Using System Decomposition And Coding Copilots

Published: 06 March 2023

ABSTRACT

Very large-scale (VLS) deep learning models can generate meaningful code snippets, yet their performance drops dramatically as the coding task grows more complex. Although fully neural approaches have been proposed to address this problem, their practical value remains limited. In this work, we propose a neuro-symbolic approach that integrates the symbolic nature of programming with existing neural language models. We divide a programming task into three phases: forming a hierarchical decomposition of the task into functions, completing each function, and handling the corner cases. Because each phase can be completed by language models, the coding process can be fully automated. Our contribution is three-fold. First, we show that, with little help from humans, VLS language models are capable of completing non-trivial programming tasks. Second, we provide a number of empirical insights for creating prompt templates that help the language models generate better code. Third, compared to existing approaches, ours is far more practical for programmers and researchers to follow. The programming project generated with our fully automated programming approach, along with part of the ablation-study code, is available at https://github.com/BiEchi/FAP.
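The three-phase pipeline described in the abstract can be sketched as plain orchestration code. Everything here is illustrative: the `generate` callable stands in for any coding copilot or VLS language model API, and the prompt wording and function names are assumptions, not the paper's actual prompt templates.

```python
from typing import Callable, Dict, List

# Hypothetical model interface: maps a prompt string to a completion string.
Generator = Callable[[str], str]

def decompose(task: str, generate: Generator) -> List[str]:
    """Phase 1: ask the model to break the task into a hierarchy of
    function signatures, one per line."""
    reply = generate("Decompose this task into function signatures:\n" + task)
    return [line.strip() for line in reply.splitlines() if line.strip()]

def complete(signatures: List[str], generate: Generator) -> Dict[str, str]:
    """Phase 2: ask the model to fill in each function body."""
    return {sig: generate("Implement this function:\n" + sig)
            for sig in signatures}

def harden(bodies: Dict[str, str], generate: Generator) -> Dict[str, str]:
    """Phase 3: ask the model to revise each body to cover corner cases."""
    return {sig: generate("Add corner-case handling to:\n" + body)
            for sig, body in bodies.items()}

def fully_automated_programming(task: str, generate: Generator) -> Dict[str, str]:
    """Chain the three phases with no human edits in between."""
    return harden(complete(decompose(task, generate), generate), generate)
```

With a stub generator substituted for a real model, the chain runs end to end: each signature produced in phase 1 becomes a key, and phases 2 and 3 rewrite its body in turn.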



          • Published in

            MLMI '22: Proceedings of the 2022 5th International Conference on Machine Learning and Machine Intelligence
            September 2022
            215 pages
            ISBN: 9781450397551
            DOI: 10.1145/3568199

            Copyright © 2022 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Qualifiers

            • research-article
            • Research
            • Refereed limited
