ABSTRACT
Very large-scale (VLS) deep learning models are capable of generating meaningful code snippets, yet their performance drops dramatically as the coding task becomes more complex. Although fully neural approaches have been proposed to address this problem, their practical value remains limited. In this work, we propose a neuro-symbolic approach that integrates the symbolic nature of programming with existing neural language models. We divide a programming task into three phases: decomposing the task into a hierarchy of functions, completing each function, and handling corner cases. Because each phase can be carried out by language models, the coding process can be fully automated. Our contribution is three-fold. First, we show that with little human help, VLS language models are capable of completing non-trivial programming tasks. Second, we provide a number of empirical insights for creating prompt templates that help language models generate better code. Third, compared with existing approaches, ours is far more practical for programmers and researchers to follow. The programming project generated by our fully automated approach and part of the ablation-study code are available at https://github.com/BiEchi/FAP.
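The three-phase pipeline described above can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' implementation: `generate` stands in for a call to a large language model (the actual model, prompt templates, and parsing are assumptions), and is stubbed here so the control flow is self-contained.

```python
def generate(prompt: str) -> str:
    """Stub for an LLM completion call (a real system would query a code model)."""
    return f"# completion for: {prompt[:40]}"

def decompose(task: str) -> list[str]:
    """Phase 1: prompt the model to break the task into a hierarchy of functions."""
    plan = generate(f"Break this task into function signatures:\n{task}")
    return [line for line in plan.splitlines() if line.strip()]

def implement(signature: str) -> str:
    """Phase 2: prompt the model to complete one function body."""
    return generate(f"Implement this function:\n{signature}")

def harden(body: str) -> str:
    """Phase 3: prompt the model to handle corner cases in the completed code."""
    return generate(f"Add corner-case handling to:\n{body}")

def automate(task: str) -> list[str]:
    """Chain the three phases so the whole coding process runs without a human."""
    return [harden(implement(sig)) for sig in decompose(task)]
```

Each phase is a separate prompt, so the quality of the prompt template at each stage (the paper's second contribution) directly shapes the final output.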
Index Terms
- A Practical Three-phase Approach To Fully Automated Programming Using System Decomposition And Coding Copilots