Abstract
Large language models (LLMs) have been successfully applied to software engineering tasks, including program repair. However, their application in search-based techniques such as Genetic Improvement (GI) is still largely unexplored. In this paper, we evaluate the use of LLMs as mutation operators for GI to improve the search process. We expand the Gin Java GI toolkit to call OpenAI’s API to generate edits for the JCodec tool. We randomly sample the space of edits using five different edit types. We find that the number of patches passing unit tests is up to \(75\%\) higher with LLM-based edits than with standard Insert edits. Further, we observe that the patches found with LLMs are generally less diverse than those found with standard edits. We also ran GI with local search to find runtime improvements. Although many improving patches are found by LLM-enhanced GI, the best improving patch was found by standard GI.
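To make the idea of a mutation operator inside a local search concrete, the following is a minimal sketch in Python rather than Gin's Java. It is not the paper's implementation: the names `local_search`, `standard_edit`, `llm_edit_stub` and `toy_evaluate` are hypothetical, the 50/50 choice between removing and adding an edit is an assumption, the edit pools are arbitrary and not modelled on the reported results, and no LLM or API is called.

```python
import random
from typing import Callable, List, Optional, Tuple

Patch = List[str]                          # a patch is a list of edits (opaque strings here)
Operator = Callable[[random.Random], str]  # a mutation operator proposes one new edit


def local_search(
    operators: List[Operator],
    evaluate: Callable[[Patch], Optional[float]],
    steps: int,
    rng: random.Random,
) -> Tuple[Patch, float]:
    """Hill-climb over patches, keeping a neighbour only if it is strictly faster.

    `evaluate` returns a runtime (lower is better), or None when the patched
    program would fail to compile or fail its unit tests.
    """
    best: Patch = []
    best_time = evaluate(best)
    if best_time is None:
        raise ValueError("the unpatched program must pass its tests")
    for _ in range(steps):
        candidate = list(best)
        if candidate and rng.random() < 0.5:
            candidate.pop(rng.randrange(len(candidate)))  # remove a random edit
        else:
            candidate.append(rng.choice(operators)(rng))  # add an edit from some operator
        cand_time = evaluate(candidate)
        if cand_time is not None and cand_time < best_time:
            best, best_time = candidate, cand_time
    return best, best_time


def standard_edit(rng: random.Random) -> str:
    """Hypothetical stand-in for a classic statement-level edit."""
    return rng.choice(["break", "break", "noop", "speedup:1"])


def llm_edit_stub(rng: random.Random) -> str:
    """Hypothetical stand-in for an LLM-suggested edit; makes no API call."""
    return rng.choice(["break", "noop", "speedup:1", "speedup:2"])


def toy_evaluate(patch: Patch) -> Optional[float]:
    """Toy fitness: any 'break' edit fails the tests; speedups reduce a 100-unit runtime."""
    if "break" in patch:
        return None
    saved = sum(float(e.split(":")[1]) for e in patch if e.startswith("speedup:"))
    return 100.0 - saved


rng = random.Random(0)
patch, runtime = local_search([standard_edit, llm_edit_stub], toy_evaluate, steps=30, rng=rng)
print(patch, runtime)
```

Because operators are plain callables, an LLM-backed operator and a standard one can be mixed in the same search or run separately for comparison, which is the kind of setup the abstract describes.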
Data Availability Statement
The code, LLM prompts, experimental infrastructure, evaluation data, and results are available as open source at [1]. The code is also available on the ‘llm’ branch of github.com/gintool/gin (commit 9fe9bdf; branched from master commit 2359f57 pending full integration with Gin).
References
1. Artifact of Enhancing Genetic Improvement Mutations Using Large Language Models. Zenodo (2023). https://doi.org/10.5281/zenodo.8304433
2. Böhme, M., Soremekun, E.O., Chattopadhyay, S., Ugherughe, E., Zeller, A.: Where is the bug and how is it fixed? an experiment with practitioners. In: Proceedings of ACM Symposium on the Foundations of Software Engineering, pp. 117–128 (2017)
3. Brownlee, A.E., Petke, J., Alexander, B., Barr, E.T., Wagner, M., White, D.R.: Gin: genetic improvement research made easy. In: GECCO, pp. 985–993 (2019)
4. Brownlee, A.E., Petke, J., Rasburn, A.F.: Injecting shortcuts for faster running Java code. In: IEEE CEC 2020, pp. 1–8 (2020)
5. Chen, M., et al.: Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374 (2021)
6. Fan, A., et al.: Large language models for software engineering: survey and open problems (2023)
7. GitHub - jcodec/jcodec: JCodec main repo. https://github.com/jcodec/jcodec
8. Han, S.J., Ransom, K.J., Perfors, A., Kemp, C.: Inductive reasoning in humans and large language models. Cogn. Syst. Res. 83, 101155 (2023)
9. Hou, X., et al.: Large language models for software engineering: a systematic literature review. arXiv:2308.10620 (2023)
10. Kang, S., Yoo, S.: Towards objective-tailored genetic improvement through large language models. arXiv:2304.09386 (2023)
11. Kim, D., Nam, J., Song, J., Kim, S.: Automatic patch generation learned from human-written patches (2013). http://logging.apache.org/log4j/
12. Kirbas, S., et al.: On the introduction of automatic program repair in Bloomberg. IEEE Softw. 38(4), 43–51 (2021)
13. Marginean, A., et al.: SapFix: automated end-to-end repair at scale. In: ICSE-SEIP, pp. 269–278 (2019)
14. Petke, J., Alexander, B., Barr, E.T., Brownlee, A.E., Wagner, M., White, D.R.: Program transformation landscapes for automated program modification using Gin. Empir. Softw. Eng. 28(4), 1–41 (2023)
15. Petke, J., Haraldsson, S.O., Harman, M., Langdon, W.B., White, D.R., Woodward, J.R.: Genetic improvement of software: a comprehensive survey. IEEE Trans. Evol. Comput. 22, 415–432 (2018)
16. Siddiq, M.L., Santos, J., Tanvir, R.H., Ulfat, N., Rifat, F.A., Lopes, V.C.: Exploring the effectiveness of large language models in generating unit tests. arXiv preprint arXiv:2305.00418 (2023)
17. Sobania, D., Briesch, M., Hanna, C., Petke, J.: An analysis of the automatic bug fixing performance of ChatGPT. In: 2023 IEEE/ACM International Workshop on Automated Program Repair (APR), pp. 23–30. IEEE Computer Society (2023)
18. Xia, C.S., Paltenghi, M., Tian, J.L., Pradel, M., Zhang, L.: Universal fuzzing via large language models. arXiv preprint arXiv:2308.04748 (2023)
19. Xia, C.S., Zhang, L.: Keep the conversation going: fixing 162 out of 337 bugs for \$0.42 each using ChatGPT. arXiv preprint arXiv:2304.00385 (2023)
Acknowledgements
This work was supported by the UKRI EPSRC grant no. EP/P023991/1 and the ERC advanced fellowship grant no. 741278.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Brownlee, A.E.I. et al. (2024). Enhancing Genetic Improvement Mutations Using Large Language Models. In: Arcaini, P., Yue, T., Fredericks, E.M. (eds) Search-Based Software Engineering. SSBSE 2023. Lecture Notes in Computer Science, vol 14415. Springer, Cham. https://doi.org/10.1007/978-3-031-48796-5_13
DOI: https://doi.org/10.1007/978-3-031-48796-5_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48795-8
Online ISBN: 978-3-031-48796-5
eBook Packages: Computer Science, Computer Science (R0)