CFStra: Enhancing Configurable Program Analysis Through LLM-Driven Strategy Selection Based on Code Features

Su, Jie; Deng, Liansai; Wen, Cheng; Qin, Shengchao; Tian, Cong

doi:10.1007/978-3-031-64626-3_22

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14777))

Included in the following conference series:

International Symposium on Theoretical Aspects of Software Engineering

588 Accesses

Abstract

Configurable Program Analysis (CPA) allows users to customize program analysis based on their preferences. However, current program verification tools like Cpachecker require manual strategy selection, which can be complex and error-prone. In this paper, we present a novel approach to efficiently perform program verification tasks by harnessing the capabilities of Large Language Models (LLMs) to automatically select verification strategies based on code features and specifications. Specifically, we begin by extracting relevant code snippets and querying LLMs to identify code features. Based on the identified code features, we propose a strategy selector to automatically choose the verification strategy. Finally, we execute the Cpachecker with the selected verification strategy. We evaluated our approach using a diverse set of 600 verification tasks. The results demonstrate the effectiveness of our approach, surpassing basic strategies and SOTA combination strategies while also standing out for its simplicity and ease of understanding.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 89.00; Price excludes VAT (USA)

Softcover Book: USD 119.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

CPAchecker 2.3 with Strategy Selection

Automated Test Case Generation for the CTRL Programming Language Using Pex: Lessons Learned

Evaluating CAVM: A New Search-Based Test Data Generation Tool for C

References

Allen, J.R., Kennedy, K., Porterfield, C., Warren, J.: Conversion of control dependence to data dependence. In: Proceedings of the 10th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, pp. 177–189 (1983)
Google Scholar
Bartocci, E., Falcone, Y., Francalanza, A., Reger, G.: Introduction to runtime verification. In: Lectures on Runtime Verification: Introductory and Advanced Topics, pp. 1–33 (2018)
Google Scholar
Beyer, D., Dangl, M.: Strategy selection for software verification based on boolean features. In: Margaria, T., Steffen, B. (eds.) ISoLA 2018. LNCS, vol. 11245, pp. 144–159. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-03421-4_11
Chapter MATH Google Scholar
Beyer, D., Dangl, M., Wendler, P.: Boosting k-induction with continuously-refined invariants. In: Kroening, D., Păsăreanu, C.S. (eds.) CAV 2015. LNCS, vol. 9206, pp. 622–640. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21690-4_42
Chapter MATH Google Scholar
Beyer, D., Keremoglu, M.E.: CPAchecker: a tool for configurable software verification. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 184–190. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22110-1_16
Chapter MATH Google Scholar
Beyer, D., Keremoglu, M.E., Wendler, P.: Predicate abstraction with adjustable-block encoding. In: Formal Methods in Computer Aided Design, pp. 189–197. IEEE (2010)
Google Scholar
Beyer, D., Löwe, S.: Explicit-state software model checking based on CEGAR and interpolation. In: Cortellessa, V., Varró, D. (eds.) FASE 2013. LNCS, vol. 7793, pp. 146–162. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37057-1_11
Chapter MATH Google Scholar
Beyer, D., Löwe, S., Wendler, P.: Reliable benchmarking: requirements and solutions. Int. J. Softw. Tools Technol. Transfer 21, 1–29 (2019)
Article MATH Google Scholar
Biere, A., Cimatti, A., Clarke, E.M., Strichman, O., Zhu, Y.: Bounded model checking. In: Handbook of Satisfiability, vol. 185, no. 99, pp. 457–481 (2009)
Google Scholar
Chakraborty, S., et al.: Ranking LLM-generated loop invariants for program verification. arXiv preprint arXiv:2310.09342 (2023)
Christ, J., Hoenicke, J., Nutz, A.: SMTInterpol: an interpolating SMT solver. In: Donaldson, A., Parker, D. (eds.) SPIN 2012. LNCS, vol. 7385, pp. 248–254. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31759-0_19
Chapter MATH Google Scholar
Cimatti, A., Griggio, A., Schaafsma, B.J., Sebastiani, R.: The MathSAT5 SMT solver. In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 93–107. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-36742-7_7
Chapter MATH Google Scholar
Clarke, E.M.: Model checking. In: Ramesh, S., Sivakumar, G. (eds.) FSTTCS 1997. LNCS, vol. 1346, pp. 54–56. Springer, Heidelberg (1997). https://doi.org/10.1007/BFb0058022
Chapter MATH Google Scholar
Czech, M., Hüllermeier, E., Jakobs, M.C., Wehrheim, H.: Predicting rankings of software verification tools. In: Proceedings of the 3rd ACM SIGSOFT International Workshop on Software Analytics, pp. 23–26 (2017)
Google Scholar
Demyanova, Y., Pani, T., Veith, H., Zuleger, F.: Empirical software metrics for benchmarking of verification tools. Formal Methods Syst. Des. 50, 289–316 (2017)
Article MATH Google Scholar
Falcone, Y., Havelund, K., Reger, G.: A tutorial on runtime verification. Eng. Dependable Softw. Syst. 141–175 (2013)
Google Scholar
Fan, A., et al.: Large language models for software engineering: survey and open problems. arXiv preprint arXiv:2310.03533 (2023)
Fan, Z., Gao, X., Mirchev, M., Roychoudhury, A., Tan, S.H.: Automated repair of programs from large language models. In: 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023, Melbourne, Australia, 14–20 May 2023, pp. 1469–1481. IEEE (2023)
Google Scholar
Ferrante, J., Ottenstein, K.J., Warren, J.D.: The program dependence graph and its use in optimization. ACM Trans. Program. Lang. Syst. 9(3), 319–349 (1987). https://doi.org/10.1145/24039.24041
Article MATH Google Scholar
Fetzer, J.H.: Program verification: the very idea. Commun. ACM 31(9), 1048–1063 (1988)
Article MATH Google Scholar
First, E., Rabe, M., Ringer, T., Brun, Y.: Baldur: whole-proof generation and repair with large language models. In: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 1229–1241 (2023)
Google Scholar
Hou, X., et al.: Large language models for software engineering: a systematic literature review. arXiv preprint arXiv:2308.10620 (2023)
Kang, S., Yoon, J., Askarbekkyzy, N., Yoo, S.: Evaluating diverse large language models for automatic and general bug reproduction. arXiv preprint arXiv:2311.04532 (2023)
Kang, S., Yoon, J., Yoo, S.: Large language models are few-shot testers: exploring LLM-based general bug reproduction. In: 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023, Melbourne, Australia, 14–20 May 2023, pp. 2312–2323. IEEE (2023)
Google Scholar
Kocoń, J., et al.: Chatgpt: jack of all trades, master of none. Inf. Fusion 101861 (2023)
Google Scholar
Kroening, D., Tautschnig, M.: CBMC – C bounded model checker. In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 389–391. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54862-8_26
Chapter MATH Google Scholar
Lemieux, C., Inala, J.P., Lahiri, S.K., Sen, S.: Codamosa: escaping coverage plateaus in test generation with pre-trained large language models. In: 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023, Melbourne, Australia, 14–20 May 2023, pp. 919–931. IEEE (2023)
Google Scholar
Li, H., Hao, Y., Zhai, Y., Qian, Z.: Assisting static analysis with large language models: a chatgpt experiment. In: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 2107–2111 (2023)
Google Scholar
Li, Y., et al.: Competition-level code generation with alphacode. Science 378(6624), 1092–1097 (2022)
Article MATH Google Scholar
Liu, Y., et al.: Summary of chatgpt-related research and perspective towards the future of large language models. Meta-Radiol. 100017 (2023)
Google Scholar
Löwe, S., Mandrykin, M., Wendler, P.: CPAchecker with sequential combination of explicit-value analyses and predicate analyses. In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 392–394. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54862-8_27
Chapter MATH Google Scholar
Matthews, J., Moore, J.S., Ray, S., Vroon, D.: Verification condition generation via theorem proving. In: Hermann, M., Voronkov, A. (eds.) LPAR 2006. LNCS (LNAI), vol. 4246, pp. 362–376. Springer, Heidelberg (2006). https://doi.org/10.1007/11916277_25
Chapter MATH Google Scholar
OpenAI: GPT-4 technical report. arxiv:2303.08774. View in Article, vol. 2, p. 13 (2023)
Ouimet, M., Lundqvist, K.: Formal software verification: model checking and theorem proving. Embedded Systems Laboratory Technical Report ESL-TIK-00214, Cambridge USA (2007)
Google Scholar
Pei, K., Bieber, D., Shi, K., Sutton, C., Yin, P.: Can large language models reason about program invariants? In: International Conference on Machine Learning, pp. 27496–27520. PMLR (2023)
Google Scholar
Richter, C., Wehrheim, H.: PeSCo: predicting sequential combinations of verifiers. In: Beyer, D., Huisman, M., Kordon, F., Steffen, B. (eds.) TACAS 2019. LNCS, vol. 11429, pp. 229–233. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-17502-3_19
Chapter MATH Google Scholar
Ruben: A code-understanding, code-browsing or code-search tool. This is a tool to index, then query or search C, C++, java, python, ruby, go and javascript source code. https://github.com/ruben2020/codequery. Accessed 04 Mar 2024
Sieber, K.: The Foundations of Program Verification. Springer, Wiesbaden (2013). https://doi.org/10.1007/978-3-322-96753-4
Book MATH Google Scholar
Su, J., Tian, C., Duan, Z.: Conditional interpolation: making concurrent program verification more effective. In: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 144–154 (2021)
Google Scholar
Su, J., Tian, C., Yang, Z., Yang, J., Yu, B., Duan, Z.: Prioritized constraint-aided dynamic partial-order reduction. In: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, pp. 1–13 (2022)
Google Scholar
Su, J., Yang, Z., Xing, H., Yang, J., Tian, C., Duan, Z.: PIChecker: a POR and interpolation based verifier for concurrent programs (competition contribution). In: Sankaranarayanan, S., Sharygina, N. (eds.) TACAS 2023. LNCS, vol. 13994, pp. 571–576. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-30820-8_38
Chapter MATH Google Scholar
Tsigkanos, C., Rani, P., Müller, S., Kehrer, T.: Large language models: the next frontier for variable discovery within metamorphic testing? In: 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 678–682. IEEE (2023)
Google Scholar
Wen, C., et al.: Automatically inspecting thousands of static bug warnings with large language model: How far are we? ACM Trans. Knowl. Discov. Data (2024)
Google Scholar
Wen, C., et al.: Enchanting program specification synthesis by large language models using static analysis and program verification. In: International Conference on Computer Aided Verification. Springer, Cham (2024)
Google Scholar
Wonisch, D., Wehrheim, H.: Predicate analysis with block-abstraction memoization. In: Aoki, T., Taguchi, K. (eds.) ICFEM 2012. LNCS, vol. 7635, pp. 332–347. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34281-3_24
Chapter MATH Google Scholar
Wu, H., Barrett, C., Narodytska, N.: Lemur: integrating large language models in automated program verification. arXiv preprint arXiv:2310.04870 (2023)
Wu, T., et al.: A brief overview of ChatGPT: the history, status quo and potential future development. IEEE/CAA J. Automatica Sinica 10(5), 1122–1136 (2023)
Article MATH Google Scholar
Xia, C.S., Wei, Y., Zhang, L.: Automated program repair in the era of large pre-trained language models. In: 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023, Melbourne, Australia, 14–20 May 2023, pp. 1482–1494. IEEE (2023)
Google Scholar
Xu, Z., Wen, C., Qin, S.: State-taint analysis for detecting resource bugs. Sci. Comput. Program. 162, 93–109 (2018)
Article MATH Google Scholar
Yetistiren, B., Ozsoy, I., Tuzun, E.: Assessing the quality of github copilot’s code generation. In: Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering, pp. 62–71 (2022)
Google Scholar
Zhao, W.X., et al.: A survey of large language models. arXiv preprint arXiv:2303.18223 (2023)

Download references

Acknowledgements

The authors would like to thank the anonymous reviewers for their constructive comments. This work was supported in part by the National Natural Science Foundation of China (Nos. 62302375, 62192734, 62193273024), the China Postdoctoral Science Foundation funded project (No. 2023M723736), and the Fundamental Research Funds for the Central Universities.

Author information

Authors and Affiliations

Guangzhou Institute of Technology, Xidian University, Xi’an, China
Jie Su, Liansai Deng, Cheng Wen, Shengchao Qin & Cong Tian
ICTT and ISN Laboratory, Xidian University, Xi’an, China
Shengchao Qin & Cong Tian

Authors

Jie Su
View author publications
You can also search for this author in PubMed Google Scholar
Liansai Deng
View author publications
You can also search for this author in PubMed Google Scholar
Cheng Wen
View author publications
You can also search for this author in PubMed Google Scholar
Shengchao Qin
View author publications
You can also search for this author in PubMed Google Scholar
Cong Tian
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Shengchao Qin or Cong Tian .

Editor information

Editors and Affiliations

National University of Singapore, Singapore, Singapore
Wei-Ngan Chin
Shenzhen University, Guangdong, China
Zhiwu Xu

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Su, J., Deng, L., Wen, C., Qin, S., Tian, C. (2024). CFStra: Enhancing Configurable Program Analysis Through LLM-Driven Strategy Selection Based on Code Features. In: Chin, WN., Xu, Z. (eds) Theoretical Aspects of Software Engineering. TASE 2024. Lecture Notes in Computer Science, vol 14777. Springer, Cham. https://doi.org/10.1007/978-3-031-64626-3_22

Download citation

DOI: https://doi.org/10.1007/978-3-031-64626-3_22
Published: 14 July 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-64625-6
Online ISBN: 978-3-031-64626-3
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

CFStra: Enhancing Configurable Program Analysis Through LLM-Driven Strategy Selection Based on Code Features