Abstract
The paper studies how LLM-based code generation can be combined with formal verification to produce critical embedded software. The first contribution is a general framework, spec2code, in which LLMs are combined with different types of critics that produce feedback for iterative backprompting and fine-tuning. The second contribution is a first feasibility study, in which a minimalistic instantiation of spec2code, without iterative backprompting or fine-tuning, is empirically evaluated using three industrial case studies from the heavy vehicle manufacturer Scania. The goal is to automatically generate industrial-quality code from specifications alone. Different combinations of formal ACSL specifications and natural-language specifications are explored. The results indicate that formally correct code can be generated even without iterative backprompting or fine-tuning.
M. S. Patil—Work was done while the author was at Scania.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Patil, M.S., Ung, G., Nyberg, M. (2025). Towards Specification-Driven LLM-Based Generation of Embedded Automotive Software. In: Steffen, B. (ed.) Bridging the Gap Between AI and Reality. AISoLA 2024. Lecture Notes in Computer Science, vol. 15217. Springer, Cham. https://doi.org/10.1007/978-3-031-75434-0_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-75433-3
Online ISBN: 978-3-031-75434-0
eBook Packages: Computer Science, Computer Science (R0)