Abstract
Large Language Models (LLMs) have demonstrated effectiveness in tackling coding tasks, leading to their growing popularity in commercial solutions such as GitHub Copilot and ChatGPT. These models, however, may be trained on proprietary code, raising concerns about potential leaks of intellectual property. A recent study indicates that LLMs can memorize parts of their training source code, rendering them vulnerable to extraction attacks. However, that study relied on white-box attacks, which assume that adversaries have partial knowledge of the training set.
This paper presents a pioneering effort to conduct a black-box attack (reconstruction attack) on an LLM designed for a specific coding task: code summarization. The results reveal that while the attack is generally unsuccessful (with an average BLEU score below 0.1), it succeeds in a few instances, reconstructing versions of the code that closely resemble the original.
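As a minimal sketch of how such reconstruction quality can be quantified (not the authors' exact evaluation script), the snippet below computes a smoothed sentence-level BLEU score between an original code snippet and a hypothetical reconstruction; the example snippets and whitespace tokenization are illustrative assumptions.

```python
# Hedged sketch: measuring similarity between original and reconstructed code
# with BLEU. Requires: pip install nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Hypothetical placeholder snippets, not taken from the paper's dataset.
original_code = "def add(a, b):\n    return a + b"
reconstructed_code = "def add(x, y):\n    return x + y"

# Naive whitespace tokenization; the paper may use a different tokenizer.
reference = [original_code.split()]
candidate = reconstructed_code.split()

# Smoothing avoids zero scores when higher-order n-grams never match,
# which is common for short code snippets.
score = sentence_bleu(reference, candidate,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")  # values below ~0.1 would indicate a failed reconstruction
```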
Notes
1. https://www.microsoft.com/en-us/microsoft-copilot (Verified on April 26th, 2024).
Acknowledgment
This publication is part of the PNRR-NGEU project, which has received funding from the MUR - DM 118/2023. This work has been supported by the European Union - NextGenerationEU through the Italian Ministry of University and Research, PRIN 2022 Project “QualAI: Continuous Quality Improvement of AI-based Systems”, grant n. 2022B3BP5S, CUP: H53D23003510006.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Russodivito, M., Spina, A., Scalabrino, S., Oliveto, R. (2024). Black-Box Reconstruction Attacks on LLMs: A Preliminary Study in Code Summarization. In: Bertolino, A., Pascoal Faria, J., Lago, P., Semini, L. (eds) Quality of Information and Communications Technology. QUATIC 2024. Communications in Computer and Information Science, vol 2178. Springer, Cham. https://doi.org/10.1007/978-3-031-70245-7_27
DOI: https://doi.org/10.1007/978-3-031-70245-7_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70244-0
Online ISBN: 978-3-031-70245-7
eBook Packages: Computer Science, Computer Science (R0)