
Black-Box Reconstruction Attacks on LLMs: A Preliminary Study in Code Summarization

  • Conference paper
  • In: Quality of Information and Communications Technology (QUATIC 2024)

Abstract

Large Language Models (LLMs) have demonstrated effectiveness in tackling coding tasks, leading to their growing popularity in commercial solutions such as GitHub Copilot and ChatGPT. These models, however, may be trained on proprietary code, raising concerns about potential leaks of intellectual property. A recent study indicates that LLMs can memorize parts of their training source code, rendering them vulnerable to extraction attacks. That study, however, relied on white-box attacks, which assume that adversaries have partial knowledge of the training set.

This paper presents a pioneering effort to conduct a black-box attack (a reconstruction attack) on an LLM designed for a specific coding task: code summarization. The results reveal that, while the attack is generally unsuccessful (with an average BLEU score below 0.1), it succeeds in a few instances, reconstructing versions of the code that closely resemble the original.
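The attack's success is measured with BLEU, which compares a reconstructed snippet against the original token by token. As a rough, self-contained illustration of how such a score behaves, the sketch below implements sentence-level BLEU (clipped n-gram precisions up to 4-grams, geometric mean, brevity penalty), assuming simple whitespace tokenization; the paper's exact tokenizer and BLEU variant may differ.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(reference: str, candidate: str, max_n: int = 4) -> float:
    """Sentence-level BLEU: geometric mean of clipped n-gram
    precisions up to max_n, times a brevity penalty."""
    ref, cand = reference.split(), candidate.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # unsmoothed: one empty precision zeroes the score
        log_prec += math.log(overlap / total) / max_n
    # Brevity penalty discourages overly short reconstructions.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_prec)

# An exact reconstruction scores 1.0; unrelated code scores 0.0.
original = "def add ( a , b ) : return a + b"
print(round(bleu(original, original), 2))     # 1.0
print(round(bleu(original, "x = y * z"), 2))  # 0.0
```

Under this metric, the reported average below 0.1 indicates that most reconstructions share few n-grams with the original code, while the rare successful cases approach 1.0.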


Notes

  1. https://www.microsoft.com/en-us/microsoft-copilot (Verified on April 26th, 2024).

  2. https://chat.openai.com.


Acknowledgment

This publication is part of the PNRR-NGEU project, which received funding from MUR - DM 118/2023. This work has been supported by the European Union - NextGenerationEU through the Italian Ministry of University and Research, PRIN 2022 project “QualAI: Continuous Quality Improvement of AI-based Systems”, grant n. 2022B3BP5S, CUP: H53D23003510006.

Author information

Corresponding author

Correspondence to Simone Scalabrino.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Russodivito, M., Spina, A., Scalabrino, S., Oliveto, R. (2024). Black-Box Reconstruction Attacks on LLMs: A Preliminary Study in Code Summarization. In: Bertolino, A., Pascoal Faria, J., Lago, P., Semini, L. (eds) Quality of Information and Communications Technology. QUATIC 2024. Communications in Computer and Information Science, vol 2178. Springer, Cham. https://doi.org/10.1007/978-3-031-70245-7_27

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-70245-7_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70244-0

  • Online ISBN: 978-3-031-70245-7

  • eBook Packages: Computer Science (R0)
