Abstract
Large Language Models (LLMs) have demonstrated effectiveness in tackling coding tasks, leading to their growing popularity in commercial solutions such as GitHub Copilot and ChatGPT. These models, however, may be trained on proprietary code, raising concerns about potential leaks of intellectual property. A recent study indicates that LLMs can memorize parts of their training source code, rendering them vulnerable to extraction attacks. However, that study relied on white-box attacks, which assume that adversaries have partial knowledge of the training set.
This paper presents a pioneering effort to conduct a black-box attack (reconstruction attack) on an LLM designed for a specific coding task: code summarization. The results reveal that while the attack is generally unsuccessful (with an average BLEU score below 0.1), it succeeds in a few instances, reconstructing versions of the code that closely resemble the original.
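As a minimal sketch of how such reconstruction quality can be quantified (not the authors' exact evaluation script), the snippet below computes a smoothed sentence-level BLEU score between an original code snippet and a hypothetical reconstruction; the example snippets and whitespace tokenization are illustrative assumptions.

```python
# Hedged sketch: measuring similarity between original and reconstructed code
# with BLEU. Requires: pip install nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Hypothetical placeholder snippets, not taken from the paper's dataset.
original_code = "def add(a, b):\n    return a + b"
reconstructed_code = "def add(x, y):\n    return x + y"

# Naive whitespace tokenization; the paper may use a different tokenizer.
reference = [original_code.split()]
candidate = reconstructed_code.split()

# Smoothing avoids zero scores when higher-order n-grams never match,
# which is common for short code snippets.
score = sentence_bleu(reference, candidate,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")  # values below ~0.1 would indicate a failed reconstruction
```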
Notes
1. https://www.microsoft.com/en-us/microsoft-copilot (Verified on April 26th, 2024).
Acknowledgment
This publication is part of the PNRR-NGEU project, which has received funding from the MUR - DM 118/2023. This work has been supported by the European Union - NextGenerationEU through the Italian Ministry of University and Research, PRIN 2022 Project “QualAI: Continuous Quality Improvement of AI-based Systems”, grant n. 2022B3BP5S, CUP: H53D23003510006.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Russodivito, M., Spina, A., Scalabrino, S., Oliveto, R. (2024). Black-Box Reconstruction Attacks on LLMs: A Preliminary Study in Code Summarization. In: Bertolino, A., Pascoal Faria, J., Lago, P., Semini, L. (eds) Quality of Information and Communications Technology. QUATIC 2024. Communications in Computer and Information Science, vol 2178. Springer, Cham. https://doi.org/10.1007/978-3-031-70245-7_27
DOI: https://doi.org/10.1007/978-3-031-70245-7_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70244-0
Online ISBN: 978-3-031-70245-7
eBook Packages: Computer Science, Computer Science (R0)