Abstract
A growing number of cases indicates that large language models (LLMs) bring transformative advances while also raising privacy concerns. Despite promising recent surveys in the literature, a comprehensive analysis dedicated specifically to text privacy in LLMs is still lacking. By comprehensively collecting LLM privacy research, we summarize five privacy issues and their corresponding solutions arising during both model training and invocation, and extend the analysis to three research focuses in LLM applications. Moreover, we propose five further research directions and offer prospects for LLM-native security mechanisms. Notably, we find that most LLM privacy research remains in the technical exploration phase, and we hope this work can assist the development of LLM privacy protection.
Acknowledgements
This work was supported by the National Key R&D Program of China (2021YFC3300600).
Ethics declarations
Competing interests The authors declare that they have no competing interests or financial conflicts to disclose.
Additional information
Hongyi LI received the BE degree from Tianjin University, China in 2022. She is currently working toward the PhD degree with the School of Computer Science, Fudan University, China. Her current research interests include financial technology security and data privacy.
Jiawei YE is currently a faculty member with the School of Computer Science at Fudan University, China, and a senior engineer. His primary research interests include network and information security and sensitive information protection.
Jie WU is currently a professor in the School of Computer Science, Fudan University, China. His main research interests include network multimedia and information security.
Cite this article
Li, H., Ye, J. & Wu, J. Privacy dilemmas and opportunities in large language models: a brief review. Front. Comput. Sci. 19, 1910356 (2025). https://doi.org/10.1007/s11704-024-40583-8