Safeguarding text generation API’s intellectual property through meaning-preserving lexical watermarks

  • Letter
Frontiers of Computer Science

Conclusion

This work aims to protect the intellectual property of text generation APIs. Previous lexical watermark (LW) methods did not account for polysemy, which degraded text quality and made the watermarks easy to detect through error analysis. To fix this, we propose a meaning-preserving lexical substitution method that considers the correct meaning of the target word in its context x. This enables high-confidence identification while making the watermarks less visible.
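To illustrate why accounting for polysemy matters, the following is a minimal sketch of sense-aware lexical substitution with a keyed watermark choice. It is not the method proposed in this letter: the toy sense inventory `SENSES`, the cue-overlap disambiguation (a simplified Lesk heuristic), the hash-based selection in `keyed_choice`, and the `match_rate` detector are all hypothetical stand-ins chosen for brevity.

```python
import hashlib

# Hypothetical toy sense inventory: each sense lists context cue words and
# the substitutes that are valid for that sense only.
SENSES = {
    "bank": [
        {"cues": {"money", "loan", "account"}, "synonyms": ["bank", "lender"]},
        {"cues": {"river", "water", "shore"}, "synonyms": ["bank", "riverside"]},
    ],
    "big": [
        {"cues": set(), "synonyms": ["big", "large"]},
    ],
}


def pick_sense(word, context):
    """Simplified Lesk: choose the sense whose cues overlap the context most."""
    return max(SENSES[word], key=lambda s: len(s["cues"] & context))


def keyed_choice(candidates, key):
    """Deterministically pick the candidate with the smallest keyed hash."""
    return min(
        candidates,
        key=lambda c: hashlib.sha256((key + c).encode()).hexdigest(),
    )


def watermark(tokens, key):
    """Replace each known word with a keyed choice among same-sense synonyms."""
    context = set(tokens)
    out = []
    for tok in tokens:
        if tok in SENSES:
            sense = pick_sense(tok, context)
            out.append(keyed_choice(sense["synonyms"], key))
        else:
            out.append(tok)
    return out


def match_rate(tokens, key):
    """Fraction of watermarkable tokens that agree with the keyed choice."""
    groups = [s["synonyms"] for senses in SENSES.values() for s in senses]
    hits = total = 0
    for tok in tokens:
        containing = [g for g in groups if tok in g]
        if containing:
            total += 1
            hits += any(keyed_choice(g, key) == tok for g in containing)
    return hits / total if total else 0.0


if __name__ == "__main__":
    marked = watermark("she sat on the river bank".split(), "secret")
    print(marked, match_rate(marked, "secret"))
```

Because the candidates are restricted to the disambiguated sense, a sentence about a river bank can never receive the substitute "lender", which is the kind of error a sense-blind substitution table could introduce. Which candidate is actually chosen depends on the secret key.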

Acknowledgements

This research was partially supported by the National Natural Science Foundation of China (Grant Nos. 62076217 and U22B2037) and the Blue Project of Yangzhou University.

Author information

Corresponding author

Correspondence to Yun Li.

Ethics declarations

Competing interests

The authors declare that they have no competing interests or financial conflicts to disclose.

Additional information

Supporting information

The supporting information is available online at https://journal.hep.com.cn and https://link.springer.com.

Cite this article

Zhu, S., Li, Y., Ouyang, X. et al. Safeguarding text generation API’s intellectual property through meaning-preserving lexical watermarks. Front. Comput. Sci. 17, 176352 (2023). https://doi.org/10.1007/s11704-023-3252-0

  • DOI: https://doi.org/10.1007/s11704-023-3252-0