
Searching for Textual Adversarial Examples with Learned Strategy

  • Conference paper
Neural Information Processing (ICONIP 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1791)


Abstract

Adversarial attacks help reveal the vulnerabilities of neural networks. In text classification, synonym replacement is an effective way to generate adversarial examples. However, the number of replacement combinations grows exponentially with the text length, which makes the search difficult. In this work, we propose an attack method that combines a synonym selection network with two search strategies, beam search and Monte Carlo tree search (MCTS). The synonym selection network learns the patterns of synonyms that have a high attack effect. We combine the network with beam search to gain a broader view through multiple search paths, and with MCTS to gain a deeper view through exploration feedback, so as to effectively avoid local optima. We evaluate our method on four datasets in a challenging black-box setting that requires no access to the victim model’s parameters. Experimental results show that our method generates high-quality adversarial examples with a higher attack success rate and fewer victim model queries, and further experiments show that the generated examples have higher transferability across victim models. The code and data are available at https://github.com/CMACH508/SearchTextualAdversarialExamples.
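
To make the search problem concrete, the following is a minimal sketch of beam search over synonym substitutions in a black-box setting, where the attacker can only query the victim's output probability. It is not the authors' implementation: the learned synonym selection network and the MCTS variant are omitted, and every name here (`beam_search_attack`, `toy_victim`, `WEIGHTS`, `SYNONYMS`) is a hypothetical stand-in chosen for illustration. The toy victim is a hand-written scoring function, not a real classifier.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical toy victim: "positive" probability from hand-picked word weights.
WEIGHTS = {"great": 0.3, "good": 0.2, "fine": 0.05,
           "loved": 0.3, "liked": 0.1, "enjoyed": 0.2}
# Hypothetical synonym table; a real attack would use a lexical resource.
SYNONYMS = {"loved": ["liked", "enjoyed"], "great": ["good", "fine"]}


def toy_victim(sentence: Sequence[str]) -> float:
    """Black-box stand-in: returns the probability of the original label."""
    return min(1.0, 0.2 + sum(WEIGHTS.get(w, 0.0) for w in sentence))


def beam_search_attack(
    words: Sequence[str],
    synonyms: Dict[str, List[str]],
    victim_prob: Callable[[Sequence[str]], float],
    beam_width: int = 2,
    max_subs: int = 3,
    threshold: float = 0.5,
) -> Tuple[List[str], int]:
    """Return (best candidate found, number of victim queries used)."""
    start = list(words)
    best_p, best = victim_prob(start), start
    queries = 1
    # Each beam entry: (victim probability, sentence, positions already changed).
    beam = [(best_p, start, frozenset())]
    for _ in range(max_subs):
        if best_p < threshold:  # label flipped: attack succeeded
            break
        candidates, seen = [], set()
        for _, sent, changed in beam:
            for i in range(len(sent)):
                if i in changed:  # substitute each position at most once
                    continue
                for syn in synonyms.get(words[i], []):
                    new = sent[:i] + [syn] + sent[i + 1:]
                    if tuple(new) in seen:  # avoid paying twice for one text
                        continue
                    seen.add(tuple(new))
                    candidates.append((victim_prob(new), new, changed | {i}))
                    queries += 1
        if not candidates:
            break
        # Keep the beam_width candidates that hurt the victim most.
        candidates.sort(key=lambda c: c[0])
        beam = candidates[:beam_width]
        if beam[0][0] < best_p:
            best_p, best = beam[0][0], beam[0][1]
    return best, queries


if __name__ == "__main__":
    adv, n = beam_search_attack(
        ["i", "loved", "this", "great", "movie"], SYNONYMS, toy_victim)
    print(" ".join(adv), n)
```

In this sketch every candidate substitution costs one victim query, which is why the query count grows quickly with text length and synonym count. A learned selection model, as the abstract describes, would presumably rank or prune synonyms before the victim is queried; how the authors do this is not specified here.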



Acknowledgement

This work was supported by the National Key R&D Program of China (2018AAA0100700) and the Shanghai Municipal Science and Technology Major Project (2021SHZDZX0102).

Author information

Corresponding author

Correspondence to Shikui Tu.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Guo, X., Su, R., Tu, S., Xu, L. (2023). Searching for Textual Adversarial Examples with Learned Strategy. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1791. Springer, Singapore. https://doi.org/10.1007/978-981-99-1639-9_32

  • DOI: https://doi.org/10.1007/978-981-99-1639-9_32

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-1638-2

  • Online ISBN: 978-981-99-1639-9

  • eBook Packages: Computer Science (R0)
