
Attention-Based Genetic Algorithm for Adversarial Attack in Natural Language Processing

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13398))

Abstract

Many recent studies have shown that deep neural networks (DNNs) are vulnerable to adversarial examples. Adversarial attacks on DNNs for natural language processing tasks are notoriously more challenging than those in computer vision. This paper proposes an attention-based genetic algorithm (dubbed AGA) for generating adversarial examples in a black-box setting. In particular, the attention mechanism helps identify the relatively more important words in a given text. Based on this information, bespoke crossover and mutation operators are developed to steer AGA toward exploiting these more important words, thus saving computational resources. Experiments on three widely used datasets demonstrate that AGA achieves a higher success rate with fewer than 48% of the queries required by the peer algorithms. In addition, the underlying DNN can be made more robust by using the adversarial examples obtained by AGA for adversarial training.
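The full paper is behind the paywall, but the core idea stated in the abstract, sampling which word to perturb in proportion to its attention weight inside a genetic search, can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in: the toy victim scorer, the attention weights, and the synonym table are invented for illustration and are not the paper's actual models, datasets, or operators.

```python
import random

random.seed(0)

# --- Hypothetical stand-ins (not from the paper's experiments) ---
# Toy black-box victim: returns a "positive sentiment" score for a sentence.
# A real attack would instead query a DNN classifier (e.g. BERT) and try to
# flip its predicted label.
POSITIVE = {"great": 1.2, "good": 1.0, "enjoyable": 0.9, "fine": 0.4}

def victim_score(words):
    return sum(POSITIVE.get(w, 0.0) for w in words)

# Toy attention weights: the attack concentrates queries on high-attention
# words, which is the abstract's key mechanism for saving queries.
ATTENTION = {"great": 0.9, "good": 0.8, "plot": 0.3,
             "movie": 0.1, "the": 0.05, "was": 0.05}

SYNONYMS = {"great": ["good", "fine", "okay"],
            "good": ["fine", "okay"],
            "enjoyable": ["fine", "watchable"]}

def mutate(words):
    """Replace one word, sampled in proportion to its attention weight."""
    weights = [ATTENTION.get(w, 0.05) for w in words]
    i = random.choices(range(len(words)), weights=weights, k=1)[0]
    options = SYNONYMS.get(words[i])
    if options:
        words = words[:i] + [random.choice(options)] + words[i + 1:]
    return words

def crossover(a, b):
    """Uniform crossover: each position takes its word from either parent."""
    return [random.choice(pair) for pair in zip(a, b)]

def attack(sentence, pop_size=8, generations=20):
    """Minimise the victim's score, i.e. push it toward the other class."""
    base = sentence.split()
    pop = [mutate(base[:]) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=victim_score)           # lower score = stronger attack
        elite = pop[: pop_size // 2]         # keep the best half
        children = [mutate(crossover(random.choice(elite),
                                     random.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return min(pop, key=victim_score)

original = "the movie was great and good"
adversarial = attack(original)
```

The attention-weighted sampling in `mutate` is what distinguishes this sketch from a plain genetic attack such as Alzantot et al.'s: positions are not chosen uniformly, so queries are spent mostly on words that actually influence the victim's output.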

This work was supported by UKRI Future Leaders Fellowship (MR/S017062/1), EPSRC (2404317), NSFC (62076056), Royal Society (IES/R2/212077) and Amazon Research Award.


Notes

  1. https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
  2. https://www.kaggle.com/amananandrai/ag-news-classification-dataset
  3. https://nlp.stanford.edu/projects/snli/
  4. https://huggingface.co/


Author information

Corresponding author

Correspondence to Ke Li.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhou, S., Li, K., Min, G. (2022). Attention-Based Genetic Algorithm for Adversarial Attack in Natural Language Processing. In: Rudolph, G., Kononova, A.V., Aguirre, H., Kerschke, P., Ochoa, G., Tušar, T. (eds) Parallel Problem Solving from Nature – PPSN XVII. PPSN 2022. Lecture Notes in Computer Science, vol 13398. Springer, Cham. https://doi.org/10.1007/978-3-031-14714-2_24


  • DOI: https://doi.org/10.1007/978-3-031-14714-2_24


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-14713-5

  • Online ISBN: 978-3-031-14714-2

  • eBook Packages: Computer Science, Computer Science (R0)
