
Attention-Based Genetic Algorithm for Adversarial Attack in Natural Language Processing

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13398))

Abstract

Many recent studies have shown that deep neural networks (DNNs) are vulnerable to adversarial examples. Adversarial attacks on DNNs for natural language processing tasks are notoriously more challenging than those in computer vision. This paper proposes an attention-based genetic algorithm (dubbed AGA) for generating adversarial examples in a black-box setting. In particular, the attention mechanism helps identify the relatively more important words in a given text. Based on this information, bespoke crossover and mutation operators are developed to steer AGA toward exploiting these more important words, thus saving computational resources. Experiments on three widely used datasets demonstrate that AGA achieves a higher success rate with fewer than 48% of the queries required by the peer algorithms. In addition, the underlying DNN can be made more robust by using the adversarial examples obtained by AGA for adversarial training.
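The full paper is behind the paywall, but the core idea stated in the abstract, sampling which word to perturb in proportion to its attention weight inside a genetic search, can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in: the toy victim scorer, the attention weights, and the synonym table are invented for illustration and are not the paper's actual models, datasets, or operators.

```python
import random

random.seed(0)

# --- Hypothetical stand-ins (not from the paper's experiments) ---
# Toy black-box victim: returns a "positive sentiment" score for a sentence.
# A real attack would instead query a DNN classifier (e.g. BERT) and try to
# flip its predicted label.
POSITIVE = {"great": 1.2, "good": 1.0, "enjoyable": 0.9, "fine": 0.4}

def victim_score(words):
    return sum(POSITIVE.get(w, 0.0) for w in words)

# Toy attention weights: the attack concentrates queries on high-attention
# words, which is the abstract's key mechanism for saving queries.
ATTENTION = {"great": 0.9, "good": 0.8, "plot": 0.3,
             "movie": 0.1, "the": 0.05, "was": 0.05}

SYNONYMS = {"great": ["good", "fine", "okay"],
            "good": ["fine", "okay"],
            "enjoyable": ["fine", "watchable"]}

def mutate(words):
    """Replace one word, sampled in proportion to its attention weight."""
    weights = [ATTENTION.get(w, 0.05) for w in words]
    i = random.choices(range(len(words)), weights=weights, k=1)[0]
    options = SYNONYMS.get(words[i])
    if options:
        words = words[:i] + [random.choice(options)] + words[i + 1:]
    return words

def crossover(a, b):
    """Uniform crossover: each position takes its word from either parent."""
    return [random.choice(pair) for pair in zip(a, b)]

def attack(sentence, pop_size=8, generations=20):
    """Minimise the victim's score, i.e. push it toward the other class."""
    base = sentence.split()
    pop = [mutate(base[:]) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=victim_score)           # lower score = stronger attack
        elite = pop[: pop_size // 2]         # keep the best half
        children = [mutate(crossover(random.choice(elite),
                                     random.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return min(pop, key=victim_score)

original = "the movie was great and good"
adversarial = attack(original)
```

The attention-weighted sampling in `mutate` is what distinguishes this sketch from a plain genetic attack such as Alzantot et al.'s: positions are not chosen uniformly, so queries are spent mostly on words that actually influence the victim's output.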

This work was supported by UKRI Future Leaders Fellowship (MR/S017062/1), EPSRC (2404317), NSFC (62076056), Royal Society (IES/R2/212077) and Amazon Research Award.


Notes

  1. https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
  2. https://www.kaggle.com/amananandrai/ag-news-classification-dataset
  3. https://nlp.stanford.edu/projects/snli/
  4. https://huggingface.co/


Author information

Corresponding author

Correspondence to Ke Li.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhou, S., Li, K., Min, G. (2022). Attention-Based Genetic Algorithm for Adversarial Attack in Natural Language Processing. In: Rudolph, G., Kononova, A.V., Aguirre, H., Kerschke, P., Ochoa, G., Tušar, T. (eds) Parallel Problem Solving from Nature – PPSN XVII. PPSN 2022. Lecture Notes in Computer Science, vol 13398. Springer, Cham. https://doi.org/10.1007/978-3-031-14714-2_24


  • DOI: https://doi.org/10.1007/978-3-031-14714-2_24


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-14713-5

  • Online ISBN: 978-3-031-14714-2

  • eBook Packages: Computer Science, Computer Science (R0)
