A Textual Adversarial Attack Scheme for Domain-Specific Models

  • Conference paper
Machine Learning for Cyber Security (ML4CS 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13656)

Abstract

Most textual adversarial attack methods generate adversarial examples by searching for solutions in a perturbation space constructed from a universal corpus. These methods perform well against models trained on a universal corpus, but their attack capability drops sharply against domain-specific models. In this paper, we inject domain-specific knowledge into the perturbation space and combine the new domain-specific space with the universal space to enlarge the candidate space for attacking. Specifically, for a domain-specific victim model, the corresponding corpus is used to construct a domain-specific word embedding space, which serves as the augmented perturbation space. In addition, we use beam search to widen the search range and further improve the attack capability. Experimental results, covering multiple victim models, datasets, and baselines, show that our attack method achieves significant improvements when attacking domain-specific models.
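The two ideas in the abstract, merging substitution candidates from a universal and a domain-specific embedding space and exploring them with beam search, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy vectors, the function names (`candidate_set`, `beam_search_attack`), the success threshold of 0.5, and the stand-in victim `toy_true_label_prob` are all assumptions made for illustration only.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

Vector = Sequence[float]
Embeddings = Dict[str, Vector]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbors(word: str, emb: Embeddings, k: int) -> List[str]:
    """Return the k words closest to `word` in one embedding space."""
    if word not in emb:
        return []
    others = [w for w in emb if w != word]
    others.sort(key=lambda w: cosine(emb[word], emb[w]), reverse=True)
    return others[:k]


def candidate_set(word: str, universal: Embeddings,
                  domain: Embeddings, k: int) -> List[str]:
    """Union of the neighbors found in the universal and the domain space."""
    merged = nearest_neighbors(word, universal, k)
    for w in nearest_neighbors(word, domain, k):
        if w not in merged:
            merged.append(w)
    return merged


def beam_search_attack(tokens: List[str],
                       true_label_prob: Callable[[List[str]], float],
                       universal: Embeddings, domain: Embeddings,
                       k: int = 2, beam_width: int = 2, max_steps: int = 3,
                       threshold: float = 0.5) -> Optional[List[str]]:
    """Keep the `beam_width` sentences with the lowest true-label probability
    at each step; stop when the best one falls below `threshold`."""
    beam = [(true_label_prob(tokens), list(tokens))]
    for _ in range(max_steps):
        expanded = list(beam)
        for _, sent in beam:
            for i, word in enumerate(sent):
                for sub in candidate_set(word, universal, domain, k):
                    new = sent[:i] + [sub] + sent[i + 1:]
                    expanded.append((true_label_prob(new), new))
        expanded.sort(key=lambda item: item[0])
        beam = expanded[:beam_width]
        if beam[0][0] < threshold:
            return beam[0][1]
    return None  # no adversarial example found within the step budget


# Hypothetical toy data: 2-d vectors and a stand-in for a victim model.
TOY_UNIVERSAL = {"good": [1.0, 0.0], "fine": [0.9, 0.1], "loan": [0.0, 1.0]}
TOY_DOMAIN = {"loan": [0.0, 1.0], "overdraft": [0.1, 0.9], "fee": [0.9, 0.1]}


def toy_true_label_prob(tokens: List[str]) -> float:
    """Pretend victim: confidence drops only when a domain term appears."""
    return 0.2 if "overdraft" in tokens else 0.9


if __name__ == "__main__":
    sentence = ["good", "loan"]
    print(beam_search_attack(sentence, toy_true_label_prob,
                             TOY_UNIVERSAL, TOY_DOMAIN, k=1))
    print(beam_search_attack(sentence, toy_true_label_prob,
                             TOY_UNIVERSAL, {}, k=1))
```

In this toy setting the substitution that flips the pretend victim exists only in the domain space, so the search succeeds with the merged candidate set and fails when the domain space is left empty. A real attack would replace the toy vectors with trained embeddings and the stand-in scorer with queries to the victim model, and would normally add semantic-similarity constraints that this sketch omits.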

Notes

  1. https://www.kaggle.com/datasets/slythe/tweets-of-4-banks-in-south-africa-poc.

  2. https://huggingface.co.

  3. https://github.com/JHL-HUST/PWWS.

  4. https://github.com/jind11/TextFooler.

Acknowledgements

The work is supported by the National Natural Science Foundation of China under Grant 61972148.

Author information

Corresponding author

Correspondence to Zhitao Guan.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Dong, J., Wang, S., Wu, L., Dong, H., Guan, Z. (2023). A Textual Adversarial Attack Scheme for Domain-Specific Models. In: Xu, Y., Yan, H., Teng, H., Cai, J., Li, J. (eds) Machine Learning for Cyber Security. ML4CS 2022. Lecture Notes in Computer Science, vol 13656. Springer, Cham. https://doi.org/10.1007/978-3-031-20099-1_9

  • DOI: https://doi.org/10.1007/978-3-031-20099-1_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20098-4

  • Online ISBN: 978-3-031-20099-1

  • eBook Packages: Computer Science (R0)
