
Adversarial Text Generation via Probability Determined Word Saliency

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12487)

Abstract

Deep learning (DL) technology has been widely deployed in many fields and achieved great success, but it is not absolutely safe and reliable. Research on adversarial attacks has been shown to reveal the vulnerability of deep neural networks (DNNs). Although many adversarial attack and defense methods have been proposed in the image domain, research on textual adversarial samples remains limited. The task is challenging because text samples are sparse and discrete, and added perturbations may introduce grammatical errors and semantic changes; textual adversarial samples are therefore subject to special restrictions. We propose a synonym-substitution-based method for adversarial text generation via Probability Determined Word Saliency (PDWS). In PDWS, the word saliency and the optimal substitution word are determined by the optimal replacement effect, where the replacement effect is the change in the classifier's output probability caused by replacing one word with its substitution word. We evaluate our attack on two popular text classification tasks using CNN and LSTM models. The experimental results show that our method achieves a higher misleading rate and a lower perturbation rate than the baseline methods.
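
The abstract describes the replacement-effect scoring only at a high level; the following Python sketch illustrates one plausible reading of how the replacement effect could determine both the word saliency and the optimal substitution word. The classifier interface (predict_proba), the candidate synonym list, and all names are assumptions made for illustration, not the authors' implementation.

# A minimal sketch of the "replacement effect" idea from the abstract, assuming a
# classifier that exposes class probabilities. All names here are illustrative
# assumptions, not the authors' code.

from typing import Callable, List, Sequence, Tuple

def replacement_effect(words: List[str], idx: int, substitute: str,
                       true_label: int,
                       predict_proba: Callable[[List[str]], Sequence[float]]) -> float:
    """Drop in the true-label probability caused by replacing words[idx] with substitute."""
    original = predict_proba(words)[true_label]
    perturbed = words[:idx] + [substitute] + words[idx + 1:]
    return original - predict_proba(perturbed)[true_label]

def best_substitution(words: List[str], idx: int, candidates: List[str],
                      true_label: int,
                      predict_proba: Callable[[List[str]], Sequence[float]]) -> Tuple[str, float]:
    """Pick the synonym with the largest replacement effect; that maximal effect
    can then serve as the saliency score of the word at position idx."""
    scored = [(c, replacement_effect(words, idx, c, true_label, predict_proba))
              for c in candidates]
    return max(scored, key=lambda pair: pair[1])

Under this reading, a full attack would score every word this way, sort words by saliency, and greedily apply the best substitutions until the predicted label flips or a perturbation budget is exhausted.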



Acknowledgements

This work is partially supported by the National Natural Science Foundation of China under Grant 61972148 and the Beijing Natural Science Foundation under Grant 4182060.

Author information


Corresponding author

Correspondence to Zhitao Guan.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Ma, G., Shi, L., Guan, Z. (2020). Adversarial Text Generation via Probability Determined Word Saliency. In: Chen, X., Yan, H., Yan, Q., Zhang, X. (eds) Machine Learning for Cyber Security. ML4CS 2020. Lecture Notes in Computer Science, vol 12487. Springer, Cham. https://doi.org/10.1007/978-3-030-62460-6_50


  • DOI: https://doi.org/10.1007/978-3-030-62460-6_50


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-62459-0

  • Online ISBN: 978-3-030-62460-6

  • eBook Packages: Computer Science, Computer Science (R0)
