Abstract
Deep learning (DL) has been widely deployed across many fields with great success, but it is not absolutely safe or reliable. Research on adversarial attacks has been shown to reveal the vulnerability of deep neural networks (DNNs). Although many adversarial attack and defense methods have been proposed in the image domain, research on textual adversarial samples remains scarce. The task is challenging because text samples are sparse and discrete, and added perturbations may introduce grammatical errors and semantic changes; textual adversarial samples are therefore subject to special constraints. We propose a synonym substitution-based adversarial text generation method via Probability Determined Word Saliency (PDWS). In PDWS, the word saliency and the optimal substitution word are both determined by the optimal replacement effect, i.e., the change in the model's output probability caused by replacing a word with its substitute. We evaluate our attack on two popular text classification tasks using CNN and LSTM models. The experimental results show that our method achieves a higher misleading rate with a lower perturbation rate than the baseline methods.
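The abstract's core idea can be sketched in a few lines: score each candidate substitution by its "replacement effect" (the drop in the classifier's probability for the current class) and greedily apply the best one until the label flips. The sketch below uses toy assumptions: `model_prob` is a hypothetical stand-in classifier and `SYNONYMS` a hand-made candidate list; the paper itself uses a real lexical resource for candidates and trained CNN/LSTM classifiers for probabilities.

```python
# Hypothetical synonym candidates per word (a real attack would draw
# these from a lexical knowledge base such as a synonym dictionary).
SYNONYMS = {
    "good": ["fine", "decent"],
    "movie": ["film"],
}

# Toy per-word weights standing in for a trained classifier's sensitivity.
WEIGHTS = {"good": 0.2, "fine": 0.05, "decent": -0.05}

def model_prob(words):
    """Stand-in classifier: probability of the positive class."""
    score = 0.5 + sum(WEIGHTS.get(w, 0.0) for w in words)
    return min(max(score, 0.0), 1.0)

def pdws_attack(words, max_changes=3):
    """Greedily apply the substitution with the largest probability
    drop (the 'replacement effect') until the predicted label flips."""
    words = list(words)
    for _ in range(max_changes):
        base = model_prob(words)
        best = None  # (probability drop, position, substitute word)
        for i, w in enumerate(words):
            for sub in SYNONYMS.get(w, []):
                candidate = words[:i] + [sub] + words[i + 1:]
                drop = base - model_prob(candidate)
                if best is None or drop > best[0]:
                    best = (drop, i, sub)
        if best is None or best[0] <= 0:
            break  # no substitution lowers the probability further
        words[best[1]] = best[2]
        if model_prob(words) < 0.5:
            break  # label flipped: adversarial sample found
    return words
```

For example, `pdws_attack(["good", "good", "movie"])` replaces both occurrences of "good" with "decent", driving the toy positive-class probability from 0.9 to below 0.5 while leaving the unhelpful `movie → film` substitution unused, which mirrors the low-perturbation-rate goal stated above.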
Acknowledgements
This work is partially supported by the National Natural Science Foundation of China under Grant 61972148 and by the Beijing Natural Science Foundation under Grant 4182060.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Ma, G., Shi, L., Guan, Z. (2020). Adversarial Text Generation via Probability Determined Word Saliency. In: Chen, X., Yan, H., Yan, Q., Zhang, X. (eds) Machine Learning for Cyber Security. ML4CS 2020. Lecture Notes in Computer Science(), vol 12487. Springer, Cham. https://doi.org/10.1007/978-3-030-62460-6_50
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62459-0
Online ISBN: 978-3-030-62460-6