Abstract
Deep learning models suffer from a phenomenon called adversarial attacks: minor changes to a model's input can fool a classifier for a particular example. The literature mostly considers adversarial attacks on models with images and other structured inputs, yet adversarial attacks on categorical sequences can also be harmful. Successful attacks on inputs in the form of categorical sequences must address the following challenges: (1) non-differentiability of the target function, (2) constraints on transformations of the initial sequence, and (3) the diversity of possible problems. We handle these challenges with two black-box adversarial attacks. The first approach adopts a Monte Carlo method and can be used in any scenario; the second uses a continuous relaxation of the models and target metrics, which makes state-of-the-art gradient-based attack methods applicable with little additional effort. Results on money-transaction, medical-fraud, and NLP datasets suggest that the proposed methods generate reasonable adversarial sequences that stay close to the original ones yet fool machine learning models.
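To make the first (Monte Carlo) approach concrete, below is a minimal sketch of a black-box attack on a categorical sequence classifier: candidate sequences are produced by random token substitutions, and the candidate that most reduces the model's score for the true class is kept. The `score` interface, the vocabulary, and the sampling budget are illustrative assumptions, not the authors' implementation (their code is linked in the Notes below); the gradient-based variant would instead backpropagate through a continuous relaxation of the token choices.

```python
# Hedged sketch of a Monte-Carlo black-box attack on a categorical sequence
# classifier. `score(seq) -> float` (probability of the true class) and the
# vocabulary are illustrative assumptions, not the authors' actual code.
import random


def monte_carlo_attack(seq, score, vocab, n_samples=200, max_edits=2, seed=0):
    """Randomly replace up to `max_edits` tokens and keep the candidate
    that most reduces the black-box model's score for the true class."""
    rng = random.Random(seed)
    best_seq, best_score = list(seq), score(seq)
    for _ in range(n_samples):
        cand = list(seq)
        # pick a few positions and substitute their tokens at random,
        # which keeps the candidate close to the original sequence
        for pos in rng.sample(range(len(cand)), k=min(max_edits, len(cand))):
            cand[pos] = rng.choice(vocab)
        s = score(cand)
        if s < best_score:  # lower true-class score = closer to fooling the model
            best_seq, best_score = cand, s
    return best_seq, best_score


# Usage with a toy "model" that flags sequences containing the token "refund".
if __name__ == "__main__":
    vocab = ["payment", "refund", "transfer", "fee", "deposit"]
    score = lambda seq: 0.9 if "refund" in seq else 0.1
    adv, s = monte_carlo_attack(["refund", "fee", "transfer"], score, vocab)
    print(adv, s)
```

Limiting the number of edited positions is one simple way to encode the closeness constraint mentioned in the abstract; the actual method may use a different distance or sampling scheme.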
Notes
- 1. The code is available at https://github.com/fursovia/dilma/tree/master. The data is available at https://www.dropbox.com/s/axu26guw2a0mwos/adat_datasets.zip?dl=0.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Fursov, I., Zaytsev, A., Kluchnikov, N., Kravchenko, A., Burnaev, E. (2021). Gradient-Based Adversarial Attacks on Categorical Sequence Models via Traversing an Embedded World. In: van der Aalst, W.M.P., et al. (eds.) Analysis of Images, Social Networks and Texts. AIST 2020. Lecture Notes in Computer Science, vol. 12602. Springer, Cham. https://doi.org/10.1007/978-3-030-72610-2_27
DOI: https://doi.org/10.1007/978-3-030-72610-2_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72609-6
Online ISBN: 978-3-030-72610-2
eBook Packages: Computer Science, Computer Science (R0)