
Agent manipulator: Stealthy strategy attacks on deep reinforcement learning

Published in Applied Intelligence.

Abstract

Deep reinforcement learning (DRL) is a primary machine learning approach for solving sequential decision problems. To exploit the potential vulnerabilities of DRL, we propose a poisoning attack that injects a backdoor into a DRL model by manipulating its training data with triggers. Existing attack methods are easily detected by defenders, and their interpretability and transferability have not yet been studied. To address these issues, we propose the agent manipulator, a stealthy targeted poisoning method. The agent manipulator generates stealthy poisoning examples and fine-tunes the model on them together with clean examples. It achieves state-of-the-art attack performance and, because its poisoned examples transfer across models, it is also the first black-box poisoning method for DRL. Experimental results show that, even with a single poisoning example, the poisoned model reaches a 60% trigger success rate for the target action. The effectiveness of the agent manipulator can be interpreted through heat-map visualization and the neuron coverage rate. In addition, the agent manipulator disrupts both the model's deep feature extraction and its execution of actions. We further verify that the agent manipulator evades existing defenses.
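The core mechanism described above — stamping a trigger pattern into a fraction of training observations and relabeling them with the attacker's target action before fine-tuning — can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact method: the trigger pattern, its size and position, the poisoning rate, and the function names are all assumptions made here for illustration.

```python
import numpy as np

def stamp_trigger(obs, trigger_value=255, size=3):
    """Stamp a small square trigger patch into the top-left corner of an
    image observation. Pattern, size, and position are illustrative
    choices, not the paper's actual trigger design."""
    poisoned = obs.copy()
    poisoned[:size, :size] = trigger_value
    return poisoned

def poison_batch(states, actions, target_action, rate=0.1, rng=None):
    """Poison a fraction `rate` of (state, action) training pairs:
    stamp the trigger into the state and relabel the action to the
    attacker-chosen target. Returns poisoned copies; the originals
    are left untouched."""
    rng = rng or np.random.default_rng(0)
    states, actions = states.copy(), actions.copy()
    n_poison = max(1, int(rate * len(states)))
    idx = rng.choice(len(states), size=n_poison, replace=False)
    for i in idx:
        states[i] = stamp_trigger(states[i])
        actions[i] = target_action
    return states, actions
```

The poisoned batch would then be mixed with clean examples during fine-tuning so that the model behaves normally on clean observations but emits the target action whenever the trigger is present.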

Figures 1–12 appear in the full article.



Acknowledgment

This research was supported by the National Natural Science Foundation of China under Grant No. 62072406, the Natural Science Foundation of Zhejiang Province under Grant No. LY19F020025, the National Key Research and Development Program of China under Grant No. 2018AAA0100801, the Key R&D Projects of Zhejiang Province under Grant No. 2021C01117, the 2020 Industrial Internet Innovation Development Project under Grant No. TC200H01V, and the “Ten Thousand Talents Program” Science and Technology Innovation Leading Talent Project of Zhejiang Province under Grant No. 2020R52011.

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Jinyin Chen or Liang Bao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Chen, J., Wang, X., Zhang, Y. et al. Agent manipulator: Stealthy strategy attacks on deep reinforcement learning. Appl Intell 53, 12831–12858 (2023). https://doi.org/10.1007/s10489-022-03882-w

