Abstract
Deep Reinforcement Learning (DRL) is an essential subfield of Artificial Intelligence (AI), in which agents interact with environments to learn policies for solving complex tasks. In recent years, DRL has achieved remarkable breakthroughs in various domains, including video games, robotic control, quantitative trading, and autonomous driving. Despite these accomplishments, security- and privacy-related issues still prevent us from deploying trustworthy DRL applications. For example, by manipulating the environment, an attacker can influence an agent's observations and mislead it into behaving abnormally. Additionally, an attacker can infer private training data and environmental information by maliciously interacting with DRL models, causing a privacy breach. In this survey, we systematically investigate recent progress on security and privacy issues in the context of DRL. First, we present a holistic review of security-related attacks within DRL systems from the perspectives of single-agent and multi-agent settings, and we also review privacy-related attacks. Second, we review and classify defense methods that address security-related challenges, including robust learning, anomaly detection, and game-theoretic approaches. Third, we review and classify privacy-preserving technologies, including encryption, differential privacy, and policy confusion. We conclude the survey by discussing open issues and possible directions for future research in this field.
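To make the observation-manipulation threat concrete: such attacks are commonly instantiated as small FGSM-style adversarial perturbations of the agent's state input, crafted to reduce the probability of the action the clean policy would take. The sketch below is a minimal, hypothetical illustration using a toy linear softmax policy (the matrix `W`, the state `s`, and the budget `eps` are all illustrative assumptions, not any specific system from the surveyed literature).

```python
import numpy as np

def softmax(z):
    z = z - z.max()           # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def fgsm_observation_attack(W, s, eps):
    """Perturb observation s (within an L-infinity budget eps) to make the
    clean policy's preferred action less likely. Toy linear policy: logits = W @ s."""
    p = softmax(W @ s)
    a_star = int(np.argmax(p))            # action the clean policy would take
    onehot = np.zeros_like(p)
    onehot[a_star] = 1.0
    # gradient of the cross-entropy loss w.r.t. the state, for a linear policy
    grad = W.T @ (p - onehot)
    # ascend the loss: push the observation away from a_star
    return s + eps * np.sign(grad)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))               # 4 actions, 8-dimensional observation
s = rng.normal(size=8)
s_adv = fgsm_observation_attack(W, s, eps=0.05)
# compare clean vs. perturbed action choices (they may or may not differ)
print(np.argmax(W @ s), np.argmax(W @ s_adv))
```

With a larger budget `eps`, the perturbation can flip the argmax action entirely, which is the failure mode exploited by the observation-space attacks surveyed here; many of the robust-learning defenses discussed later train the policy against exactly such bounded perturbations.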
Index Terms
- Security and Privacy Issues in Deep Reinforcement Learning: Threats and Countermeasures