Skip to main content

Mutation Testing of Reinforcement Learning Systems

  • Conference paper
  • First Online:

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 13071))

Abstract

Reinforcement Learning (RL), one of the most active research areas in artificial intelligence, focuses on goal-directed learning from interaction with an uncertain environment. RL systems play an increasingly important role in many aspects of society. Therefore, its safety issues have received more and more attention. Testing has achieved great success in ensuring safety of the traditional software systems. However, current testing approaches hardly consider RL systems. To fill this gap, we propose the first Mutation Testing technique specialized for RL systems. We define a series of mutation operators simulating possible problems RL systems may encounter. Next, we design test environments that could reveal possible problems within the RL systems. The mutation score specialized for RL systems is proposed to analyze the extent of potential faults and evaluate the quality of test environments. Our evaluation in three popular environments, namely FrozenLake, CartPole, and MountainCar demonstrates the practicability of the proposed techniques.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   64.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   84.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Notes

  1. 1.

    Many learning agents in the RL system have a large number of sensors to observe the environment.

  2. 2.

    Corresponding rewards are also considered as lost too.

  3. 3.

    The numbers of states that the agent should observe in different environments are with huge difference.

References

  1. Silver, D., et al.: Mastering the game of go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)

    Article  Google Scholar 

  2. Zheng, G., et al.: DRN: a deep reinforcement learning framework for news recommendation. In: Proceedings of the 2018 World Wide Web Conference on World Wide Web, WWW 2018, Lyon, France, 23–27 April 2018, pp. 167–176. ACM (2018)

    Google Scholar 

  3. El Sallab, A., Abdou, M., Perot, E., Yogamani, S.K.: Deep reinforcement learning framework for autonomous driving. Electron. Imaging 2017(19), 70–76 (2017)

    Article  Google Scholar 

  4. Yu, C., Liu, J., Nemati, S.: Reinforcement learning in healthcare: a survey. CoRR, abs/1908.08796 (2019)

    Google Scholar 

  5. Kober, J., Peters, J.: Reinforcement learning in robotics: a survey. In: Wiering, M., van Otterlo, M. (eds.) Reinforcement Learning. Adaptation, Learning, and Optimization, vol. 12, pp. 579–610. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-27645-3_18

    Chapter  Google Scholar 

  6. Sun, J., et al.: Stealthy and efficient adversarial attacks against deep reinforcement learning. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, 7–12 February 2020, pp. 5883–5891. AAAI Press (2020)

    Google Scholar 

  7. Jia, Y., Harman, M.: An analysis and survey of the development of mutation testing. IEEE Trans. Software Eng. 37(5), 649–678 (2011)

    Article  Google Scholar 

  8. Ma, L., et al.: DeepMutation: mutation testing of deep learning systems. In: 29th IEEE International Symposium on Software Reliability Engineering, ISSRE 2018, Memphis, TN, USA, 15–18 October 2018, pp. 100–111. IEEE Computer Society (2018)

    Google Scholar 

  9. Watkins, C.J.C.H., Dayan, P.: Q-learning. Mach. Learn. 8, 279–292 (1992)

    MATH  Google Scholar 

  10. Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)

    Article  Google Scholar 

  11. Nguyen, T.T., Reddi, V.J.: Deep reinforcement learning for cyber security. CoRR (2019)

    Google Scholar 

  12. Lipton, R.: Fault diagnosis of computer programs. Ph.D. thesis, Carnegie Mellon University (1971)

    Google Scholar 

  13. DeMillo, R.A., Lipton, R.J., Sayward, F.G.: Hints on test data selection: help for the practicing programmer. Computer 11(4), 34–41 (1978)

    Article  Google Scholar 

  14. Hamlet, R.G.: Testing programs with the aid of a compiler. IEEE Trans. Software Eng. 3(4), 279–290 (1977)

    Article  MathSciNet  Google Scholar 

  15. Delamaro, M.E., Maldonado, J.C., Mathur, A.P.: Interface mutation: an approach for integration testing. IEEE Trans. Software Eng. 27(3), 228–247 (2001)

    Article  Google Scholar 

  16. Vigna, G., Robertson, W.K., Balzarotti, D.: Testing network-based intrusion detection signatures using mutant exploits. In: Atluri, V., Pfitzmann, B., McDaniel, P.D. (eds.) Proceedings of the 11th ACM Conference on Computer and Communications Security, CCS 2004, Washington, DC, USA, 25–29 October 2004, pp. 21–30. ACM (2004)

    Google Scholar 

  17. Moran, K., et al.: MDroid+: a mutation testing framework for android. In: Chaudron, M., Crnkovic, I., Chechik, M., Harman, M. (eds.) Proceedings of the 40th International Conference on Software Engineering: Companion Proceeedings, ICSE 2018, Gothenburg, Sweden, 27 May–03 June 2018, pp. 33–36. ACM (2018)

    Google Scholar 

  18. Hu, Q., Ma, L., Xie, X., Yu, B., Liu, Y., Zhao, J.: DeepMutation++: a mutation testing framework for deep learning systems. In: 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019, San Diego, CA, USA, 11–15 November 2019, pp. 1158–1161. IEEE (2019)

    Google Scholar 

  19. Wang, J., Dong, G., Sun, J., Wang, X., Zhang, P.: Adversarial sample detection for deep neural network through model mutation testing. In: Atlee, J.M., Bultan, T., Whittle, J. (eds.) Proceedings of the 41st International Conference on Software Engineering, ICSE 2019, Montreal, QC, Canada, 25–31 May 2019, pp. 1245–1256. IEEE/ACM (2019)

    Google Scholar 

  20. Hessel, M., et al.: Rainbow: combining improvements in deep reinforcement learning. In: McIlraith, S.A., Weinberger, K.Q. (eds.) Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th Innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, 2–7 February 2018, pp. 3215–3222. AAAI Press (2018)

    Google Scholar 

  21. Kaiser, L., et al.: Model based reinforcement learning for atari. In: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, 26–30 April 2020 (2020)

    Google Scholar 

  22. Wen, L., Duan, J., Li, S.E., Xu, S., Peng, H.: Safe reinforcement learning for autonomous vehicles through parallel constrained policy optimization. CoRR, abs/2003.01303 (2020)

    Google Scholar 

  23. Szegedy, C., et al.: Intriguing properties of neural networks. In: Bengio, Y., LeCun, Y. (eds.) 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, 14–16 April 2014, Conference Track Proceedings (2014)

    Google Scholar 

  24. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: Bengio, Y., LeCun, Y. (eds.) 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015, Conference Track Proceedings (2015)

    Google Scholar 

  25. Katz, G., Barrett, C., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: an efficient SMT solver for verifying deep neural networks. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017. LNCS, vol. 10426, pp. 97–117. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63387-9_5

    Chapter  Google Scholar 

  26. Pei, K., Cao, Y., Yang, J., Jana, S.: DeepXplore: automated whitebox testing of deep learning systems. In: Proceedings of the 26th Symposium on Operating Systems Principles, Shanghai, China, 28–31 October 2017, pp. 1–18. ACM (2017)

    Google Scholar 

  27. Ma, L., et al.: DeepGauge: multi-granularity testing criteria for deep learning systems. In: Huchard, M., Kästner, C., Fraser, G. (eds.) Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, ASE 2018, Montpellier, France, 3–7 September 2018, pp. 120–131. ACM (2018)

    Google Scholar 

  28. Gerasimou, S., Eniser, H.F., Sen, A., Cakan, A.: Importance-driven deep learning system testing. In: ICSE 2020: 42nd International Conference on Software Engineering, Seoul, South Korea, June 27–19 July 2020, pp. 702–713. ACM (2020)

    Google Scholar 

  29. Uesato, J., et al.: Rigorous agent evaluation: an adversarial approach to uncover catastrophic failures. In: 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, 6–9 May 2019 (2019)

    Google Scholar 

  30. Tran, H.-D., Cai, F., Lopez, D.M., Musau, P., Johnson, T.T., Koutsoukos, X.D.: Safety verification of cyber-physical systems with reinforcement learning control. ACM Trans. Embed. Comput. Syst. 18(5s), 105:1–105:22 (2019)

    Google Scholar 

Download references

Acknowledgement

This research was supported by the Guangdong Science and Technology Department (Grant No. 2018B010107004) and the National Natural Science Foundation of China under Grant No. 62172019, 61772038, 61532019.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Meng Sun .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Lu, Y., Sun, W., Sun, M. (2021). Mutation Testing of Reinforcement Learning Systems. In: Qin, S., Woodcock, J., Zhang, W. (eds) Dependable Software Engineering. Theories, Tools, and Applications. SETTA 2021. Lecture Notes in Computer Science(), vol 13071. Springer, Cham. https://doi.org/10.1007/978-3-030-91265-9_8

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-91265-9_8

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-91264-2

  • Online ISBN: 978-3-030-91265-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics