
Verifiable and Scalable Mission-Plan Synthesis for Autonomous Agents

  • Conference paper
Formal Methods for Industrial Critical Systems (FMICS 2020)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 12327)

Abstract

Synthesizing mission plans for multiple autonomous agents, including path planning and task scheduling, is a complex problem. Employing model checking alone to solve it might not be feasible, especially when the number of agents grows or the requirements include real-time constraints. In this paper, we propose a novel approach called MCRL that integrates model checking and reinforcement learning to overcome this limitation. Our approach employs timed automata and timed computation tree logic to describe the autonomous agents’ behavior and requirements, and trains the model by a reinforcement learning algorithm, namely Q-learning, to populate a table that is then used to restrict the state space of the model. Our method provides means not only to synthesize mission plans for multi-agent systems whose complexity exceeds the scalability boundaries of exhaustive model checking, but also to analyze and verify the synthesized mission plans against the given requirements. We evaluate the proposed method on various scenarios involving autonomous agents and present comparisons with two similar approaches, TAMAA and UPPAAL STRATEGO. The evaluation shows that MCRL performs better when the number of agents exceeds three.
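To make the Q-learning ingredient concrete, below is a minimal tabular Q-learning sketch on a toy one-dimensional corridor. The environment, reward values, and hyperparameters are illustrative assumptions, not the paper's MCRL setup; the sketch only shows how a learned Q-table singles out one action per state, analogous to how MCRL uses such a table to restrict the state space the model checker must explore.

```python
import random

# Toy 1-D corridor: agent starts at cell 0, goal at cell N-1.
# Actions: 0 = move left, 1 = move right. Illustrative only; not MCRL's models.
N = 6
GOAL = N - 1
ACTIONS = (0, 1)

def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if nxt == GOAL else -0.01  # small step cost favors short plans
    return nxt, reward, nxt == GOAL

def q_learn(episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy exploration.
            a = rng.choice(ACTIONS) if rng.random() < eps else max(ACTIONS, key=lambda b: q[(s, b)])
            nxt, r, done = step(s, a)
            # Q-learning update: Q(s,a) += alpha * (r + gamma * max_b Q(s',b) - Q(s,a))
            q[(s, a)] += alpha * (r + gamma * max(q[(nxt, b)] for b in ACTIONS) - q[(s, a)])
            s = nxt
    return q

def greedy_plan(q, max_steps=2 * N):
    # One best action per state: this restricted behavior is what a model
    # checker can then verify exhaustively against the timed requirements.
    s, plan = 0, []
    for _ in range(max_steps):  # step cap guards against an untrained table
        if s == GOAL:
            break
        a = max(ACTIONS, key=lambda b: q[(s, b)])
        plan.append(a)
        s, _, _ = step(s, a)
    return plan

if __name__ == "__main__":
    print(greedy_plan(q_learn()))  # five "move right" actions: [1, 1, 1, 1, 1]
```

In the paper's setting, as the abstract describes, the table is learned over the timed-automata model instead of a toy grid, and the restricted model is then analyzed against timed-computation-tree-logic requirements.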


Notes

  1. UPPAAL 4.1.22 was published in March 2019 on http://www.uppaal.org/.

  2. Further visual information on the mission plans in TAMAA can be found at http://doi.org/10.5281/zenodo.3731960.


Acknowledgement

The research leading to the presented results has been undertaken within the research profile DPAC - Dependable Platform for Autonomous Systems and Control project, funded by the Swedish Knowledge Foundation, grant number: 20150022.

Author information

Correspondence to Rong Gu.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Gu, R., Enoiu, E., Seceleanu, C., Lundqvist, K. (2020). Verifiable and Scalable Mission-Plan Synthesis for Autonomous Agents. In: ter Beek, M.H., Ničković, D. (eds) Formal Methods for Industrial Critical Systems. FMICS 2020. Lecture Notes in Computer Science, vol 12327. Springer, Cham. https://doi.org/10.1007/978-3-030-58298-2_2


  • DOI: https://doi.org/10.1007/978-3-030-58298-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58297-5

  • Online ISBN: 978-3-030-58298-2

  • eBook Packages: Computer Science (R0)
