Model-Driven Design and Generation of Training Simulators for Reinforcement Learning

  • Conference paper
Conceptual Modeling (ER 2024)

Abstract

Reinforcement learning (RL) is an important class of machine learning techniques in which intelligent agents optimize their behavior by observing and evaluating the outcomes of repeated interactions with their environment. A key to successfully engineering such agents is to give them the opportunity to engage in a large number of such interactions safely and at low cost. This is often achieved by developing simulators of these interactions, in which the agents can be trained while different training strategies and parameters are explored. However, specifying and implementing such simulators can be a complex endeavor that requires a systematic process for capturing and analyzing both the goals and actions of the agents and the characteristics of the target environment. We propose a framework for model-driven, goal-oriented development of RL simulation environments. The framework uses a set of extensions to a standard goal modeling notation that allow concise modeling of the many ways in which an intelligent agent can interact with its environment. Through subsequent formalization, the model is used by a specially constructed simulation engine to simulate agent behavior, so that off-the-shelf RL algorithms can use it as a training environment. We present the extension of the goal modeling language, sketch its semantics, and show how models built with it can be made executable.
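To make the idea of an executable model concrete, here is a minimal Python sketch. It is not the framework or the simulation engine described in the paper. It assumes a toy goal model whose root goal is an AND of sub-goals, each achievable by tasks that have a success probability and a cost. The names `Task`, `ToyGoalEnv`, and `q_learning`, the reward values, and the simplified `reset`/`step` interface (loosely following the Gym convention) are all illustrative assumptions.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A leaf task in a toy goal model (all fields are illustrative)."""
    name: str
    achieves: str        # sub-goal satisfied when the task succeeds
    success_prob: float  # chance that a single attempt succeeds
    cost: float          # cost paid on every attempt


class ToyGoalEnv:
    """Gym-style environment; the root goal is the AND of all sub-goals."""

    def __init__(self, tasks, subgoals, max_steps=10, seed=None):
        self.tasks = list(tasks)
        self.subgoals = tuple(subgoals)
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.done_goals = frozenset()
        self.steps = 0
        return self._obs()

    def _obs(self):
        # Observation: one 0/1 flag per sub-goal.
        return tuple(int(g in self.done_goals) for g in self.subgoals)

    def step(self, action):
        task = self.tasks[action]
        self.steps += 1
        reward = -task.cost
        if self.rng.random() < task.success_prob:
            self.done_goals = self.done_goals | {task.achieves}
        root_satisfied = all(g in self.done_goals for g in self.subgoals)
        if root_satisfied:
            reward += 10.0  # assumed bonus for satisfying the root goal
        done = root_satisfied or self.steps >= self.max_steps
        return self._obs(), reward, done


def q_learning(env, episodes=2000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning, standing in for an off-the-shelf RL algorithm."""
    rng = random.Random(seed)
    n_actions = len(env.tasks)
    q = {}
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            values = q.setdefault(state, [0.0] * n_actions)
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)  # explore
            else:
                action = max(range(n_actions), key=values.__getitem__)
            next_state, reward, done = env.step(action)
            next_values = q.setdefault(next_state, [0.0] * n_actions)
            target = reward + (0.0 if done else gamma * max(next_values))
            values[action] += alpha * (target - values[action])
            state = next_state
    return q


if __name__ == "__main__":
    tasks = [
        Task("charge_card", "paid", 0.9, 1.0),
        Task("ship_express", "shipped", 0.95, 3.0),
        Task("ship_standard", "shipped", 0.6, 1.0),
    ]
    env = ToyGoalEnv(tasks, subgoals=("paid", "shipped"), seed=1)
    q_table = q_learning(env)
    print(q_table[(0, 0)])  # learned action values in the initial state
```

Because the learner only calls `reset` and `step`, the environment can be regenerated from a different model without changing the training code, which is the separation the abstract points to.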


Author information

Corresponding author

Correspondence to Sotirios Liaskos.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Liaskos, S., Khan, S.M., Mylopoulos, J., Golipour, R. (2025). Model-Driven Design and Generation of Training Simulators for Reinforcement Learning. In: Maass, W., Han, H., Yasar, H., Multari, N. (eds) Conceptual Modeling. ER 2024. Lecture Notes in Computer Science, vol 15238. Springer, Cham. https://doi.org/10.1007/978-3-031-75872-0_10

  • DOI: https://doi.org/10.1007/978-3-031-75872-0_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-75871-3

  • Online ISBN: 978-3-031-75872-0

  • eBook Packages: Computer Science, Computer Science (R0)
