
Digging for Decision Trees: A Case Study in Strategy Sampling and Learning

  • Conference paper
Bridging the Gap Between AI and Reality (AISoLA 2024)

Abstract

We introduce a formal model of transportation in an open-pit mine for the purpose of optimising the mine’s operations. The model is a network of Markov automata (MA); the optimisation goal corresponds to maximising a time-bounded expected reward property. Today’s model checking algorithms exacerbate the state space explosion problem by applying a discretisation approach to such properties on MA. We show that model checking is infeasible even for small mine instances. Instead, we propose statistical model checking with lightweight strategy sampling or table-based Q-learning over untimed strategies as an alternative to approach the optimisation task, using the Modest Toolset’s modes tool. We add support for partial observability to modes so that strategies can be based on carefully selected model features, and we implement a connection from modes to the dtControl tool to convert sampled or learned strategies into decision trees. We experimentally evaluate the adequacy of our new tooling on the open-pit mine case study. Our experiments demonstrate the limitations of Q-learning, the impact of feature selection, and the usefulness of decision trees as an explainable representation.
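To make the table-based Q-learning mentioned in the abstract concrete, the following is a minimal sketch of the standard tabular update rule of Watkins and Dayan [39] on a small toy problem. It is not the implementation in modes. The two-stage environment, the state names ("pit", "near_shovel", "far_shovel"), the action labels, the rewards, and all parameter values are hypothetical and chosen only for illustration; unlike the Markov automata model of the paper, the toy problem is deterministic and untimed.

```python
import random

ACTIONS = ("near", "far")  # hypothetical action labels


def step(state, action):
    """Hypothetical two-stage toy model: pick a shovel, then pick a dump."""
    if state == "pit":
        # Driving to a shovel yields no reward yet.
        return action + "_shovel", 0.0, False
    load = 3.0 if state == "far_shovel" else 1.0  # assumed: far shovel loads more
    cost = 1.0 if action == "far" else 0.0        # assumed: far dump costs fuel
    return "done", load - cost, True


def q_learning(episodes=2000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) to the current value estimate
    for _ in range(episodes):
        state, done = "pit", False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            next_state, reward, done = step(state, action)
            # Bootstrap from the best known action in the successor state.
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
    return q


def greedy_strategy(q):
    """Extract the memoryless strategy that is greedy with respect to q."""
    states = {s for (s, _) in q}
    return {s: max(ACTIONS, key=lambda a: q.get((s, a), 0.0)) for s in states}


if __name__ == "__main__":
    table = q_learning()
    for state, action in sorted(greedy_strategy(table).items()):
        print(state, "->", action)
```

The resulting strategy is a plain table from observed states to actions. A table of this kind is the sort of object that a decision tree learner such as dtControl could then compress into an explainable representation.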

Data availability

The models and tools/scripts to reproduce our experimental evaluation are archived and available at DOI 10.5281/zenodo.13327230 [10].

Notes

  1. To ease the presentation, we assume actions to uniquely identify transitions per state.

  2. We use the average runtime of both properties per model and run, whose coefficient of variation was in almost all cases below 10% and in two cases 25.3% and 25.1%.
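For readers unfamiliar with the statistic in note 2, the coefficient of variation is the standard deviation divided by the mean. The sketch below assumes the sample standard deviation (the note does not say whether the sample or population variant was used), and the runtimes are hypothetical values, not data from the paper.

```python
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean (an assumed convention)."""
    return statistics.stdev(samples) / statistics.mean(samples)


# Hypothetical runtimes in seconds, for illustration only.
runtimes = [9.0, 10.0, 11.0]
print(f"{coefficient_of_variation(runtimes):.1%}")
```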

References

  1. Agha, G., Palmskog, K.: A survey of statistical model checking. ACM Trans. Model. Comput. Simul. 28(1), 6:1–6:39 (2018). https://doi.org/10.1145/3158668

  2. Alarie, S., Gamache, M.: Overview of solution strategies used in truck dispatching systems for open pit mines. Int. J. Surf. Min. Reclam. Environ. 16(1), 59–76 (2002). https://doi.org/10.1076/ijsm.16.1.59.3408

  3. Alur, R., Dill, D.L.: A theory of timed automata. Theor. Comput. Sci. 126(2), 183–235 (1994). https://doi.org/10.1016/0304-3975(94)90010-8

  4. Ashok, P., Butkova, Y., Hermanns, H., Křetínský, J.: Continuous-time Markov decisions based on partial exploration. In: Lahiri, S.K., Wang, C. (eds.) ATVA 2018. LNCS, vol. 11138, pp. 317–334. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01090-4_19

  5. Ashok, P., Jackermeier, M., Křetínský, J., Weinhuber, C., Weininger, M., Yadav, M.: dtControl 2.0: explainable strategy representation via decision tree learning steered by experts. In: TACAS 2021. LNCS, vol. 12652, pp. 326–345. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-72013-1_17

  6. Baier, C., de Alfaro, L., Forejt, V., Kwiatkowska, M.: Model checking probabilistic systems. In: Handbook of Model Checking, pp. 963–999. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-10575-8_28

  7. Behrmann, G., David, A., Larsen, K.G.: A tutorial on Uppaal. In: Bernardo, M., Corradini, F. (eds.) SFM-RT 2004. LNCS, vol. 3185, pp. 200–236. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30080-9_7

  8. Bellman, R.: A Markovian decision process. J. Mathem. Mech. 6(5), 679–684 (1957)

  9. Bohnenkamp, H.C., D’Argenio, P.R., Hermanns, H., Katoen, J.P.: MoDeST: a compositional modeling formalism for hard and softly timed systems. IEEE Trans. Software Eng. 32(10), 812–830 (2006). https://doi.org/10.1109/TSE.2006.104

  10. Budde, C.E., D’Argenio, P.R., Hartmanns, A.: Artifact for digging for decision trees: a case study in strategy sampling and learning. Zenodo (2024). https://doi.org/10.5281/zenodo.13327230

  11. Budde, C.E., D’Argenio, P.R., Hartmanns, A., Sedwards, S.: An efficient statistical model checker for nondeterminism and rare events. Int. J. Softw. Tools Technol. Transf. 22(6), 759–780 (2020). https://doi.org/10.1007/S10009-020-00563-2

  12. Bulychev, P.E., et al.: Uppaal-SMC: statistical model checking for priced timed automata. In: Wiklicky, H., Massink, M. (eds.) 10th Workshop on Quantitative Aspects of Programming Languages and Systems (QAPL). EPTCS, vol. 85, pp. 1–16 (2012). https://doi.org/10.4204/EPTCS.85.1

  13. Butkova, Y., Fox, G.: Optimal time-bounded reachability analysis for concurrent systems. In: Vojnar, T., Zhang, L. (eds.) TACAS 2019. LNCS, vol. 11428, pp. 191–208. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-17465-1_11

  14. Butkova, Y., Hartmanns, A., Hermanns, H.: A Modest approach to Markov automata. ACM Trans. Model. Comput. Simul. 31(3), 14:1–14:34 (2021). https://doi.org/10.1145/3449355

  15. Butkova, Y., Hatefi, H., Hermanns, H., Krčál, J.: Optimal continuous time Markov decisions. In: Finkbeiner, B., Pu, G., Zhang, L. (eds.) ATVA 2015. LNCS, vol. 9364, pp. 166–182. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24953-7_12

  16. Bäuerle, N., Rieder, U.: Markov decision processes with applications to finance. Springer (2011). https://doi.org/10.1007/978-3-642-18324-9

  17. D’Argenio, P.R., Hartmanns, A., Sedwards, S.: Lightweight statistical model checking in nondeterministic continuous time. In: Margaria, T., Steffen, B. (eds.) ISoLA 2018. LNCS, vol. 11245, pp. 336–353. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-03421-4_22

  18. D’Argenio, P.R., Legay, A., Sedwards, S., Traonouez, L.M.: Smart sampling for lightweight verification of Markov decision processes. Int. J. Softw. Tools Technol. Transf. 17(4), 469–484 (2015). https://doi.org/10.1007/S10009-015-0383-0

  19. David, A., Jensen, P.G., Larsen, K.G., Mikučionis, M., Taankvist, J.H.: Uppaal Stratego. In: Baier, C., Tinelli, C. (eds.) TACAS 2015. LNCS, vol. 9035, pp. 206–211. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46681-0_16

  20. Eisentraut, C., Hermanns, H., Zhang, L.: On probabilistic automata in continuous time. In: 25th Annual IEEE Symposium on Logic in Computer Science (LICS), pp. 342–351. IEEE Computer Society (2010). https://doi.org/10.1109/LICS.2010.41

  21. Gros, T.P., et al.: DSMC evaluation stages: fostering robust and safe behavior in deep reinforcement learning – extended version. ACM Trans. Model. Comput. Simul. 33(4), 17:1–17:28 (2023). https://doi.org/10.1145/3607198

  22. Hahn, E.M., Hartmanns, A., Hermanns, H., Katoen, J.P.: A compositional modelling and analysis framework for stochastic hybrid systems. Formal Methods Syst. Des. 43(2), 191–232 (2013). https://doi.org/10.1007/S10703-012-0167-Z

  23. Hartmanns, A., Hermanns, H.: The Modest Toolset: an integrated environment for quantitative modelling and verification. In: Ábrahám, E., Havelund, K. (eds.) TACAS 2014. LNCS, vol. 8413, pp. 593–598. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54862-8_51

  24. Hartmanns, A., Hermanns, H.: A Modest Markov automata tutorial. In: Krötzsch, M., Stepanova, D. (eds.) Reasoning Web. Explainable Artificial Intelligence. LNCS, vol. 11810, pp. 250–276. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-31423-1_8

  25. Hartmanns, A., Junges, S., Quatmann, T., Weininger, M.: A practitioner’s guide to MDP model checking algorithms. In: Sankaranarayanan, S., Sharygina, N. (eds.) 29th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS). LNCS, vol. 13993, pp. 469–488. Springer (2023). https://doi.org/10.1007/978-3-031-30823-9_24

  26. Hartmanns, A., Klauck, M.: The Modest state of learning, sampling, and verifying strategies. In: Margaria, T., Steffen, B. (eds.) 11th International Symposium on Leveraging Applications of Formal Methods, Verification and Validation (ISoLA). LNCS, vol. 13703, pp. 406–432. Springer (2022). https://doi.org/10.1007/978-3-031-19759-8_25

  27. Hatefi-Ardakani, H.: Finite horizon analysis of Markov automata. Ph.D. thesis, Saarland University, Germany (2017). http://scidok.sulb.uni-saarland.de/volltexte/2017/6743/

  28. Hensel, C., Junges, S., Katoen, J.P., Quatmann, T., Volk, M.: The probabilistic model checker Storm. Int. J. Softw. Tools Technol. Transf. 24(4), 589–610 (2022). https://doi.org/10.1007/S10009-021-00633-Z

  29. Jensen, P.G., Larsen, K.G., Mikučionis, M.: Playing Wordle with Uppaal Stratego. In: Jansen, N., Stoelinga, M., van den Bos, P. (eds.) A Journey from Process Algebra via Timed Automata to Model Learning - Essays Dedicated to Frits Vaandrager on the Occasion of His 60th Birthday. LNCS, vol. 13560, pp. 283–305. Springer (2022). https://doi.org/10.1007/978-3-031-15629-8_15

  30. Kaelbling, L.P., Littman, M.L., Cassandra, A.R.: Planning and acting in partially observable stochastic domains. Artif. Intell. 101(1–2), 99–134 (1998). https://doi.org/10.1016/S0004-3702(98)00023-X

  31. Křetínský, J., Meggendorfer, T.: Of cores: a partial-exploration framework for Markov decision processes. Log. Methods Comput. Sci. 16(4) (2020). https://lmcs.episciences.org/6833

  32. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22110-1_47

  33. Law, A.M.: Simulation Modeling and Analysis, 5th edn. McGraw-Hill series in industrial engineering and management science, McGraw-Hill Education (2015)

  34. Legay, A., Sedwards, S., Traonouez, L.-M.: Scalable verification of Markov decision processes. In: Canal, C., Idani, A. (eds.) SEFM 2014. LNCS, vol. 8938, pp. 350–362. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-15201-1_23

  35. Moradi Afrapoli, A., Askari-Nasab, H.: Mining fleet management systems: a review of models and algorithms. Int. J. Min. Reclam. Environ. 33(1), 42–60 (2019). https://doi.org/10.1080/17480930.2017.1336607

  36. Norman, G., Parker, D., Zou, X.: Verification and control of partially observable probabilistic systems. Real Time Syst. 53(3), 354–402 (2017). https://doi.org/10.1007/S11241-017-9269-4

  37. Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley Series in Probability and Statistics, Wiley (1994). https://doi.org/10.1002/9780470316887

  38. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. Adaptive Computation and Machine Learning, MIT Press (1998)

  39. Watkins, C.J.C.H., Dayan, P.: Q-learning. Mach. Learn. 8, 279–292 (1992). https://doi.org/10.1007/BF00992698

Acknowledgments

We are grateful to Matías D. Lee and Joaquín Feltes for discussions and insights on early versions of the open-pit mine model.

Funding

This work was supported by Agencia I+D+i grant PICT 2022-09-00580 (CoSMoSS), the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreements 101008233 (MISSION) and 101067199 (ProSVED), the Interreg North Sea project STORM_SAFE, the NextGenerationEU projects D53D23008400006 (SMARTITUDE) under the MUR PRIN 2022 and PE00000014 (SERICS) under the MUR PNRR, NWO VIDI grant VI.Vidi.223.110 (TruSTy), and SeCyT-UNC grant 33620230100384CB (MECANO).

Author information

Corresponding author

Correspondence to Arnd Hartmanns.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Budde, C.E., D’Argenio, P.R., Hartmanns, A. (2025). Digging for Decision Trees: A Case Study in Strategy Sampling and Learning. In: Steffen, B. (eds) Bridging the Gap Between AI and Reality. AISoLA 2024. Lecture Notes in Computer Science, vol 15217. Springer, Cham. https://doi.org/10.1007/978-3-031-75434-0_24

  • DOI: https://doi.org/10.1007/978-3-031-75434-0_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-75433-3

  • Online ISBN: 978-3-031-75434-0

  • eBook Packages: Computer Science, Computer Science (R0)
