Deep Explainable Relational Reinforcement Learning: A Neuro-Symbolic Approach

  • Conference paper
Machine Learning and Knowledge Discovery in Databases: Research Track (ECML PKDD 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14172)

Abstract

Despite its successes, Deep Reinforcement Learning (DRL) yields non-interpretable policies. Moreover, since DRL does not exploit symbolic relational representations, it has difficulty coping with structural changes in its environment (such as increasing the number of objects). Meanwhile, Relational Reinforcement Learning inherits relational representations from symbolic planning to learn reusable policies. However, it has so far been unable to scale up and exploit the power of deep neural networks. We propose Deep Explainable Relational Reinforcement Learning (DERRL), a framework that exploits the best of both the neural and the symbolic worlds. By resorting to a neuro-symbolic approach, DERRL combines relational representations and constraints from symbolic planning with deep learning to extract interpretable policies. These policies take the form of logical rules that explain why each decision (or action) is arrived at. Through several experiments in setups like the Countdown Game, Blocks World, Gridworld, Traffic, and Minigrid, we show that the policies learned by DERRL are adaptable to varying configurations and environmental changes.
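
To give a flavour of such rule-based policies, a purely illustrative clause (the predicates are hypothetical and not taken from the paper) for a Blocks World action might read \(\text{move}(X, \text{floor}) \leftarrow \text{on}(X, Y) \wedge \lnot \text{goalOn}(X, Y)\), i.e. unstack block \(X\) onto the floor whenever it currently sits on a block it should not be on. Because such rules are expressed over variables rather than concrete objects, they carry over unchanged when objects are added or removed.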

Notes

  1. DERRL uses a relational representation akin to Quinlan’s FOIL [36] with background knowledge comprising ground facts and non-recursive Datalog-formulated rules.

  2. Here, the outer tuple denotes stacks and the inner tuples denote the blocks in each stack. For example, ((ab), (cd)) has two stacks: (ab) is stack 1 and (cd) is stack 2.

  3. More generally, given a vector \(\boldsymbol{y} \in [0, 1]^n\), the Lukasiewicz t-norm is \(\max\big(0, \sum_{i=1}^{n} y_i - (n-1)\big)\). See Appendix in [15] for proof. A small worked example is given after these notes.

  4. This assumes a specified upper bound on the number of rules for each action, similar to selecting the number of clusters in a clustering algorithm [43].

  5. add: acc \(\mathrel {+}=\) top; sub: acc \(\mathrel {-}=\) top; null: acc unchanged. A minimal sketch of these action semantics is given after these notes.

  6. goal(X) is provided as background since it does not change during the episode.

  7. When the target is to the southeast and the agent encounters a target to its right, it will travel north (up) rather than south (down).
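
As a worked illustration of note 3 (assuming the standard n-ary Lukasiewicz t-norm \(\max\big(0, \sum_{i} y_i - (n-1)\big)\)): for \(\boldsymbol{y} = (1.0, 0.9, 0.8)\) and \(n = 3\), the value is \(\max(0, 2.7 - 2) = 0.7\); it equals 1 only when every component is 1, and it collapses to 0 for \(\boldsymbol{y} = (0.5, 0.6, 0.7)\), since \(\max(0, 1.8 - 2) = 0\).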
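
A minimal Python sketch of the Countdown-game action semantics described in note 5. Only the accumulator updates (add, sub, null) come from that note; the function name, signature, and error handling are illustrative assumptions.

    # Illustrative only: apply one Countdown-game action to the accumulator,
    # where `top` is the value currently on top of the stack (cf. note 5).
    def apply_action(action: str, acc: int, top: int) -> int:
        if action == "add":   # acc += top
            return acc + top
        if action == "sub":   # acc -= top
            return acc - top
        if action == "null":  # accumulator left unchanged
            return acc
        raise ValueError(f"unknown action: {action}")

For example, apply_action("sub", 10, 3) returns 7.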

References

  1. Arulkumaran, K., Deisenroth, M.P., Brundage, M., Bharath, A.A.: Deep reinforcement learning: a brief survey. IEEE Signal Process. Mag. 34(6), 26–38 (2017)

  2. De Raedt, L.: Logical and relational learning. In: Zaverucha, G., da Costa, A.L. (eds.) SBIA 2008. LNCS (LNAI), vol. 5249, pp. 1–1. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88190-2_1

  3. De Raedt, L., Kersting, K., Natarajan, S., Poole, D.: Statistical Relational Artificial Intelligence: Logic, Probability, and Computation, Synthesis Lectures on Artificial Intelligence and Machine Learning, vol. 32. Morgan & Claypool, San Rafael (2016)

  4. Driessens, K., Džeroski, S.: Integrating guidance into relational reinforcement learning. Mach. Learn. 57(3), 271–304 (2004)

  5. Džeroski, S., De Raedt, L., Blockeel, H.: Relational reinforcement learning. In: Proceedings of the Fifteenth International Conference on Machine Learning, ICML 1998, pp. 136–143. Morgan Kaufmann Publishers Inc., San Francisco (1998)

  6. Ernst, D., Geurts, P., Wehenkel, L.: Tree-based batch mode reinforcement learning. J. Mach. Learn. Res. 6(18), 503–556 (2005)

  7. Evans, R., Grefenstette, E.: Learning explanatory rules from noisy data. J. Artif. Int. Res. 61(1), 1–64 (2018)

  8. Fikes, R.E., Nilsson, N.J.: STRIPS: a new approach to the application of theorem proving to problem solving. Artif. Intell. 2(3), 189–208 (1971)

  9. Frosst, N., Hinton, G.E.: Distilling a neural network into a soft decision tree. CoRR abs/1711.09784 (2017)

  10. Garg, S., Bajpai, A., et al.: Size independent neural transfer for RDDL planning. In: Proceedings of the International Conference on Automated Planning and Scheduling, vol. 29, pp. 631–636 (2019)

  11. Garg, S., Bajpai, A., et al.: Symbolic network: generalized neural policies for relational MDPs. In: International Conference on Machine Learning, pp. 3397–3407. PMLR (2020)

  12. Gelfond, M., Lifschitz, V.: Action languages. Electron. Trans. Artif. Intell. 3, 195–210 (1998)

  13. Ghallab, M., et al.: PDDL–The Planning Domain Definition Language (1998)

  14. Gilpin, L.H., Bau, D., Yuan, B.Z., Bajwa, A., Specter, M.A., Kagal, L.: Explaining explanations: an approach to evaluating interpretability of machine learning. CoRR abs/1806.00069 (2018)

  15. Hazra, R., De Raedt, L.: Deep explainable relational reinforcement learning: a neuro-symbolic approach. arXiv preprint arXiv:2304.08349 (2023)

  16. Iyer, R., Li, Y., Li, H., Lewis, M., Sundar, R., Sycara, K.: Transparency and explanation in deep reinforcement learning neural networks. In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, AIES 2018, pp. 144–150. Association for Computing Machinery, New York (2018)

  17. Jang, E., Gu, S., Poole, B.: Categorical reparameterization with Gumbel-Softmax. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track Proceedings. OpenReview.net (2017)

  18. Janisch, J., Pevný, T., Lisý, V.: Symbolic relational deep reinforcement learning based on graph neural networks. arXiv preprint arXiv:2009.12462 (2020)

  19. Jiang, Z., Luo, S.: Neural logic reinforcement learning. In: Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 97, pp. 3110–3119. PMLR, Long Beach, USA (2019)

  20. Kersting, K., Driessens, K.: Non-parametric policy gradients: A unified treatment of propositional and relational domains. In: Proceedings of the 25th International Conference on Machine Learning, pp. 456–463 (2008)

  21. Kersting, K., Otterlo, M.V., De Raedt, L.: Bellman goes relational. In: Proceedings of the Twenty-First International Conference on Machine Learning, p. 59 (2004)

  22. Khoshafian, S.N., Copeland, G.P.: Object identity. ACM SIGPLAN Notices 21(11), 406–416 (1986)

  23. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th International Conference on Learning Representations, ICLR 2017 (2017)

  24. Kokel, H., Manoharan, A., Natarajan, S., Ravindran, B., Tadepalli, P.: RePReL: integrating relational planning and reinforcement learning for effective abstraction. In: Proceedings of the International Conference on Automated Planning and Scheduling, vol. 31, pp. 533–541 (2021)

  25. Krajzewicz, D., Erdmann, J., Behrisch, M., Bieker, L.: Recent development and applications of SUMO - Simulation of Urban Mobility. Int. J. Adv. Syst. Measur. 5(3&4) (2012)

  26. Lamb, L.C., Garcez, A.d., Gori, M., Prates, M.O., Avelar, P.H., Vardi, M.Y.: Graph neural networks meet neural-symbolic computing: a survey and perspective. In: Bessiere, C. (ed.) Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI 2020 (Survey Track), pp. 4877–4884. International Joint Conferences on Artificial Intelligence Organization, Yokohama, Japan (2020)

  27. Lee, J.D., See, K.A.: Trust in automation: designing for appropriate reliance. Hum. Factors 46(1), 50–80 (2004)

  28. Liu, G., Schulte, O., Zhu, W., Li, Q.: Toward interpretable deep reinforcement learning with linear model U-trees. In: ECML/PKDD (2018)

  29. Lloyd, J.W.: Foundations of Logic Programming. Springer, Heidelberg (1984)

  30. Lyu, D., Yang, F., Liu, B., Gustafson, S.: SDRL: interpretable and data-efficient deep reinforcement learning leveraging symbolic planning. In: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, AAAI'19/IAAI'19/EAAI'19. AAAI Press, Honolulu, Hawaii, USA (2019)

  31. Marra, G., Giannini, F., Diligenti, M., Maggini, M., Gori, M.: T-norms driven loss functions for machine learning. arXiv: Artificial Intelligence (2019)

  32. Martínez, D., Alenya, G., Torras, C.: Relational reinforcement learning with guided demonstrations. Artif. Intell. 247, 295–312 (2017)

  33. Muggleton, S., De Raedt, L.: Inductive logic programming: theory and methods. J. Log. Program. 19–20, 629–679 (1994)

  34. Nau, D., Ghallab, M., Traverso, P.: Automated Planning: Theory & Practice. Morgan Kaufmann Publishers Inc., San Francisco (2004)

  35. Payani, A., Fekri, F.: Incorporating relational background knowledge into reinforcement learning via differentiable inductive logic programming. arXiv preprint arXiv:2003.10386 (2020)

  36. Quinlan, J.R.: Learning logical definitions from relations. Mach. Learn. 5(3), 239–266 (1990)

  37. Silva, A., Gombolay, M., Killian, T., Jimenez, I., Son, S.H.: Optimization methods for interpretable differentiable decision trees applied to reinforcement learning. In: Chiappa, S., Calandra, R. (eds.) Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research, vol. 108, pp. 1855–1865. PMLR, Palermo, Italy, 26–28 Aug 2020

  38. Stowers, K., Kasdaglis, N., Newton, O.B., Lakhmani, S.G., Wohleber, R.W., Chen, J.Y.: Intelligent agent transparency. Proceedings of the Human Factors and Ergonomics Society Annual Meeting 60, 1706–1710 (2016)

  39. Tadepalli, P., Givan, R., Driessens, K.: Relational reinforcement learning: an overview. In: Proceedings of the ICML’04 Workshop on Relational Reinforcement Learning (2004)

  40. de Visser, E.J., Cohen, M., Freedy, A., Parasuraman, R.: A design methodology for trust cue calibration in cognitive agents. In: Shumaker, R., Lackey, S. (eds.) Virtual, Augmented and Mixed Reality. Designing and Developing Virtual and Augmented Environments, pp. 251–262. Springer International Publishing, Cham (2014)

  41. Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 229–256 (1992)

  42. Xu, J., Zhang, Z., Friedman, T., Liang, Y., Van den Broeck, G.: A semantic loss function for deep learning with symbolic knowledge. In: International Conference on Machine Learning, pp. 5502–5511. PMLR (2018)

  43. Xu, R., Wunsch, D.: Survey of clustering algorithms. IEEE Trans. Neural Networks 16(3), 645–678 (2005). https://doi.org/10.1109/TNN.2005.845141

  44. Yang, F., Lyu, D., Liu, B., Gustafson, S.: PEORL: integrating symbolic planning and hierarchical reinforcement learning for robust decision-making. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018, pp. 4860–4866. AAAI Press, Stockholm, Sweden (2018)

  45. Zambaldi, V., et al.: Deep reinforcement learning with relational inductive biases. In: International Conference on Learning Representations (2019)

Acknowledgements

This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.

Author information

Corresponding author

Correspondence to Rishi Hazra.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Hazra, R., De Raedt, L. (2023). Deep Explainable Relational Reinforcement Learning: A Neuro-Symbolic Approach. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science (LNAI), vol 14172. Springer, Cham. https://doi.org/10.1007/978-3-031-43421-1_13

  • DOI: https://doi.org/10.1007/978-3-031-43421-1_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43420-4

  • Online ISBN: 978-3-031-43421-1

  • eBook Packages: Computer Science; Computer Science (R0)
