Abstract
Despite its successes, Deep Reinforcement Learning (DRL) yields non-interpretable policies. Moreover, since DRL does not exploit symbolic relational representations, it has difficulties in coping with structural changes in its environment (such as increasing the number of objects). Meanwhile, Relational Reinforcement Learning inherits the relational representations from symbolic planning to learn reusable policies. However, it has so far been unable to scale up and exploit the power of deep neural networks. We propose Deep Explainable Relational Reinforcement Learning (DERRL), a framework that exploits the best of both the neural and the symbolic worlds. By resorting to a neuro-symbolic approach, DERRL combines relational representations and constraints from symbolic planning with deep learning to extract interpretable policies. These policies are in the form of logical rules that explain why each decision (or action) is taken. Through several experiments, in setups like the Countdown Game, Blocks World, Gridworld, Traffic, and Minigrid, we show that the policies learned by DERRL are adaptable to varying configurations and environmental changes.
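To make the form of such rule-based policies concrete, the following is a minimal, hypothetical Python sketch of a relational policy rule of the kind DERRL extracts, evaluated against ground facts of a Blocks World state. The predicates on/2 and clear/2, the rule move(X, Y) :- clear(X), clear(Y), and the example state are illustrative assumptions, not rules reported in the paper.

# Hypothetical illustration: a rule-shaped relational policy evaluated against
# ground facts of a Blocks World state. The predicates and the rule are assumed
# for exposition; they are not the exact rules learned in the paper.
from itertools import permutations

# Ground facts describing a state: on(a, b), on(c, d), clear(a), clear(c)
facts = {("on", "a", "b"), ("on", "c", "d"), ("clear", "a"), ("clear", "c")}
blocks = {"a", "b", "c", "d"}

# Interpretable policy rule:  move(X, Y) :- clear(X), clear(Y), X != Y
def move_candidates(facts, blocks):
    for x, y in permutations(sorted(blocks), 2):
        if ("clear", x) in facts and ("clear", y) in facts:
            # The satisfied rule body is the explanation for proposing this action.
            yield ("move", x, y)

print(list(move_candidates(facts, blocks)))  # [('move', 'a', 'c'), ('move', 'c', 'a')]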
Notes
- 1.
DERRL uses a relational representation akin to Quinlan’s FOIL [36] with background knowledge comprising ground facts and non-recursive Datalog-formulated rules.
- 2.
Here, the outer tuple denotes stacks and the inner tuples denote the blocks in the stack. For e.g., ((a, b), (c, d)) has two stacks: (a, b) is stack 1 and (c, d) is stack 2.
- 3.
More generally, given a vector \(\boldsymbol{y} \in [0, 1]^n\), the Łukasiewicz t-norm is \(\max \left (0, \sum _{i=1}^{n} y_i - (n-1)\right )\). See Appendix in [15] for proof.
- 4.
This assumes a specified upper bound on the number of rules for each action, similar to selecting the number of clusters in a clustering algorithm [43].
- 5.
add: acc \(\mathrel {+}=\) top; sub: acc \(\mathrel {-}=\) top; null: acc unchanged (see the sketch after these notes).
- 6.
goal(X) is provided as background since it does not change during the episode.
- 7.
When the target is to the southeast, and the agent encounters a target to its right, it will travel north (up) rather than south (down).
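As a concrete reading of note 5, the following minimal Python sketch spells out the assumed effect of each Countdown Game action on the accumulator acc, where top denotes the current number; the function name, the example episode, and the target value are illustrative assumptions, not code from the paper.

# Assumed semantics of the Countdown Game actions from note 5: each action
# combines the accumulator `acc` with the current number `top`; `null`
# leaves the accumulator unchanged.
def apply_action(action: str, acc: int, top: int) -> int:
    if action == "add":
        return acc + top   # add: acc += top
    if action == "sub":
        return acc - top   # sub: acc -= top
    if action == "null":
        return acc         # null: acc unchanged
    raise ValueError(f"unknown action: {action}")

# Example episode reaching a target of 7 from the numbers 5, 3, 1
acc = 0
for action, top in [("add", 5), ("add", 3), ("sub", 1)]:
    acc = apply_action(action, acc, top)
print(acc)  # 7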
References
Arulkumaran, K., Deisenroth, M.P., Brundage, M., Bharath, A.A.: Deep reinforcement learning: a brief survey. IEEE Signal Process. Mag. 34(6), 26–38 (2017)
De Raedt, L.: Logical and relational learning. In: Zaverucha, G., da Costa, A.L. (eds.) SBIA 2008. LNCS (LNAI), vol. 5249, pp. 1–1. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88190-2_1
De Raedt, L., Kersting, K., Natarajan, S., Poole, D.: Statistical Relational Artificial Intelligence: Logic, Probability, and Computation, Synthesis Lectures on Artificial Intelligence and Machine Learning, vol. 32. Morgan & Claypool, San Rafael (2016)
Driessens, K., Džeroski, S.: Integrating guidance into relational reinforcement learning. Mach. Learn. 57(3), 271–304 (2004)
Džeroski, S., De Raedt, L., Blockeel, H.: Relational reinforcement learning. In: Proceedings of the Fifteenth International Conference on Machine Learning, ICML 1998, pp. 136–143. Morgan Kaufmann Publishers Inc., San Francisco (1998)
Ernst, D., Geurts, P., Wehenkel, L.: Tree-based batch mode reinforcement learning. J. Mach. Learn. Res. 6(18), 503–556 (2005)
Evans, R., Grefenstette, E.: Learning explanatory rules from noisy data. J. Artif. Int. Res. 61(1), 1–64 (2018)
Fikes, R.E., Nilsson, N.J.: Strips: a new approach to the application of theorem proving to problem solving. Artif. Intell. 2(3), 189–208 (1971)
Frosst, N., Hinton, G.E.: Distilling a neural network into a soft decision tree. CoRR abs/1711.09784 (2017)
Garg, S., Bajpai, A., et al.: Size independent neural transfer for RDDL planning. In: Proceedings of the International Conference on Automated Planning and Scheduling, vol. 29, pp. 631–636 (2019)
Garg, S., Bajpai, A., et al.: Symbolic network: generalized neural policies for relational MDPs. In: International Conference on Machine Learning, pp. 3397–3407. PMLR (2020)
Gelfond, M., Lifschitz, V.: Action languages. Electron. Trans. Artif. Intell. 3, 195–210 (1998)
Ghallab, M., et al.: PDDL–The Planning Domain Definition Language (1998)
Gilpin, L.H., Bau, D., Yuan, B.Z., Bajwa, A., Specter, M.A., Kagal, L.: Explaining explanations: an approach to evaluating interpretability of machine learning. CoRR abs/1806.00069 (2018)
Hazra, R., De Raedt, L.: Deep explainable relational reinforcement learning: a neuro-symbolic approach. arXiv preprint arXiv:2304.08349 (2023)
Iyer, R., Li, Y., Li, H., Lewis, M., Sundar, R., Sycara, K.: Transparency and explanation in deep reinforcement learning neural networks. In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, AIES 2018, pp. 144–150. Association for Computing Machinery, New York (2018)
Jang, E., Gu, S., Poole, B.: Categorical reparameterization with Gumbel-Softmax. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track Proceedings. OpenReview.net (2017)
Janisch, J., Pevný, T., Lisý, V.: Symbolic relational deep reinforcement learning based on graph neural networks. arXiv preprint arXiv:2009.12462 (2020)
Jiang, Z., Luo, S.: Neural logic reinforcement learning. In: Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 97, pp. 3110–3119. PMLR, Long Beach, USA (09–15 Jun 2019)
Kersting, K., Driessens, K.: Non-parametric policy gradients: a unified treatment of propositional and relational domains. In: Proceedings of the 25th International Conference on Machine Learning, pp. 456–463 (2008)
Kersting, K., Otterlo, M.V., De Raedt, L.: Bellman goes relational. In: Proceedings of the Twenty-First International Conference on Machine Learning, p. 59 (2004)
Khoshafian, S.N., Copeland, G.P.: Object identity. ACM SIGPLAN Notices 21(11), 406–416 (1986)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th International Conference on Learning Representations, ICLR 2017 (2017)
Kokel, H., Manoharan, A., Natarajan, S., Ravindran, B., Tadepalli, P.: RePReL: integrating relational planning and reinforcement learning for effective abstraction. In: Proceedings of the International Conference on Automated Planning and Scheduling, vol. 31, pp. 533–541 (2021)
Krajzewicz, D., Erdmann, J., Behrisch, M., Bieker, L.: Recent development and applications of SUMO - Simulation of Urban Mobility. Int. J. Adv. Syst. Measur. 5(3&4) (2012)
Lamb, L.C., Garcez, A.d., Gori, M., Prates, M.O., Avelar, P.H., Vardi, M.Y.: Graph neural networks meet neural-symbolic computing: a survey and perspective. In: Bessiere, C. (ed.) Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI-20, pp. 4877–4884. International Joint Conferences on Artificial Intelligence Organization, Yokohama, Japan (2020). Survey track
Lee, J.D., See, K.A.: Trust in automation: designing for appropriate reliance. Hum. Factors 46(1), 50–80 (2004)
Liu, G., Schulte, O., Zhu, W., Li, Q.: Toward interpretable deep reinforcement learning with linear model u-trees. In: ECML/PKDD (2018)
Lloyd, J.W.: Foundations of Logic Programming. Springer, Heidelberg (1984)
Lyu, D., Yang, F., Liu, B., Gustafson, S.: SDRL: interpretable and data-efficient deep reinforcement learning leveraging symbolic planning. In: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence. AAAI’19/IAAI’19/EAAI’19, AAAI Press, Honolulu, Hawaii, USA (2019)
Marra, G., Giannini, F., Diligenti, M., Maggini, M., Gori, M.: T-norms driven loss functions for machine learning. arXiv: Artificial Intelligence (2019)
Martínez, D., Alenya, G., Torras, C.: Relational reinforcement learning with guided demonstrations. Artif. Intell. 247, 295–312 (2017)
Muggleton, S., De Raedt, L.: Inductive logic programming: theory and methods. J. Log. Program. 19–20, 629–679 (1994)
Nau, D., Ghallab, M., Traverso, P.: Automated Planning: Theory & Practice. Morgan Kaufmann Publishers Inc., San Francisco (2004)
Payani, A., Fekri, F.: Incorporating relational background knowledge into reinforcement learning via differentiable inductive logic programming. arXiv preprint arXiv:2003.10386 (2020)
Quinlan, J.R.: Learning logical definitions from relations. Mach. Learn. 5(3), 239–266 (1990)
Silva, A., Gombolay, M., Killian, T., Jimenez, I., Son, S.H.: Optimization methods for interpretable differentiable decision trees applied to reinforcement learning. In: Chiappa, S., Calandra, R. (eds.) Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research, vol. 108, pp. 1855–1865. PMLR, Palermo, Italy, 26–28 Aug 2020
Stowers, K., Kasdaglis, N., Newton, O.B., Lakhmani, S.G., Wohleber, R.W., Chen, J.Y.: Intelligent agent transparency. Proceedings of the Human Factors and Ergonomics Society Annual Meeting 60, 1706–1710 (2016)
Tadepalli, P., Givan, R., Driessens, K.: Relational reinforcement learning: an overview. In: Proceedings of the ICML’04 Workshop on Relational Reinforcement Learning (2004)
de Visser, E.J., Cohen, M., Freedy, A., Parasuraman, R.: A design methodology for trust cue calibration in cognitive agents. In: Shumaker, R., Lackey, S. (eds.) Virtual, Augmented and Mixed Reality. Designing and Developing Virtual and Augmented Environments, pp. 251–262. Springer International Publishing, Cham (2014)
Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 229–256 (1992)
Xu, J., Zhang, Z., Friedman, T., Liang, Y., Van den Broeck, G.: A semantic loss function for deep learning with symbolic knowledge. In: International Conference on Machine Learning, pp. 5502–5511. PMLR (2018)
Xu, R., Wunsch, D.: Survey of clustering algorithms. IEEE Trans. Neural Networks 16(3), 645–678 (2005). https://doi.org/10.1109/TNN.2005.845141
Yang, F., Lyu, D., Liu, B., Gustafson, S.: Peorl: integrating symbolic planning and hierarchical reinforcement learning for robust decision-making. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018, pp. 4860–4866. AAAI Press, Stockholm, Sweden (2018)
Zambaldi, V., et al.: Deep reinforcement learning with relational inductive biases. In: International Conference on Learning Representations (2019)
Acknowledgements
This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hazra, R., De Raedt, L. (2023). Deep Explainable Relational Reinforcement Learning: A Neuro-Symbolic Approach. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science(), vol 14172. Springer, Cham. https://doi.org/10.1007/978-3-031-43421-1_13
DOI: https://doi.org/10.1007/978-3-031-43421-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43420-4
Online ISBN: 978-3-031-43421-1