Learning Explainable Control Strategies Demonstrated on the Pole-and-Cart System

Šoberl, Domen; Bratko, Ivan

doi:10.1007/978-3-030-22999-3_42


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11606)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems


Abstract

The classical problem of balancing an inverted pendulum is commonly used to evaluate control learning techniques. Traditional learning methods aim to improve the performance of the learned controller, often disregarding the comprehensibility of the learned control policies. Recently, Explainable AI (XAI) has attracted great interest in areas where humans can benefit from insights discovered by AI, or need to check whether AI’s decisions make sense. Learning qualitative models allows learned hypotheses to be formulated in a comprehensible way, closer to human intuition than traditional numerical learning. In this paper, we use a qualitative approach to learning control strategies, which we demonstrate on the problem of balancing an inverted pendulum. We use qualitative induction to learn a qualitative model from experimentally collected numerical traces, and qualitative simulation to search for possible qualitative control strategies, which are then tested through reactive execution. Successful behaviors provide a clear explanation of the learned control strategy.
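As a rough illustration of what a qualitative model states, the sketch below checks whether one recorded variable is monotonically related to another, in the spirit of the M+ and M- constraints used in qualitative simulation. It is a toy check under stated assumptions and not the induction method of the paper: the function name `monotonic_relation`, the tolerance `tol`, and the sample values are all invented for illustration.

```python
from typing import Sequence


def monotonic_relation(xs: Sequence[float], ys: Sequence[float],
                       tol: float = 1e-9) -> str:
    """Classify how y changes as x grows in a numerical trace.

    Returns "M+" if y rises at least once and never falls,
    "M-" if y falls at least once and never rises,
    and "?" otherwise (mixed or constant behaviour).
    Hypothetical helper, not taken from the paper.
    """
    # Order the samples by x so consecutive differences follow growing x.
    pairs = sorted(zip(xs, ys))
    diffs = [y2 - y1 for (_, y1), (_, y2) in zip(pairs, pairs[1:])]

    rising = any(d > tol for d in diffs)
    falling = any(d < -tol for d in diffs)

    if rising and not falling:
        return "M+"
    if falling and not rising:
        return "M-"
    return "?"


# Made-up trace: applied force and a measured acceleration (illustrative only).
forces = [-10.0, -5.0, 0.0, 5.0, 10.0]
accelerations = [-9.1, -4.4, 0.0, 4.6, 9.0]
print(monotonic_relation(forces, accelerations))

# A non-monotonic trace (y = x^2) cannot be summarised by a single sign.
print(monotonic_relation([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))
```

A relation such as "acceleration increases with force" is readable without any numerical parameters, which is the kind of comprehensibility the abstract refers to.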

Notes

1. Values \(F \in [-10, 10]\) gave the same results.


Author information

Authors and Affiliations

Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, 1000, Ljubljana, Slovenia
Domen Šoberl & Ivan Bratko

Corresponding author

Correspondence to Domen Šoberl.

Editor information

Editors and Affiliations

Institute for Software Technology, Graz University of Technology, Graz, Austria
Franz Wotawa
Department of Applied Informatics, University of Klagenfurt, Klagenfurt, Austria
Gerhard Friedrich
Institute for Software Technology, Graz University of Technology, Graz, Austria
Ingo Pill
Institute for Software Technology, Graz University of Technology, Graz, Austria
Roxane Koitz-Hristov
Department of Computer Science, Texas State University, San Marcos, TX, USA
Moonis Ali

About this paper

Cite this paper

Šoberl, D., Bratko, I. (2019). Learning Explainable Control Strategies Demonstrated on the Pole-and-Cart System. In: Wotawa, F., Friedrich, G., Pill, I., Koitz-Hristov, R., Ali, M. (eds) Advances and Trends in Artificial Intelligence. From Theory to Practice. IEA/AIE 2019. Lecture Notes in Computer Science, vol 11606. Springer, Cham. https://doi.org/10.1007/978-3-030-22999-3_42

DOI: https://doi.org/10.1007/978-3-030-22999-3_42
Published: 15 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22998-6
Online ISBN: 978-3-030-22999-3
eBook Packages: Computer Science, Computer Science (R0)
