Abstract
The classical problem of balancing an inverted pendulum is commonly used to evaluate control learning techniques. Traditional learning methods aim to improve the performance of the learned controller, often disregarding the comprehensibility of the learned control policies. Recently, Explainable AI (XAI) has attracted great interest in areas where humans can benefit from insights discovered by AI or need to check whether an AI’s decisions make sense. Learning qualitative models allows learned hypotheses to be formulated in a comprehensible way, closer to human intuition than traditional numerical learning. In this paper, we use a qualitative approach to learning control strategies, which we demonstrate on the problem of balancing an inverted pendulum. We use qualitative induction to learn a qualitative model from experimentally collected numerical traces, and qualitative simulation to search for possible qualitative control strategies, which are then tested through reactive execution. Successful behaviors provide a clear explanation of the learned control strategy.
Notes
1. Values \(F \in [-10, 10]\) gave the same results.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Šoberl, D., Bratko, I. (2019). Learning Explainable Control Strategies Demonstrated on the Pole-and-Cart System. In: Wotawa, F., Friedrich, G., Pill, I., Koitz-Hristov, R., Ali, M. (eds) Advances and Trends in Artificial Intelligence. From Theory to Practice. IEA/AIE 2019. Lecture Notes in Computer Science, vol 11606. Springer, Cham. https://doi.org/10.1007/978-3-030-22999-3_42
DOI: https://doi.org/10.1007/978-3-030-22999-3_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22998-6
Online ISBN: 978-3-030-22999-3
eBook Packages: Computer Science, Computer Science (R0)