Abstract
This work proposes UA-HRL, an uncertainty-aware hierarchical reinforcement learning framework for mitigating the problems caused by noisy sensor data. The system comprises two components: an ensemble of predictive models that learns the environment's underlying dynamics and estimates uncertainty from the variance of the members' predictions, and a two-level hierarchical reinforcement learning agent that integrates these uncertainty estimates into its decision-making. It is further shown how frame-stacking can be combined with the uncertainty estimation so that the agent makes better decisions despite the aleatoric noise in the observations. Finally, results obtained in a simulation environment are presented and discussed.
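To make the ensemble-variance idea concrete, the sketch below shows one common way such an estimate can be computed, together with a simple frame-stacking helper. This is an illustrative approximation of the approach described in the abstract, not the paper's implementation; all names, network shapes, and the PyTorch/NumPy choices are assumptions.

```python
# Minimal sketch (not the paper's code): uncertainty from the variance of
# an ensemble of learned dynamics models, plus frame-stacking of noisy
# observations. All identifiers and hyperparameters are illustrative.
import numpy as np
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    """One ensemble member: predicts the next observation from (obs, action)."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, obs_dim),
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, act], dim=-1))

def ensemble_uncertainty(models, obs, act):
    """Scalar uncertainty per input: variance of the members' predictions,
    averaged over observation dimensions."""
    with torch.no_grad():
        preds = torch.stack([m(obs, act) for m in models])  # (M, B, obs_dim)
    return preds.var(dim=0).mean(dim=-1)                    # (B,)

def stack_frames(frames, k: int = 4):
    """Frame-stacking: concatenate the last k observations so repeated,
    independently noisy readings partially average out."""
    frames = frames[-k:]
    frames = [frames[0]] * (k - len(frames)) + frames  # pad at episode start
    return np.concatenate(frames, axis=-1)
```

In a setup like UA-HRL's, such a scalar estimate could, for example, be appended to the frame-stacked observation that the high-level policy receives, letting it pick options more conservatively when the dynamics ensemble disagrees; how the estimates are actually injected into the two-level agent is specified in the paper itself.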
Acknowledgments
This work was funded by the Bavarian Ministry for Economic Affairs, Regional Development and Energy as part of a project to support the thematic development of the Institute for Cognitive Systems.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Roza, F.S. (2023). Uncertainty-Aware Hierarchical Reinforcement Learning Robust to Noisy Observations. In: Arai, K. (ed.) Proceedings of the Future Technologies Conference (FTC) 2022, Volume 1. Lecture Notes in Networks and Systems, vol. 559. Springer, Cham. https://doi.org/10.1007/978-3-031-18461-1_35
DOI: https://doi.org/10.1007/978-3-031-18461-1_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18460-4
Online ISBN: 978-3-031-18461-1
eBook Packages: Intelligent Technologies and Robotics (R0)