Abstract
End-to-end learning for planning is a promising approach for finding good robot strategies when the state transition, observation, and reward functions are initially unknown. Many neural network architectures following this approach have shown positive results. Across these networks, a few seemingly small components recur in different architectures, so improving the efficiency of these components has great potential to improve the overall performance of the network. This paper aims to improve one such component: the forward propagation module. In particular, we propose the Locally-Connected Interrelated Network (LCI-Net), a novel type of locally connected layer with unshared but interrelated weights, to improve the efficiency of information propagation and of learning stochastic transition models for planning. LCI-Net is a small differentiable neural network module that can be plugged into various existing architectures. For evaluation, we apply LCI-Net to QMDP-Net, a neural network for solving POMDP problems whose transition, observation, and reward functions are learned. Simulation tests on benchmark problems involving 2D and 3D navigation and grasping indicate promising results: replacing the forward propagation module alone with LCI-Net improves QMDP-Net's generalization capability by a factor of up to 10.
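The central idea, a locally connected layer with unshared weights, can be made concrete with a short sketch. The PyTorch snippet below is our own illustration, not code from the paper: the class `LocallyConnected2d` and all of its parameters are hypothetical, and the interrelation between per-location weights that distinguishes LCI-Net from a plain locally connected layer is deliberately omitted, since the abstract does not specify its form.

```python
# Minimal sketch (our own illustration): a 2D locally connected layer,
# i.e. a convolution-like operation whose kernel weights are UNSHARED
# across spatial locations. LCI-Net additionally interrelates these
# per-location weights; that tying mechanism is the paper's contribution
# and is not reproduced here.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocallyConnected2d(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, h: int, w: int, k: int = 3):
        super().__init__()
        self.k, self.h, self.w, self.out_ch = k, h, w, out_ch
        # One independent (out_ch x in_ch*k*k) weight block per location,
        # unlike a convolution, which shares a single kernel everywhere.
        self.weight = nn.Parameter(
            0.01 * torch.randn(h * w, out_ch, in_ch * k * k))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_ch, h, w). Extract the k x k patch around every
        # location: patches has shape (batch, in_ch*k*k, h*w).
        patches = F.unfold(x, kernel_size=self.k, padding=self.k // 2)
        # Contract each location's patch with that location's own weights.
        out = torch.einsum('bcl,loc->bol', patches, self.weight)
        return out.view(x.shape[0], self.out_ch, self.h, self.w)

# Usage: maps a (2, 1, 8, 8) input to a (2, 4, 8, 8) output.
layer = LocallyConnected2d(in_ch=1, out_ch=4, h=8, w=8)
y = layer(torch.randn(2, 1, 8, 8))
```

The sketch also makes the underlying trade-off visible: unshared weights let each location model its own local transition behaviour, but at the cost of far more parameters than a shared convolution kernel, which is presumably why interrelating the per-location weights, as LCI-Net does, matters for learning efficiency.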
Acknowledgements
Nicholas Collins is supported by an Australian Government Research Training Program (RTP) scholarship provided by the University of Queensland.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Collins, N., Kurniawati, H. (2021). Locally-Connected Interrelated Network: A Forward Propagation Primitive. In: LaValle, S.M., Lin, M., Ojala, T., Shell, D., Yu, J. (eds) Algorithmic Foundations of Robotics XIV. WAFR 2020. Springer Proceedings in Advanced Robotics, vol 17. Springer, Cham. https://doi.org/10.1007/978-3-030-66723-8_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66722-1
Online ISBN: 978-3-030-66723-8
eBook Packages: Intelligent Technologies and Robotics (R0)