Abstract
Providing full autonomy to Unmanned Surface Vehicles (USVs) is a challenging goal, and autonomous docking is a particularly difficult subtask: the vessel must distinguish the dock from obstacles, which may be either static or moving. In this paper, we developed a simulator and applied Reinforcement Learning (RL) to the problem.
We studied several scenarios for the task of docking a USV in the simulated environment. The scenarios differed in their sensor inputs and start-stop procedures but shared a simple reward function. The results show that the system solved the task when IMU (Inertial Measurement Unit) and GNSS (Global Navigation Satellite System) sensors were used to estimate the state, despite the simplicity of the reward function.
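The paper does not reproduce its reward function here, but a "simple shared reward function" for docking with GNSS- and IMU-derived state can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the function names, coefficients, the success bonus, and the termination radius are all assumptions.

```python
import math

def docking_reward(pos, dock_pos, heading, dock_heading,
                   done_radius=1.0, success_bonus=10.0):
    """Dense shaping reward for a simulated docking task (illustrative only).

    pos, dock_pos: (x, y) positions, e.g. from a GNSS-based state estimate.
    heading, dock_heading: yaw angles in radians, e.g. from an IMU estimate.
    Returns (reward, done).
    """
    dx, dy = dock_pos[0] - pos[0], dock_pos[1] - pos[1]
    distance = math.hypot(dx, dy)
    # Wrap the heading error into [-pi, pi] so misalignment is penalised symmetrically.
    heading_err = (heading - dock_heading + math.pi) % (2 * math.pi) - math.pi
    # Penalise distance to the dock and misalignment at every step.
    reward = -distance - 0.5 * abs(heading_err)
    if distance < done_radius:
        # Terminal bonus when the vessel is within the docking radius.
        return reward + success_bonus, True
    return reward, False
```

A reward of this dense, distance-based form gives the agent a gradient toward the dock at every step, which is one common way to keep a reward function simple while still making the task learnable.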
Copyright information
© 2022 Springer Nature Switzerland AG
Cite this paper
Holen, M., Ruud, EL.M., Warakagoda, N.D., Goodwin, M., Engelstad, P., Knausgård, K.M. (2022). Towards Using Reinforcement Learning for Autonomous Docking of Unmanned Surface Vehicles. In: Iliadis, L., Jayne, C., Tefas, A., Pimenidis, E. (eds) Engineering Applications of Neural Networks. EANN 2022. Communications in Computer and Information Science, vol 1600. Springer, Cham. https://doi.org/10.1007/978-3-031-08223-8_38
Print ISBN: 978-3-031-08222-1
Online ISBN: 978-3-031-08223-8