Promises and Challenges of Reinforcement Learning Applications in Motion Planning of Automated Vehicles

Pankiewicz, Nikodem; Wrona, Tomasz; Turlej, Wojciech; Orłowski, Mateusz

doi:10.1007/978-3-030-87897-9_29

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12855)

Included in the following conference series:

International Conference on Artificial Intelligence and Soft Computing

Abstract

As automated driving development progresses, novel methods are required to handle the vast range of possible road situations and to meet end users' high demands. For the problem of motion control, which involves decision making and trajectory planning, reinforcement learning is a reasonable approach to consider. In this paper, we present the promises reinforcement learning can bring to the automated driving domain and list the challenges we encountered during our work. We address issues related to environment definition, sample efficiency, safety, and explainability.

Author information

Authors and Affiliations

AGH University of Science and Technology, Krakow, Poland
Nikodem Pankiewicz, Tomasz Wrona, Wojciech Turlej & Mateusz Orłowski
Aptiv, Krakow, Poland
Nikodem Pankiewicz, Tomasz Wrona, Wojciech Turlej & Mateusz Orłowski

Corresponding author

Correspondence to Nikodem Pankiewicz.

Editor information

Editors and Affiliations

Częstochowa University of Technology, Częstochowa, Poland
Leszek Rutkowski
Częstochowa University of Technology, Częstochowa, Poland
Rafał Scherer
Częstochowa University of Technology, Częstochowa, Poland
Marcin Korytkowski
Edmonton, AB, Canada
Witold Pedrycz
AGH University of Science and Technology, Krakow, Poland
Ryszard Tadeusiewicz
Electrical and Computer Engineering, University of Louisville, Louisville, KY, USA
Jacek M. Zurada

About this paper

Cite this paper

Pankiewicz, N., Wrona, T., Turlej, W., Orłowski, M. (2021). Promises and Challenges of Reinforcement Learning Applications in Motion Planning of Automated Vehicles. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2021. Lecture Notes in Computer Science, vol 12855. Springer, Cham. https://doi.org/10.1007/978-3-030-87897-9_29

DOI: https://doi.org/10.1007/978-3-030-87897-9_29
Published: 06 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87896-2
Online ISBN: 978-3-030-87897-9
eBook Packages: Computer Science, Computer Science (R0)
