Abstract
As automated driving development progresses, novel methods are required to handle the vast range of possible road situations and to meet end users' high demands. For the problem of motion control, which involves decision making and trajectory planning, reinforcement learning is a viable approach to consider. In this paper, we present the promises that reinforcement learning holds for the automated driving domain and list the challenges we encountered during our work. We address issues related to environment definition, sample efficiency, safety, and explainability.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Pankiewicz, N., Wrona, T., Turlej, W., Orłowski, M. (2021). Promises and Challenges of Reinforcement Learning Applications in Motion Planning of Automated Vehicles. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2021. Lecture Notes in Computer Science, vol 12855. Springer, Cham. https://doi.org/10.1007/978-3-030-87897-9_29
DOI: https://doi.org/10.1007/978-3-030-87897-9_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87896-2
Online ISBN: 978-3-030-87897-9
eBook Packages: Computer Science (R0)