Abstract
Progressing towards the age of autonomous driving, realistic data covering the widest possible range of driving scenarios is required for creating Advanced Driver-Assistance System (ADAS) models that can persevere through our widely unpredictable day-to-day travel. We therefore intend to generate synthetic data that ADAS models can utilize well, while keeping it close to real-world scenarios. Using deep learning concepts, image synthesis methods, and video-to-video synthesis methods together with the available real-world data, we have created such synthetic data. Generative Adversarial Network (GAN)-based models were previously used for static object manipulation on benchmark datasets. Here, the idea is to extend this approach to adding dynamic objects (such as cars and pedestrians) to real-world scenes from the Cityscapes dataset. Then, using a pre-trained Mask R-CNN, we have performed object detection on the generated synthetic data and evaluated the output.
References
Kalra, N., Paddock, S.M.: Driving to Safety: How Many Miles of Driving Would It Take to Demonstrate Autonomous Vehicle Reliability? RAND Corporation, Santa Monica (2016)
Dosovitskiy, A., Ros, G., Codevilla, F., López, A., Koltun, V.: CARLA: an open urban driving simulator. In: Proceedings of the 1st Annual Conference on Robot Learning (CoRL), pp. 1–16 (2017)
Jelic, B., Grbic, R., Vranjes, M., Mijic, D.: Can we replace real-world with synthetic data in deep learning-based ADAS algorithm development? IEEE Consum. Electron. Mag. https://doi.org/10.1109/MCE.2021.3083206
Wang, T.C., Liu, M.Y., Zhu, J.Y., Tao, A., Kautz, J., Catanzaro, B.: High-resolution image synthesis and semantic manipulation with conditional GANs. Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., 8798–8807 (2018). https://doi.org/10.1109/CVPR.2018.00917
Wang, T.C., et al.: Video-to-video synthesis. In: Advances in Neural Information Processing Systems, vol. 2018, pp. 1144–1156 (2018)
Wang, T.C., Liu, M.Y., Tao, A., Liu, G., Kautz, J., Catanzaro, B.: Few-shot video-to-video synthesis, arXiv (2019)
Mallya, A., Wang, T.-C., Sapra, K., Liu, M.-Y.: World-consistent video-to-video synthesis. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12353, pp. 359–378. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58598-3_22
Soni, R.K., Nair, B.B.: Deep learning-based approach to generate realistic data for ADAS applications. In: 2021 5th International Conference on Computer, Communication and Signal Processing (ICCCSP), pp. 1–5 (2021). https://doi.org/10.1109/ICCCSP52374.2021.9465529
Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., Brox, T.: FlowNet 2.0: evolution of optical flow estimation with deep networks. In: Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, pp. 1647–1655 (2017). https://doi.org/10.1109/CVPR.2017.179
Park, T., Liu, M.Y., Wang, T.C., Zhu, J.Y.: Semantic image synthesis with spatially-adaptive normalization. In: Proceedings IEEE Conference on Computer Vision and Pattern Recognition, pp. 2332–2341 (2019). https://doi.org/10.1109/CVPR.2019.00244
Zhu, P., Abdal, R., Qin, Y., Wonka, P.: SEAN: image synthesis with semantic region-adaptive normalization. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020, pp. 5103–5112 (2020). https://doi.org/10.1109/CVPR42600.2020.00515
Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: Proceedings 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, pp. 5967–5976 (2017). https://doi.org/10.1109/CVPR.2017.632
He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. arXiv:1703.06870v3 [cs.CV], 24 January 2018
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. arXiv:1405.0312v3 [cs.CV], 21 February 2015
Cordts, M., et al.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 3213–3223 (2016). https://doi.org/10.1109/CVPR.2016.350
Fadaie, J.G.: The state of modeling, simulation, and data utilization within industry- an autonomous vehicles perspective. arXiv:1910.06075 [cs.CV], 7 October 2019
MathWorks: MATLAB Overview. MATLAB Product Documentation (2020)
Lee, D., Liu, S., Gu, J., Liu, M.-Y., Yang, M.-H., Kautz, J.: Context-aware synthesis and placement of object instances. In: 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada (2018)
Teed, Z., Deng, J.: RAFT: recurrent all-pairs field transforms for optical flow. arXiv:2003.12039v3 [cs.CV], 25 August 2020
Ranjan, A., Black, M.J.: Optical flow estimation using a spatial pyramid network. arXiv:1611.00850v2 [cs.CV], 21 November 2016
Sun, D., Yang, X., Liu, M.-Y., Kautz, J.: PWC-Net: CNNs for optical flow using pyramid, warping, and cost volume. arXiv:1709.02371v3 [cs.CV], 25 June 2018
Emani, S., Soman, K.P., Sajith Variyar, V.V., Adarsh, S.: Obstacle detection and distance estimation for autonomous electric vehicle using stereo vision and DNN. In: Wang, J., Reddy, G., Prasad, V., Reddy, V. (eds.) Soft Computing and Signal Processing. Advances in Intelligent Systems and Computing, vol. 898, Springer, Singapore (2019)
Mukherjee, A., Adarsh, S., Ramachandran, K.I.: ROS-based pedestrian detection and distance estimation algorithm using stereo vision, leddar and CNN. In: Satapathy, S., Bhateja, V., Janakiramaiah, B., Chen, Y.W. (eds.) Intelligent System Design. Advances in Intelligent Systems and Computing, vol. 1171. Springer, Singapore (2021)
Dunna, S., Nair, B.B., Panda, M.K.: A deep learning based system for fast detection of obstacles using rear-view camera under parking scenarios. In: IEEE International Power and Renewable Energy Conference (IPRECON) 2021, pp. 1–7 (2021). https://doi.org/10.1109/IPRECON52453.2021.9640804
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chinthada, Y.V., Adarsh, S. (2023). Deep Learning Based Dynamic Object Addition to Video Instances for Creating Synthetic Data. In: Shaw, R.N., Paprzycki, M., Ghosh, A. (eds) Advanced Communication and Intelligent Systems. ICACIS 2022. Communications in Computer and Information Science, vol 1749. Springer, Cham. https://doi.org/10.1007/978-3-031-25088-0_67
Print ISBN: 978-3-031-25087-3
Online ISBN: 978-3-031-25088-0
eBook Packages: Computer Science (R0)