Abstract
Progressing towards the age of autonomous driving, realistic data covering the widest possible range of driving scenarios is required for creating Advanced Driver-Assistance System (ADAS) models that can persevere through our widely unpredictable day-to-day travel. We therefore intend to generate synthetic data that ADAS models can utilize well, while keeping it close to real-world scenarios. Using deep learning concepts, image synthesis methods, and video-to-video synthesis methods together with the available real-world data, we have created such synthetic data. Generative Adversarial Network (GAN)-based models were previously used for static object manipulation on benchmark datasets. Here, the idea is to extend this approach to adding dynamic objects (such as cars and pedestrians) to real-world scenes from the Cityscapes dataset. Then, using a pre-trained Mask R-CNN, we have performed object detection on the generated synthetic data and evaluated the output.
References
Kalra, N., Paddock, S.M.: Driving to Safety: How Many Miles of Driving Would It Take to Demonstrate Autonomous Vehicle Reliability? RAND Corporation, Santa Monica (2016)
Dosovitskiy, A., Ros, G., Codevilla, F., López, A., Koltun, V.: CARLA: an open urban driving simulator. In: Proceedings of the 1st Annual Conference on Robot Learning (CoRL), pp. 1–16 (2017)
Jelic, B., Grbic, R., Vranjes, M., Mijic, D.: Can we replace real-world with synthetic data in deep learning-based ADAS algorithm development? IEEE Consum. Electron. Mag. https://doi.org/10.1109/MCE.2021.3083206
Wang, T.C., Liu, M.Y., Zhu, J.Y., Tao, A., Kautz, J., Catanzaro, B.: High-resolution image synthesis and semantic manipulation with conditional GANs. Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., 8798–8807 (2018). https://doi.org/10.1109/CVPR.2018.00917
Wang, T.C., et al.: Video-to-video synthesis. In: Advances in Neural Information Processing Systems, vol. 2018, pp. 1144–1156 (2018)
Wang, T.C., Liu, M.Y., Tao, A., Liu, G., Kautz, J., Catanzaro, B.: Few-shot video-to-video synthesis, arXiv (2019)
Mallya, A., Wang, T.-C., Sapra, K., Liu, M.-Y.: World-consistent video-to-video synthesis. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12353, pp. 359–378. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58598-3_22
Soni, R.K., Nair, B.B.: Deep learning-based approach to generate realistic data for ADAS applications. In: 2021 5th International Conference on Computer, Communication and Signal Processing (ICCCSP), pp. 1–5 (2021). https://doi.org/10.1109/ICCCSP52374.2021.9465529
Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., Brox, T.: FlowNet 2.0: evolution of optical flow estimation with deep networks. In: Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, pp. 1647–1655 (2017). https://doi.org/10.1109/CVPR.2017.179
Park, T., Liu, M.Y., Wang, T.C., Zhu, J.Y.: Semantic image synthesis with spatially-adaptive normalization. In: Proceedings IEEE Conference on Computer Vision and Pattern Recognition, pp. 2332–2341 (2019). https://doi.org/10.1109/CVPR.2019.00244
Zhu, P., Abdal, R., Qin, Y., Wonka, P.: SEAN: image synthesis with semantic region-adaptive normalization. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020, pp. 5103–5112 (2020). https://doi.org/10.1109/CVPR42600.2020.00515
Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: Proceedings 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, pp. 5967–5976 (2017). https://doi.org/10.1109/CVPR.2017.632
He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. arXiv:1703.06870v3 [cs.CV], 24 January 2018
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. arXiv:1405.0312v3 [cs.CV], 21 February 2015
Cordts, M., et al.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 3213–3223 (2016). https://doi.org/10.1109/CVPR.2016.350
Fadaie, J.G.: The state of modeling, simulation, and data utilization within industry- an autonomous vehicles perspective. arXiv:1910.06075 [cs.CV], 7 October 2019
MathWorks: MATLAB Overview. MATLAB Product Documentation (2020)
Lee, D., Liu, S., Gu, J., Liu, M.-Y., Yang, M.-H., Kautz, J.: Context-aware synthesis and placement of object instances. In: 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada (2018)
Teed, Z., Deng, J.: RAFT: recurrent all-pairs field transforms for optical flow. arXiv:2003.12039v3 [cs.CV], 25 August 2020
Ranjan, A., Black, M.J.: Optical flow estimation using a spatial pyramid network. arXiv:1611.00850v2 [cs.CV], 21 November 2016
Sun, D., Yang, X., Liu, M.-Y., Kautz, J.: PWC-Net: CNNs for optical flow using pyramid, warping, and cost volume. arXiv:1709.02371v3 [cs.CV], 25 June 2018
Emani, S., Soman, K.P., Sajith Variyar, V.V., Adarsh, S.: Obstacle detection and distance estimation for autonomous electric vehicle using stereo vision and DNN. In: Wang, J., Reddy, G., Prasad, V., Reddy, V. (eds.) Soft Computing and Signal Processing. Advances in Intelligent Systems and Computing, vol. 898, Springer, Singapore (2019)
Mukherjee, A., Adarsh, S., Ramachandran, K.I.: ROS-based pedestrian detection and distance estimation algorithm using stereo vision, leddar and CNN. In: Satapathy, S., Bhateja, V., Janakiramaiah, B., Chen, Y.W. (eds.) Intelligent System Design. Advances in Intelligent Systems and Computing, vol. 1171. Springer, Singapore (2021)
Dunna, S., Nair, B.B., Panda, M.K.: A deep learning based system for fast detection of obstacles using rear-view camera under parking scenarios. In: IEEE International Power and Renewable Energy Conference (IPRECON) 2021, pp. 1–7 (2021). https://doi.org/10.1109/IPRECON52453.2021.9640804
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chinthada, Y.V., Adarsh, S. (2023). Deep Learning Based Dynamic Object Addition to Video Instances for Creating Synthetic Data. In: Shaw, R.N., Paprzycki, M., Ghosh, A. (eds) Advanced Communication and Intelligent Systems. ICACIS 2022. Communications in Computer and Information Science, vol 1749. Springer, Cham. https://doi.org/10.1007/978-3-031-25088-0_67
Print ISBN: 978-3-031-25087-3
Online ISBN: 978-3-031-25088-0
eBook Packages: Computer Science (R0)