
Deep Learning Based Dynamic Object Addition to Video Instances for Creating Synthetic Data

  • Conference paper
  • First Online:
Advanced Communication and Intelligent Systems (ICACIS 2022)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1749))

Abstract

As we progress towards the age of autonomous driving, realistic data covering the widest possible range of driving scenarios is required to create Advanced Driver-Assistance System (ADAS) models that can persevere through our largely unpredictable day-to-day travel. We therefore aim to generate synthetic data that ADAS models can use effectively while keeping it close to real-world scenarios. Using deep learning concepts, image-synthesis methods, and video-to-video synthesis methods together with the available real-world data, we have created such synthetic data. Generative Adversarial Network (GAN)-based models were previously used for static-level object manipulation on benchmark datasets. Here, the idea is to extend this to adding dynamic objects (such as cars and pedestrians) to real-world scenes from the Cityscapes dataset. Then, using a pre-trained Mask R-CNN, we have performed object detection on the generated synthetic data and evaluated the output.




Corresponding author

Correspondence to Yadhu Vamsi Chinthada.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Chinthada, Y.V., Adarsh, S. (2023). Deep Learning Based Dynamic Object Addition to Video Instances for Creating Synthetic Data. In: Shaw, R.N., Paprzycki, M., Ghosh, A. (eds) Advanced Communication and Intelligent Systems. ICACIS 2022. Communications in Computer and Information Science, vol 1749. Springer, Cham. https://doi.org/10.1007/978-3-031-25088-0_67

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-25088-0_67

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25087-3

  • Online ISBN: 978-3-031-25088-0

  • eBook Packages: Computer Science, Computer Science (R0)
