Learning 360° Optical Flow Using Tangent Images and Transformer

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15033)

Abstract

Optical flow estimation is a fundamental task in computer vision. Because panoramic cameras have an ultra-wide field of view (FoV), traditional perspective-based optical flow methods fail to adapt to the omnidirectional nature of 360° panoramic images, which makes optical flow estimation for panoramic images challenging. In this paper, we first transform each panoramic image into a set of distortion-free tangent images that together cover the entire FoV, and extract tangent-image features with a CNN, which avoids the severe distortion of the equirectangular projection. Then, we introduce a stereo embedding module that adds stereoscopic features to the tangent images to make them globally consistent. Finally, we globally aggregate the distortion-free encoder features with a transformer, which enhances the image features so that large pixel displacements can be handled. Extensive experimental results demonstrate that our method achieves state-of-the-art performance on the public dataset FlowScape and exhibits strong generalization capability.
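The tangent-image step can be illustrated with a minimal sketch. It assumes the tangent images are produced by the standard gnomonic projection onto a plane touching the unit sphere; the abstract does not state the sampling scheme, the number of tangent planes, or the patch resolution, so the function names and the pixel convention below are hypothetical and are not the authors' implementation.

```python
import math


def sphere_to_tangent(lon, lat, lon0, lat0):
    """Gnomonic projection of a sphere point (radians) onto the plane
    tangent at (lon0, lat0). Returns None for points on the far hemisphere,
    which the tangent plane cannot see."""
    cos_c = (math.sin(lat0) * math.sin(lat)
             + math.cos(lat0) * math.cos(lat) * math.cos(lon - lon0))
    if cos_c <= 0.0:
        return None
    x = math.cos(lat) * math.sin(lon - lon0) / cos_c
    y = (math.cos(lat0) * math.sin(lat)
         - math.sin(lat0) * math.cos(lat) * math.cos(lon - lon0)) / cos_c
    return x, y


def tangent_to_sphere(x, y, lon0, lat0):
    """Inverse gnomonic projection: tangent-plane coordinates back to
    longitude and latitude in radians."""
    rho = math.hypot(x, y)
    if rho == 0.0:
        return lon0, lat0  # the tangent point itself
    c = math.atan(rho)
    lat = math.asin(math.cos(c) * math.sin(lat0)
                    + y * math.sin(c) * math.cos(lat0) / rho)
    lon = lon0 + math.atan2(
        x * math.sin(c),
        rho * math.cos(lat0) * math.cos(c) - y * math.sin(lat0) * math.sin(c))
    return lon, lat


def sphere_to_equirect(lon, lat, width, height):
    """Map longitude in [-pi, pi] and latitude in [-pi/2, pi/2] to
    equirectangular pixel coordinates (assumed convention: lon = 0 at the
    image centre, latitude increasing upwards)."""
    u = (lon / (2.0 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v
```

To build one tangent image under these assumptions, each pixel of the tangent patch would be passed through `tangent_to_sphere` and then `sphere_to_equirect` to find where to sample the panorama; near the tangent point the patch behaves like an ordinary perspective image, which is why perspective-trained CNN features remain usable there.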

Acknowledgement

This work is supported by the Natural Science Foundation of Jilin (No. 20220101134JC).

Author information

Corresponding author

Correspondence to Cheng Han.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ma, Y., Han, C., Xv, C., Chen, W., Jin, B. (2025). Learning 360° Optical Flow Using Tangent Images and Transformer. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15033. Springer, Singapore. https://doi.org/10.1007/978-981-97-8502-5_11

  • DOI: https://doi.org/10.1007/978-981-97-8502-5_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-8501-8

  • Online ISBN: 978-981-97-8502-5

  • eBook Packages: Computer Science, Computer Science (R0)
