Learning 360° Optical Flow Using Tangent Images and Transformer

Ma, Yanjie; Han, Cheng; Xv, Chao; Chen, Wudi; Jin, Baohua

doi:10.1007/978-981-97-8502-5_11

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15033)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Optical flow estimation is a fundamental task in computer vision. Because panoramic cameras have an ultra-wide field of view (FoV), traditional perspective-based optical flow methods do not adapt to the omnidirectional nature of 360° panoramic images, which makes optical flow estimation for panoramic images challenging. In this paper, we first transform each panoramic image into a set of distortion-free tangent images that cover the entire FoV and extract tangent-image features with a CNN, which addresses the significant distortion of equirectangular projection. Then, we introduce a stereo embedding module that adds stereoscopic features to the tangent images to make them globally consistent. Finally, we globally aggregate the distortion-free encoder features with a transformer, which enhances the image features and helps handle large pixel displacements. Extensive experimental results demonstrate that our method achieves state-of-the-art performance on the public dataset FlowScape and exhibits strong generalization capability.
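The first step above, resampling a panorama into tangent images, is commonly done with the gnomonic projection. The following is a minimal sketch of that projection only, assuming an equirectangular input with longitude in [-π, π) and latitude in [-π/2, π/2]. It is not the authors' implementation: the function names, the field of view, and the patch size are hypothetical choices made for illustration.

```python
import math

def tangent_to_sphere(x, y, lon0, lat0):
    """Inverse gnomonic projection (illustrative helper).

    Maps a point (x, y) on the plane tangent to the unit sphere at
    (lon0, lat0) back to spherical coordinates (lon, lat), in radians.
    """
    rho = math.hypot(x, y)
    if rho == 0.0:
        return lon0, lat0  # the tangent point maps to itself
    c = math.atan(rho)  # angular distance from the tangent point
    sin_c, cos_c = math.sin(c), math.cos(c)
    lat = math.asin(cos_c * math.sin(lat0) + y * sin_c * math.cos(lat0) / rho)
    lon = lon0 + math.atan2(
        x * sin_c,
        rho * math.cos(lat0) * cos_c - y * math.sin(lat0) * sin_c,
    )
    # Wrap longitude back into [-pi, pi).
    lon = (lon + math.pi) % (2 * math.pi) - math.pi
    return lon, lat

def sphere_to_equirect(lon, lat, width, height):
    """Map (lon, lat) to continuous equirectangular pixel coordinates (u, v)."""
    u = (lon / (2 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v

def tangent_image_grid(lon0, lat0, fov_deg, size, width, height):
    """Sampling grid for one square tangent image (hypothetical parameters).

    Returns a size x size nested list of (u, v) positions in the panorama;
    an image sampler would interpolate panorama pixels at these positions.
    """
    half = math.tan(math.radians(fov_deg) / 2)  # half-extent on the tangent plane
    grid = []
    for i in range(size):
        y = half * (1 - 2 * (i + 0.5) / size)  # top row has the largest y
        row = []
        for j in range(size):
            x = half * (2 * (j + 0.5) / size - 1)
            lon, lat = tangent_to_sphere(x, y, lon0, lat0)
            row.append(sphere_to_equirect(lon, lat, width, height))
        grid.append(row)
    return grid
```

Calling `tangent_image_grid` once per tangent point, for example at the face centers of a polyhedron, would give a set of patches that together cover the sphere. How many patches the paper uses and where it places them is not stated in the abstract.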

Acknowledgement

This work is supported by the Natural Science Foundation of Jilin No. 20220101134JC.

Author information

Authors and Affiliations

Department of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China
Yanjie Ma, Cheng Han, Chao Xv, Wudi Chen & Baohua Jin

Corresponding author

Correspondence to Cheng Han.

Editor information

Editors and Affiliations

Peking University, Beijing, China
Zhouchen Lin
Nankai University, Tianjin, China
Ming-Ming Cheng
Chinese Academy of Sciences, Beijing, China
Ran He
Xinjiang University, Ürümqi, Xinjiang, China
Kurban Ubul
Xinjiang University, Ürümqi, China
Wushouer Silamu
Peking University, Beijing, China
Hongbin Zha
Tsinghua University, Beijing, China
Jie Zhou
Chinese Academy of Sciences, Beijing, China
Cheng-Lin Liu

Cite this paper

Ma, Y., Han, C., Xv, C., Chen, W., Jin, B. (2025). Learning 360° Optical Flow Using Tangent Images and Transformer. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15033. Springer, Singapore. https://doi.org/10.1007/978-981-97-8502-5_11

DOI: https://doi.org/10.1007/978-981-97-8502-5_11
Published: 01 November 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-8501-8
Online ISBN: 978-981-97-8502-5
eBook Packages: Computer Science, Computer Science (R0)
