Abstract
Optical flow estimation is a fundamental task in computer vision. Because panoramic cameras have an ultra-wide field of view (FoV), traditional perspective-based optical flow methods fail to adapt to the omnidirectional nature of 360° panoramic images, which makes optical flow estimation for panoramic images challenging. In this paper, we first transform panoramic images into a set of distortion-free tangent images that cover the entire FoV and extract features from the tangent images with a CNN, which addresses the significant distortion of the equirectangular projection. Then, we introduce a stereo embedding module that adds stereoscopic features to the tangent images to make them globally consistent. Finally, we globally aggregate the distortion-free encoder features with a transformer, which enhances the image features to handle large pixel displacements. Extensive experimental results demonstrate that our method achieves state-of-the-art performance on the public dataset FlowScape and exhibits strong generalization capability.
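The first step of the abstract, resampling an equirectangular panorama into tangent images, can be illustrated with a minimal sketch. It assumes the standard gnomonic projection and nearest-neighbour sampling. The function name `tangent_image`, its parameters, and the patch size and field of view are hypothetical choices for illustration; the abstract does not specify how the authors lay out or sample their tangent images.

```python
import numpy as np

def tangent_image(erp, lon0, lat0, fov_deg=60.0, size=65):
    """Sample one tangent (gnomonic) patch from an equirectangular image.

    erp        : array of shape (H, W) or (H, W, C), equirectangular panorama
    lon0, lat0 : tangent point on the sphere, in radians
    fov_deg    : field of view of the patch (assumed value, not from the paper)
    size       : output patch is size x size pixels (assumed value)
    """
    h, w = erp.shape[:2]
    half = np.tan(np.radians(fov_deg) / 2.0)

    # Tangent-plane coordinates; row 0 of the patch points towards the north pole.
    xs = np.linspace(-half, half, size)
    ys = np.linspace(half, -half, size)
    x, y = np.meshgrid(xs, ys)

    # Inverse gnomonic projection: plane (x, y) -> sphere (lon, lat).
    cos_c = 1.0 / np.sqrt(1.0 + x**2 + y**2)
    sin_lat = cos_c * (np.sin(lat0) + y * np.cos(lat0))
    lat = np.arcsin(np.clip(sin_lat, -1.0, 1.0))
    lon = lon0 + np.arctan2(x, np.cos(lat0) - y * np.sin(lat0))
    lon = (lon + np.pi) % (2.0 * np.pi) - np.pi  # wrap to [-pi, pi)

    # Sphere -> equirectangular pixel indices (nearest neighbour).
    u = ((lon / (2.0 * np.pi) + 0.5) * w).astype(int) % w
    v = np.clip(((0.5 - lat / np.pi) * h).astype(int), 0, h - 1)
    return erp[v, u]

# Usage: extract a patch centred slightly off the panorama centre.
erp = np.arange(64 * 128).reshape(64, 128)
patch = tangent_image(erp, lon0=np.pi / 128, lat0=-np.pi / 128)
```

In a full pipeline, several such patches with different tangent points would be needed to cover the whole sphere, and bilinear interpolation would usually replace the nearest-neighbour lookup used here for brevity.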
Acknowledgement
This work was supported by the Natural Science Foundation of Jilin (No. 20220101134JC).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ma, Y., Han, C., Xv, C., Chen, W., Jin, B. (2025). Learning 360° Optical Flow Using Tangent Images and Transformer. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15033. Springer, Singapore. https://doi.org/10.1007/978-981-97-8502-5_11
DOI: https://doi.org/10.1007/978-981-97-8502-5_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-8501-8
Online ISBN: 978-981-97-8502-5
eBook Packages: Computer Science, Computer Science (R0)