Omnidirectional stereo video using a hybrid representation

  • Published in: Multimedia Tools and Applications

Abstract

Compared with traditional video, omnidirectional stereo video (ODSV) provides a larger field of view (FOV) with depth perception, but it makes capturing, processing, and displaying more complicated. Although many attempts have been made to address these challenges, each suffers from one or more of the following problems: a complicated camera rig, high latency, or visible distortions. This paper presents a practical end-to-end solution, based on a novel hybrid representation, that solves these problems simultaneously. The proposed pipeline goes directly from capturing to displaying, removing the intermediate processing step and thereby reducing both total time consumption and visible stitching distortions. Under the assumption of a static background, the hybrid representation is piecewise linear in the horizontal viewing direction, whose domain is 0° to 360°, and consists of static and moving regions: the static regions are represented by omnidirectional stereo images and the moving regions by a normal stereo pair of videos. With this representation, a single panoramic camera can capture the omnidirectional stereo images in a real environment, and a normal binocular camera can capture the stereo pair of videos. To display the ODSV, the paper presents a real-time tracking-based rendering algorithm for head-mounted displays (HMDs). Experiments show that the proposed method is effective and cost-efficient. Compared with state-of-the-art methods, it significantly reduces camera-rig complexity and data volume while preserving competitive stereo quality without visible distortions.
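To make the hybrid representation concrete, the following sketch illustrates one plausible per-eye rendering step: composite the static omnidirectional panorama with the current stereo-video frame wherever motion occurs, then crop the field of view around the tracked head yaw. This is an illustration only, not the authors' implementation; it assumes the video frame has already been warped into panorama coordinates and that a binary motion matte is available, and all function and parameter names are hypothetical.

```python
import numpy as np

def render_stereo_view(pano, video_frame, moving_mask, yaw_deg, fov_deg=90.0):
    """Render one eye's view from a hybrid panorama/video representation.

    pano:        static omnidirectional panorama for this eye, shape (H, W, 3)
    video_frame: current stereo-video frame, pre-warped into panorama coords
    moving_mask: boolean (H, W) matte marking moving pixels (background static)
    yaw_deg:     horizontal viewing direction in [0, 360)
    fov_deg:     horizontal field of view of the output crop
    """
    H, W, _ = pano.shape
    # Composite: static panorama everywhere, video pixels where motion occurs.
    frame = np.where(moving_mask[..., None], video_frame, pano)
    # Crop the FOV starting at the current yaw, wrapping around 360 degrees.
    w_view = int(round(W * fov_deg / 360.0))
    x0 = int(round(W * (yaw_deg % 360.0) / 360.0))
    cols = (x0 + np.arange(w_view)) % W
    return frame[:, cols]
```

Calling this once per eye (with the left/right panoramas and video frames) yields a stereo pair for the HMD; the modulo indexing handles the 360° wrap-around of the viewing direction.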


References

  1. Anderson R, Gallup D, Barron J T, Kontkanen J, Snavely N, Hernandez C, Agarwal S, Seitz S M (2016) Jump: virtual reality video. ACM Trans Graph 198:1–13. https://doi.org/10.1145/2980179.2980257

    Article  Google Scholar 

  2. Appia K, Batur U (2014) Fully automatic 2D to 3D conversion with aid of High-Level image features. In: Proceedings of SPIE 9011, stereoscopic displays and applications XXV, 90110W. https://doi.org/10.1117/12.2040907

  3. FFmpeg Developers (2021) ffmpeg tool (version be1d324)[software]. http://ffmpeg.org/. Accessed 18 May 2021

  4. Facebook (2021) Introducing oculus quest 2, the next generation of all-in-one VR. https://www.oculus.com/blog/introducing-oculus-quest-2-the-next-generation-of-all-in-one-vr-gaming/. Accessed 18 May 2021

  5. Fan CL, Lo WC, Pai YT, Hsu CH (2019) A survey on 360 video streaming: acquisition, transmission, and display. ACM Comput Surv 52 (4):1–36. https://doi.org/10.1145/3329119

    Article  Google Scholar 

  6. Flynn J, Neulander I, Philbin J, Snavely N (2016) Deep stereo: learning to predict new views from the world’s imagery. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), vol 595, pp 5515–5524. https://doi.org/10.1109/CVPR.2016.595

  7. Forrest B (2021) Surround360 is now open source. https://engineering.fb.com/2016/07/26/video-engineering/surround-360-is-now-open-source/. Accessed 18 May 2021

  8. Google VR (2021) Experience virtual reality in a simple, fun, and affordable way. https://arvr.google.com/cardboard/. Accessed 18 May 2021

  9. Huang SK, Lin HS, Ouhyoung M (2017) Effective omnistereo panorama video generation by deformable spheres. ACM SIGGRAPH 2017 Posters (SIGGRAPH ’17) 22:1–2. https://doi.org/10.1145/3102163.3102199

    Google Scholar 

  10. Konrad J, Wang M, Ishwar P, Wu C, Mukherjee D (2013) Learning-based, automatic 2D-to-3D image and video conversion. IEEE Trans Image Process 22 (9):3485–3496. https://doi.org/10.1109/TIP.2013.2270375

    Article  Google Scholar 

  11. Konrad R, Dansereau DG, Masood A, Wetzstein G (2017) SpinVR: towards live-streaming 3D virtual reality video. ACM Trans Graph 36 (6):1–12. https://doi.org/10.1145/3130800.3130836

    Article  Google Scholar 

  12. Koulieris GA, Akşit K, Stengel M, Mantiuk RK, Mania K, Richardt C (2019) Near-Eye display and tracking technologies for virtual and augmented reality. Comput Graph Forum 38:493–519. https://doi.org/10.1111/cgf.13654

    Article  Google Scholar 

  13. Lee J, Kim B, Kim K, Kim Y, Noh J (2016) Rich360: optimized spherical representation from structured panoramic camera arrays. Int Conf Comput Graph Interact Tech 35(4):1–11. https://doi.org/10.1145/2897824.2925983

    Google Scholar 

  14. Lessig C, Desbrun M, Fiume E (2014) A constructive theory of sampling for image synthesis using reproducing kernel bases. ACM Trans Graph 33 (7):1–14. https://doi.org/10.1145/2601097.2601149

    Article  MATH  Google Scholar 

  15. Limonov A, Yu X, Juan L, Lei C, Jian Y (2018) Stereoscopic realtime 360-degree video stitching. In: 2018 IEEE international conference on consumer electronics (ICCE), pp 1–6. https://doi.org/10.1109/ICCE.2018.8326105https://doi.org/10.1109/ICCE.2018.8326105

  16. Lu J, Yang Y, Liu RY, Kang SB, Yu JY (2019) 2D-to-stereo panorama conversion using gan and concentric mosaics. IEEE Access 7:23187–23196. https://doi.org/10.1109/ACCESS.2019.2899221

    Article  Google Scholar 

  17. Mashraki A (2021) Surround360. https://github.com/facebook/Surround360. Accessed 18 May 2021

  18. Matzen K, Cohen M F, Evans B, Kopf J, Szeliski R (2017) Low-Cost 360 Stereo photography and video capture. ACM Trans Graph 36(4):1–12. https://doi.org/10.1145/3072959.3073645

    Article  Google Scholar 

  19. Ochi D, Kunita Y, Kameda A, Kojima A, Iwaki S (2015) Live streaming system for omnidirectional video. In: 2015 IEEE virtual reality (VR), pp 349–350. https://doi.org/10.1109/VR.2015.7223439

  20. Peleg S, Ben-ezra M (1999) Stereo panorama with a single camera. Proceedings 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149) 1:395–401. https://doi.org/10.1109/CVPR.1999.786969

    Article  Google Scholar 

  21. Pharr M, Jakob W, Humphreys G (2016) Physically based rendering: from theory to implementation, 3rd edn. Morgan Kaufmann Publishers Inc., San Francisco

    Google Scholar 

  22. Pico (2021) Pico Enterprise solutions. https://www.pico-interactive.com/us/. Accessed 18 May 2021

  23. Ra KK, Clark JJ (2019) Decoupled hybrid 360 panoramic stereo video. In: 2019 International conference on 3D vision (3DV), pp 386–394. https://doi.org/10.1109/3DV.2019.00050

  24. Richardt C (2020) Omnidirectional stereo. In: Ikeuchi K (ed) Computer vision. Springer, Cham. https://doi.org/10.1007/978-3-030-03243-2_808-1

  25. Richardt C, Pritch Y, Zimmer H, Sorkine-Hornung A (2013) Megastereo: constructing high-resolution stereo panoramas. 2013 IEEE Conference on Computer Vision and Pattern Recognition 1:1256–1263. https://doi.org/10.1109/CVPR.2013.166

    Article  Google Scholar 

  26. Richardt C, Hedman P, Overbeck R S, Cabral B, Konrad B, Sullivan S (2019) Capture4VR: from VR photography to VR video. In: SIGGRAPH courses, pp 1–319. https://doi.org/10.1145/3305366.3328028

  27. Richardt C, Tompkin J, Wetzstein G (2020) Capture, reconstruction, and representation of the visual real world for virtual reality. In: Magnor M, Sorkine-Hornung A (eds) Real VR – immersive digital reality. Lecture notes in computer science, vol 11900. Springer, Cham, pp 3–32. https://doi.org/10.1007/978-3-030-41816-8_1

  28. Rousselle F, Jarosz W, Novak J (2016) Image-space control variates for rendering. ACM Trans Graph 35(6):1–12. https://doi.org/10.1145/2980179.2982443

    Article  Google Scholar 

  29. Schroers C, Bazin JC, Sorkine-Hornung A (2018) An omnistereoscopic video pipeline for capture and display of real-world VR. ACM Trans Graph 37 (3):1–13. https://doi.org/10.1145/3225150

    Article  Google Scholar 

  30. Sengupta S, Jayaram V, Curless B, Seitz S, Kemelmacher-Shlizerman I (2020) Background matting: the world is your green screen. In: 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 2291–2300

  31. Tang M, Wen J, Zhang Y, Gu J, Junker P, Guo B, Jhao G, Zhu Z, Han Y (2019) A universal optical flow based real-time low-latency omnidirectional stereo video system. IEEE Trans Multimedia 21(4):957–972. https://doi.org/10.1109/TMM.2018.2867266

    Article  Google Scholar 

  32. W3C Immersive web working and community groups (2021) WebXR. https://immersiveweb.dev/. Accessed 18 May 2021

  33. Wikipedia contributors (2021) Anaglyph 3D. https://en.wikipedia.org/wiki/Anaglyph_3D. Accessed 18 May 2021

  34. Xie J, Girshick R, Farhadi A (2016) Deep3D: fully automatic 2D-to-3D video conversion with deep convolutional neural networks. In: Leibe B, Matas J, Sebe N, Welling M (eds) Computer vision – ECCV 2016. ECCV 2016. Lecture notes in computer science, vol 9908. Springer, Cham. https://doi.org/10.1007/978-3-319-46493-0_51

  35. Xu F, Zhao T, Luo B, Dai Q (2018) Generating VR live videos with tripod panoramic rig. In: 2018 IEEE conference on virtual reality and 3D user interfaces (VR), pp 446–449. https://doi.org/10.1109/VR.2018.8448283

  36. Zhang E, Cohen MF, Curless B (2016) Emptying, refurnishing, and relighting indoor spaces. ACM Trans Graph 35(6):1–14. https://doi.org/10.1145/2980179.2982432

    Article  Google Scholar 

  37. Zhang J, Zhu T, Zhang A, Yuan X, Wang Z, Beetschen S, Xu L, Lin X, Dai Q, Fang L (2020) Multiscale-VR: multiscale gigapixel 3D panoramic videography for virtual reality. In: 2020 IEEE international conference on computational photography (ICCP), pp 1–12. https://doi.org/10.1109/ICCP48838.2020.9105244

Download references

Author information

Corresponding author

Correspondence to Xiaofei Ai.

Ethics declarations

Conflict of Interests

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Ai, X., Wang, Y., Chen, X. et al. Omnidirectional stereo video using a hybrid representation. Multimed Tools Appl 82, 3995–4010 (2023). https://doi.org/10.1007/s11042-022-13432-8

