
Real-Time Light Field Video Focusing and GPU Accelerated Streaming

  • Published in: Journal of Signal Processing Systems

Abstract

This paper proposes a novel method for real-time estimation of the depth range and the correct focusing in light field videos represented as arrays of video sequences. Compared to previous approaches, it offers a new way to render high-quality synthetic views from light field videos on contemporary hardware in real time. Only video frames containing color information are needed; no additional attributes, such as captured depth, are required. Drawbacks of previous proposals, such as block artifacts in the defocused parts of the scene or the need to set the depth range manually, are also resolved. The paper describes a complete solution to the main memory and performance issues of light field rendering on contemporary personal computers. The proposed integration of high-quality light field videos supersedes previous approaches, and measurements and experimental results are provided. While reaching the same quality as slower state-of-the-art approaches, the method runs in real time, which makes it suitable for industry and real-life scenarios as an alternative to standard 3D rendering approaches.
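To illustrate the kind of per-pixel focusing the abstract refers to, the following is a minimal CPU sketch of the classic shift-and-sum refocusing idea that such methods build on: for each candidate disparity, the views of a camera array are shifted toward the reference view and averaged, and the disparity with the lowest cross-view variance is kept per pixel. This is not the authors' GPU implementation; all names, parameters, and the NumPy formulation below are illustrative assumptions.

    import numpy as np

    def refocus_and_estimate(views, grid, disparities):
        """views: (N, H, W, 3) float images from an N-camera array.
        grid: (N, 2) camera offsets (dx, dy) relative to the reference camera.
        disparities: iterable of candidate per-pixel shifts in pixels."""
        n, h, w, _ = views.shape
        best_cost = np.full((h, w), np.inf)
        best_color = np.zeros((h, w, 3))
        for d in disparities:
            shifted = np.empty_like(views)
            for i, (dx, dy) in enumerate(grid):
                # Shift each view proportionally to its offset in the camera grid.
                # Wrap-around at the borders is ignored for brevity.
                shift = (int(round(d * dy)), int(round(d * dx)))
                shifted[i] = np.roll(views[i], shift, axis=(0, 1))
            mean = shifted.mean(axis=0)                        # refocused image at disparity d
            cost = ((shifted - mean) ** 2).mean(axis=(0, 3))   # cross-view variance per pixel
            better = cost < best_cost
            best_cost = np.where(better, cost, best_cost)
            best_color = np.where(better[..., None], mean, best_color)
        return best_color, best_cost

The method described in the paper goes further: it estimates the disparity search range automatically, avoids block artifacts in defocused regions, and performs the search on the GPU while streaming the decoded video frames, whereas this sketch only shows the basic focusing cost.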



Code Availability

The code is publicly available at https://www.fit.vutbr.cz/~ichlubna/research and is free to use.

Change history

  • 14 December 2023

    The original version of this paper was updated to correct the Code Availability link.

Notes

  1. Blender Demo Files - Barcelona Pavillion by Hamza Cheggour

  2. github.com/NVlabs/instant-ngp

  3. GL_NV_vdpau_interop extension for OpenGL


Acknowledgements

This work was supported by the KDT JU project AIDOaRt, grant agreement No 101007350. The authors would like to thank the anonymous reviewers who helped to improve the quality of the manuscript.

Author information

Corresponding author

Correspondence to Tomáš Chlubna.

Ethics declarations

Competing Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Chlubna, T., Milet, T., Zemčík, P. et al. Real-Time Light Field Video Focusing and GPU Accelerated Streaming. J Sign Process Syst 95, 703–719 (2023). https://doi.org/10.1007/s11265-023-01874-8
