Abstract
3D displays have become an inevitable trend in display technology, and converting traditional and classic 2D videos to 3D is an important and effective way to alleviate the shortage of 3D content. The two central tasks in 2D-to-3D video conversion are extracting depth information from the 2D video and synthesizing an image at a new viewpoint. We propose a depth extraction method based on dense edge-preserving optical flow, which reduces matching errors in textureless regions. We further fill the holes that appear in the new-view image after 3D warping using cross-scale Gaussian and Laplacian pyramids. Experiments show that our results outperform state-of-the-art methods both visually and quantitatively.
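The cross-scale pyramid hole filling mentioned above can be illustrated as a pull-push scheme: valid pixels are averaged into progressively coarser levels (where holes shrink and eventually vanish), and coarse values are then pushed back into the holes at finer levels. The sketch below is a minimal pure-NumPy toy, not the authors' implementation: the function names, the 2x2 box-filter pyramid, and the nearest-neighbour upsampling are all illustrative simplifications of the Gaussian/Laplacian pyramid construction described in the paper, and it assumes even image dimensions.

```python
import numpy as np

def downsample(img):
    # One pyramid level: 2x2 box filter followed by decimation
    # (a crude stand-in for a Gaussian pyramid level).
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def upsample(img, shape):
    # Nearest-neighbour expansion back to the requested (even) shape.
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return up[:shape[0], :shape[1]]

def fill_holes(img, mask, levels=4):
    # Pull: average only the valid pixels into each coarser level, so holes
    # shrink as the pyramid gets coarser. Push: copy the coarse estimates
    # back into the holes at the finer level; valid pixels are untouched.
    # mask is True where a pixel is a disocclusion hole after 3D warping.
    valid = (~mask).astype(float)
    if levels == 0 or min(img.shape) < 4:
        fallback = img[~mask].mean() if valid.any() else 0.0
        return np.where(mask, fallback, img)
    num = downsample(img * valid)   # box-filtered sum of valid values
    den = downsample(valid)         # fraction of valid pixels per block
    coarse = np.where(den > 0, num / np.maximum(den, 1e-8), 0.0)
    coarse = fill_holes(coarse, den == 0, levels - 1)
    return np.where(mask, upsample(coarse, img.shape), img)
```

On a smooth image (e.g. a horizontal gradient) with a small rectangular hole, the pushed-back coarse averages land close to the missing values, which is why this family of methods works well for the low-frequency content that dominates disoccluded background regions.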
Acknowledgements
This work was supported by the Industrial Prospective Project of the Jiangsu Technology Department under Grant No. BE2017081.
Cite this article
Yao, L., Liu, Z. & Wang, B. 2D-to-3D conversion using optical flow based depth generation and cross-scale hole filling algorithm. Multimed Tools Appl 78, 10543–10564 (2019). https://doi.org/10.1007/s11042-018-6583-3