ABSTRACT
Unlike classic reconstruction of physical depth in computer vision, depth for 2D-to-3D stereo conversion is assigned by humans using semi-automatic painting interfaces and is consequently often dramatically wrong. Here we seek to understand why it nonetheless conveys a convincing sensation of depth. To this end, we analyze four typical disparity distortions resulting from manual 2D-to-3D stereo conversion: i) smooth remapping, ii) spatial smoothness, iii) motion-compensated temporal smoothness, and iv) completeness. A perceptual experiment quantifies the impact of each distortion on the plausibility of the 3D impression relative to an undistorted reference. Close-to-natural videos with known depth were distorted in one of the four aspects above, and subjects indicated whether the distortion still allowed a plausible 3D effect. The smallest amounts of distortion that produce a significant rejection suggest a conservative upper bound on the quality requirement of 2D-to-3D conversion.
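To make the first two distortion classes concrete, the sketch below illustrates how they could be applied to a disparity map. The specific functions (a gamma-curve remapping and a box filter) are hypothetical stand-ins chosen for simplicity, not the stimuli used in the paper; they only demonstrate that i) preserves depth ordering while changing depth magnitudes, and ii) removes fine spatial disparity detail.

```python
import numpy as np

def smooth_remap(disparity, gamma=0.5):
    """Distortion i): a smooth, monotonic remapping of the disparity range.
    Hypothetical example: a gamma curve on normalized disparity, which
    changes depth magnitudes but preserves depth ordering."""
    d_min, d_max = disparity.min(), disparity.max()
    norm = (disparity - d_min) / max(d_max - d_min, 1e-8)
    return d_min + (d_max - d_min) * norm ** gamma

def spatial_smooth(disparity, radius=2):
    """Distortion ii): spatial smoothing of the disparity map.
    A plain box filter stands in here for the edge-aware blurs
    typical of semi-automatic painting interfaces."""
    k = 2 * radius + 1
    pad = np.pad(disparity, radius, mode="edge")
    out = np.zeros_like(disparity, dtype=float)
    h, w = disparity.shape
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + h, dx:dx + w]
    return out / (k * k)

# Toy disparity map in place of a real converted frame.
d = np.random.rand(32, 32)
remapped = smooth_remap(d)
blurred = spatial_smooth(d)
```

Distortions iii) and iv) would follow the same pattern: iii) filters each pixel's disparity along its motion trajectory over time, and iv) drops disparity for some image regions entirely (e.g., assigning a constant value).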