High-quality video view interpolation using a layered representation

Authors: C. Lawrence Zitnick, Sing Bing Kang, Matthew Uyttendaele, Richard Szeliski

ACM Transactions on Graphics (TOG), Volume 23, Issue 3

Pages 600 - 608

https://doi.org/10.1145/1015706.1015766

Published: 01 August 2004

Abstract

The ability to interactively control viewpoint while watching a video is an exciting application of image-based rendering. The goal of our work is to render dynamic scenes with interactive viewpoint control using a relatively small number of video cameras. In this paper, we show how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms. Once these video streams have been processed, we can synthesize any intermediate view between cameras at any time, with the potential for space-time manipulation.

In our approach, we first use a novel color segmentation-based stereo algorithm to generate high-quality photoconsistent correspondences across all camera views. Mattes for areas near depth discontinuities are then automatically extracted to reduce artifacts during view synthesis. Finally, a novel temporal two-layer compressed representation that handles matting is developed for rendering at interactive rates.
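As a rough illustration of how a matte and a two-layer representation can feed view synthesis, the sketch below composites a boundary-layer pixel over a main-layer pixel using the standard "over" equation, then blends the results from two neighboring cameras. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names `composite_over` and `blend_views`, the per-pixel RGB tuples, and the linear weighting by a normalized viewpoint position `t` are all hypothetical choices made for the example.

```python
# Minimal sketch (assumptions, not the paper's code): per-pixel matting
# composite followed by a linear blend between two neighboring camera views.


def composite_over(boundary_rgb, alpha, main_rgb):
    """Composite a boundary-layer color over a main-layer color.

    Uses the standard matting equation C = alpha * F + (1 - alpha) * B,
    where F is the boundary (foreground) color and B is the main-layer
    (background) color. alpha is expected to lie in [0, 1].
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return tuple(alpha * f + (1.0 - alpha) * b
                 for f, b in zip(boundary_rgb, main_rgb))


def blend_views(left_rgb, right_rgb, t):
    """Blend colors rendered from two cameras into an intermediate view.

    t is a hypothetical normalized viewpoint position: 0.0 means the
    virtual view coincides with the left camera, 1.0 with the right.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    return tuple((1.0 - t) * l + t * r for l, r in zip(left_rgb, right_rgb))


if __name__ == "__main__":
    # A mostly transparent red boundary pixel over a blue main-layer pixel.
    left = composite_over((1.0, 0.0, 0.0), 0.25, (0.0, 0.0, 1.0))
    # The same scene point as seen from the right camera (made-up values).
    right = composite_over((1.0, 0.0, 0.0), 0.75, (0.0, 0.0, 1.0))
    # Virtual viewpoint halfway between the two cameras.
    print(blend_views(left, right, 0.5))
```

In this sketch the matte only matters near depth discontinuities, where a pixel mixes foreground and background colors; elsewhere `alpha` would be 1 or 0 and the composite reduces to a single layer.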

Supplementary Material

MOV File (pps051.mov), 12.21 KB

Index Terms

High-quality video view interpolation using a layered representation
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Video segmentation
      2. Computer vision tasks
        Scene understanding
  2. Computer graphics
    1. Image manipulation
    2. Rendering

Published In

ACM Transactions on Graphics Volume 23, Issue 3

August 2004

684 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/1015706

Editor:
John C. Hart

Copyright © 2004 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States
