High-quality video view interpolation using a layered representation

Published: 01 August 2004

Abstract

The ability to interactively control viewpoint while watching a video is an exciting application of image-based rendering. The goal of our work is to render dynamic scenes with interactive viewpoint control using a relatively small number of video cameras. In this paper, we show how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms. Once these video streams have been processed, we can synthesize any intermediate view between cameras at any time, with the potential for space-time manipulation.

In our approach, we first use a novel color segmentation-based stereo algorithm to generate high-quality photoconsistent correspondences across all camera views. Mattes for areas near depth discontinuities are then automatically extracted to reduce artifacts during view synthesis. Finally, a novel temporal two-layer compressed representation that handles matting is developed for rendering at interactive rates.
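As a rough illustration of how a matted two-layer representation might be combined during view synthesis, the sketch below composites a boundary-layer pixel over a main-layer pixel with the standard "over" operator, then linearly blends the results from two neighboring cameras. This is not the authors' implementation: the abstract gives no formulas, so the function names, the per-pixel dictionary layout, and the linear blend weight `t` are all assumptions made for illustration, and the warping of each layer into the novel view is assumed to have happened already.

```python
# Hypothetical sketch: per-pixel compositing and blending for two-layer view
# interpolation. All names and the data layout are illustrative assumptions.

def composite_over(front_rgb, front_alpha, back_rgb):
    """Standard 'over' compositing of a matted front pixel onto a back pixel."""
    return tuple(
        front_alpha * f + (1.0 - front_alpha) * b
        for f, b in zip(front_rgb, back_rgb)
    )


def blend_views(pixel_a, pixel_b, t):
    """Linearly blend colours from camera A and camera B.

    t = 0.0 places the novel view at camera A, t = 1.0 at camera B.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return tuple((1.0 - t) * a + t * b for a, b in zip(pixel_a, pixel_b))


def render_pixel(view_a, view_b, t):
    """Composite each camera's boundary layer over its main layer, then blend.

    Each view is a dict with keys 'main_rgb', 'boundary_rgb', 'boundary_alpha',
    assumed to be already warped into the novel viewpoint.
    """
    colour_a = composite_over(
        view_a["boundary_rgb"], view_a["boundary_alpha"], view_a["main_rgb"]
    )
    colour_b = composite_over(
        view_b["boundary_rgb"], view_b["boundary_alpha"], view_b["main_rgb"]
    )
    return blend_views(colour_a, colour_b, t)


if __name__ == "__main__":
    cam_a = {"main_rgb": (0.0, 0.0, 1.0),
             "boundary_rgb": (1.0, 0.0, 0.0),
             "boundary_alpha": 0.5}
    cam_b = {"main_rgb": (0.0, 1.0, 0.0),
             "boundary_rgb": (1.0, 1.0, 1.0),
             "boundary_alpha": 0.0}
    print(render_pixel(cam_a, cam_b, 0.25))
```

Keeping the matte in a separate boundary layer means that mixed foreground and background pixels near depth discontinuities are blended with their alpha values rather than copied as hard edges, which is the kind of artifact reduction the abstract describes.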

Supplementary Material

MOV File (pps051.mov)

Published In

ACM Transactions on Graphics, Volume 23, Issue 3
August 2004
684 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/1015706
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Computer Vision
  2. Dynamic Scenes
  3. Image-Based Rendering

Qualifiers

  • Article
