Constructing a Multivalued Representation for View Synthesis

Chang, Nelson L.; Zakhor, Avideh

doi:10.1023/A:1012476031602

Published: November 2001

Volume 45, pages 157–190, (2001)

International Journal of Computer Vision

Nelson L. Chang¹ & Avideh Zakhor²


Abstract

A fundamental problem in computer vision and graphics is arbitrary view synthesis for static 3-D scenes, in which a view of the scene from a user-specified viewpoint is generated directly from a representation. We propose a novel compact representation for this purpose called the multivalued representation (MVR). Starting with an image sequence captured by a moving camera undergoing either unknown planar translation or orbital motion, an MVR is derived for each preselected reference frame and may then be used to synthesize arbitrary views of the scene. The representation comprises multiple depth and intensity levels, in which the k-th level consists of points occluded by exactly k surfaces. To build an MVR with respect to a particular reference frame, dense depth maps are first computed for all the neighboring frames of the reference frame. The depth maps are then combined into a single map, where points are organized by occlusions rather than by coherent affine motions. This grouping makes it possible to determine the number of levels automatically and helps to reduce the artifacts caused by occlusions in the scene. An iterative multiframe algorithm is presented for dense depth estimation that both handles low-contrast regions and produces piecewise smooth depth maps. Reconstructed views as well as arbitrary flyarounds of real scenes are presented to demonstrate the effectiveness of the approach.
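The level structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which is given only in the full paper. It assumes that depth samples from the neighboring frames have already been warped into the reference frame, and it orders the surfaces seen along each reference pixel by depth, so that level k holds the point lying behind exactly k nearer surfaces. The function name `build_mvr`, the sample layout `(x, y, depth, intensity)`, and the merge tolerance `depth_tol` are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_mvr(samples, depth_tol=0.05):
    """Group depth samples into occlusion-ordered levels (illustrative only).

    samples: iterable of (x, y, depth, intensity) tuples, assumed to be
             already expressed in the reference frame (hypothetical layout).
    depth_tol: hypothetical tolerance; samples closer than this to the last
               kept surface at a pixel are treated as the same surface.

    Returns a list `levels` where levels[k] maps (x, y) -> (depth, intensity)
    for the point at that pixel lying behind exactly k nearer surfaces.
    """
    per_pixel = defaultdict(list)
    for x, y, depth, intensity in samples:
        per_pixel[(x, y)].append((depth, intensity))

    levels = []
    for pixel, hits in per_pixel.items():
        hits.sort()  # nearest surface first
        surfaces = []
        for depth, intensity in hits:
            # Skip samples that duplicate the surface already kept.
            if surfaces and depth - surfaces[-1][0] <= depth_tol:
                continue
            surfaces.append((depth, intensity))
        for k, surface in enumerate(surfaces):
            # The number of levels grows only as far as the data requires.
            while len(levels) <= k:
                levels.append({})
            levels[k][pixel] = surface
    return levels
```

In this sketch the number of levels is not fixed in advance: it equals the largest number of distinct surfaces found along any single reference pixel, which loosely mirrors the abstract's point that organizing points by occlusion lets the level count be determined automatically.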

Author information

Authors and Affiliations

¹ Nelson L. Chang: Imaging Technology Department, Hewlett-Packard Laboratories, 1501 Page Mill Road, MS 4U-6, Palo Alto, CA, 94304, USA
² Avideh Zakhor: Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, 94720, USA

About this article

Cite this article

Chang, N.L., Zakhor, A. Constructing a Multivalued Representation for View Synthesis. International Journal of Computer Vision 45, 157–190 (2001). https://doi.org/10.1023/A:1012476031602

Issue Date: November 2001
DOI: https://doi.org/10.1023/A:1012476031602
