Constructing a Multivalued Representation for View Synthesis

International Journal of Computer Vision 45, 157–190 (2001)

Abstract

A fundamental problem in computer vision and graphics is arbitrary view synthesis for static 3-D scenes: rendering a user-specified viewpoint of a given scene directly from a stored representation. We propose a novel compact representation for this purpose, the multivalued representation (MVR). Starting from an image sequence captured by a moving camera undergoing either unknown planar translation or orbital motion, an MVR is derived for each preselected reference frame and may then be used to synthesize arbitrary views of the scene. The representation consists of multiple depth and intensity levels, in which the k-th level contains the points occluded by exactly k surfaces. To build an MVR with respect to a particular reference frame, dense depth maps are first computed for all neighboring frames of the reference frame. The depth maps are then combined into a single map in which points are organized by occlusion rather than by coherent affine motion. This grouping allows the number of levels to be determined automatically and helps reduce the artifacts caused by occlusions in the scene. An iterative multiframe algorithm is presented for dense depth estimation that both handles low-contrast regions and produces piecewise smooth depth maps. Reconstructed views as well as arbitrary flyarounds of real scenes are presented to demonstrate the effectiveness of the approach.
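The level structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the depth samples from neighboring frames have already been projected into the reference frame, and the function name `build_levels`, the sample layout `(x, y, depth, intensity)`, and the tolerance `depth_tol` are hypothetical choices made only for illustration. Samples along each pixel's ray are sorted by depth, so the k-th distinct surface hit lands in level k, that is, the point is occluded by exactly k nearer surfaces.

```python
from collections import defaultdict


def build_levels(samples, depth_tol=1e-3):
    """Group reference-frame samples into occlusion levels (illustrative only).

    samples: iterable of (x, y, depth, intensity) tuples, assumed to be
             already expressed in the reference frame (an assumption of
             this sketch, not a step taken from the paper).
    Returns a list of dicts where levels[k][(x, y)] = (depth, intensity)
    for the point at that pixel occluded by exactly k nearer surfaces.
    """
    per_pixel = defaultdict(list)
    for x, y, depth, intensity in samples:
        per_pixel[(x, y)].append((depth, intensity))

    levels = []
    for pixel, hits in per_pixel.items():
        hits.sort(key=lambda h: h[0])  # nearest surface first
        distinct = []
        for depth, intensity in hits:
            # Treat samples within depth_tol of the last kept surface as
            # repeated observations of that same surface and skip them.
            if distinct and depth - distinct[-1][0] <= depth_tol:
                continue
            distinct.append((depth, intensity))
        for k, hit in enumerate(distinct):
            while len(levels) <= k:
                levels.append({})  # the level count grows only as needed
            levels[k][pixel] = hit
    return levels


samples = [
    (0, 0, 2.0, 10),     # visible surface at pixel (0, 0)
    (0, 0, 5.0, 20),     # surface hidden behind it
    (0, 0, 2.0005, 11),  # repeated observation of the visible surface
    (1, 0, 3.0, 30),     # pixel with a single surface
]
levels = build_levels(samples)
```

In this sketch the number of levels is not chosen in advance; it equals the largest number of distinct surfaces found along any single ray, which is one simple way to read the abstract's remark that grouping by occlusion lets the level count be determined automatically.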

Cite this article

Chang, N.L., Zakhor, A. Constructing a Multivalued Representation for View Synthesis. International Journal of Computer Vision 45, 157–190 (2001). https://doi.org/10.1023/A:1012476031602

  • DOI: https://doi.org/10.1023/A:1012476031602