Image-Based Modeling by Joint Segmentation

International Journal of Computer Vision

Abstract

The paper first traces image-based modeling back to feature tracking and factorization, both developed in the group led by Kanade since the eighties. Feature tracking and factorization have inspired and motivated many important algorithms in structure from motion, 3D reconstruction, and modeling. We then revisit the recent quasi-dense approach to structure from motion. The key advantage of the quasi-dense approach is that it not only delivers structure from motion robustly enough for practical modeling purposes, but also provides a cloud of 3D points sufficiently dense to allow the objects to be explicitly modeled. To structure the available 3D points and the registered 2D image information, we argue that a joint segmentation of both the 3D and the 2D data is the fundamental stage for the subsequent modeling. We finally propose a probabilistic framework for the joint segmentation. The optimal solution to such a joint segmentation is still generally intractable, but approximate solutions are developed in this paper. These methods are implemented and validated on real data sets.

Author information

Corresponding author

Correspondence to Long Quan.

About this article

Cite this article

Quan, L., Wang, J., Tan, P. et al. Image-Based Modeling by Joint Segmentation. Int J Comput Vis 75, 135–150 (2007). https://doi.org/10.1007/s11263-007-0044-1

  • DOI: https://doi.org/10.1007/s11263-007-0044-1
