
Three-Dimensional Shape Knowledge for Joint Image Segmentation and Pose Tracking

International Journal of Computer Vision

Abstract

In this article we present the integration of 3-D shape knowledge into a variational model for level-set-based image segmentation and contour-based 3-D pose tracking. Given the surface model of an object that is visible in the images of one or more cameras calibrated to the same world coordinate system, the object contour extracted by the segmentation method is used to estimate the 3-D pose parameters of the object. Conversely, the surface model projected onto the image plane helps, in a top-down manner, to improve the extraction of the contour. Common alternative segmentation approaches that integrate 2-D shape knowledge face the problem that an object can look very different from different viewpoints, whereas a 3-D free-form model ensures that the model can fit the image data well in every view. In addition, the approach solves the problem of determining the object’s pose in 3-D space. The performance is demonstrated in numerous experiments with a monocular and a stereo camera system.
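The projection step mentioned in the abstract, where a surface model is mapped onto the image plane under a given pose, can be illustrated with a minimal sketch. This is not the authors' formulation: it assumes an ideal pinhole camera, an axis-angle rotation parameterization, and a model given as a plain array of 3-D points. The names `rodrigues`, `project_model`, `focal`, and `principal_point` are hypothetical and chosen only for illustration.

```python
import numpy as np


def rodrigues(axis_angle):
    """Rotation matrix for an axis-angle vector (Rodrigues' formula)."""
    w = np.asarray(axis_angle, dtype=float)
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        # No rotation: return the identity.
        return np.eye(3)
    k = w / theta
    # Skew-symmetric cross-product matrix of the unit rotation axis.
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def project_model(points, axis_angle, translation, focal, principal_point):
    """Project 3-D model points into the image under a rigid pose.

    Assumes an ideal pinhole camera (hypothetical intrinsics: one focal
    length in pixels and a principal point), with no lens distortion.
    """
    R = rodrigues(axis_angle)
    # Rigid motion: rotate each model point, then translate it.
    cam = np.asarray(points, dtype=float) @ R.T + np.asarray(translation, dtype=float)
    if np.any(cam[:, 2] <= 0):
        raise ValueError("all points must lie in front of the camera")
    # Perspective division followed by the shift to pixel coordinates.
    return focal * cam[:, :2] / cam[:, 2:3] + np.asarray(principal_point, dtype=float)
```

In a tracking loop of the kind the abstract describes, the pose parameters (`axis_angle` and `translation` in this sketch) would be adjusted until the outline of the projected model agrees with the contour extracted by the segmentation.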



Author information


Correspondence to Bodo Rosenhahn.


About this article

Cite this article

Rosenhahn, B., Brox, T. & Weickert, J. Three-Dimensional Shape Knowledge for Joint Image Segmentation and Pose Tracking. Int J Comput Vision 73, 243–262 (2007). https://doi.org/10.1007/s11263-006-9965-3


  • DOI: https://doi.org/10.1007/s11263-006-9965-3
