Multidimensional Morphable Models: A Framework for Representing and Matching Object Classes

International Journal of Computer Vision

Abstract

We describe a flexible model for representing images of objects of a certain class, known a priori, such as faces, and introduce a new algorithm for matching it to a novel image, thereby performing image analysis. The flexible model, known as a multidimensional morphable model, is learned from example images of objects of a class. In this paper we introduce an effective stochastic gradient descent algorithm that automatically matches a model to a novel image. Several experiments demonstrate the robustness and the broad range of applicability of morphable models. Our approach can provide novel solutions to several vision tasks, including the computation of image correspondence, object verification and image compression.
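The matching idea in the abstract can be illustrated with a toy sketch. It is not the paper's algorithm: it assumes a texture-only model in which a novel image is approximated by a linear combination of vectorized example images, and it omits the shape (correspondence) part of a morphable model entirely. The function names, step size, batch size, and synthetic data are all hypothetical choices made for illustration. Each step estimates the gradient of the squared error from a random subset of pixels, which is what makes the descent stochastic.

```python
import numpy as np

def fit_linear_model_sgd(prototypes, target, steps=2000, batch=64, lr=0.05, seed=0):
    """Fit coefficients b so that b @ prototypes approximates target.

    prototypes: (n_prototypes, n_pixels) array of vectorized example images.
    target:     (n_pixels,) vectorized novel image.
    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n_protos, n_pixels = prototypes.shape
    b = np.zeros(n_protos)
    for _ in range(steps):
        # Sample a random subset of pixels for this step.
        idx = rng.choice(n_pixels, size=batch, replace=False)
        # Residual between the model image and the target at those pixels.
        residual = prototypes[:, idx].T @ b - target[idx]
        # Gradient of half the mean squared error with respect to b.
        grad = prototypes[:, idx] @ residual / batch
        b -= lr * grad
    return b

def demo():
    # Synthetic "images": 4 random prototypes of 400 pixels each (assumed data).
    rng = np.random.default_rng(1)
    prototypes = rng.normal(size=(4, 400))
    true_b = np.array([0.5, -0.2, 0.3, 0.4])
    # The novel image lies exactly in the span of the prototypes.
    target = true_b @ prototypes
    b = fit_linear_model_sgd(prototypes, target)
    return b, true_b

if __name__ == "__main__":
    b, true_b = demo()
    print("recovered:", np.round(b, 4))
    print("expected: ", true_b)
```

Because the synthetic target lies exactly in the span of the prototypes, the recovered coefficients should approach the true ones. With real images the fit would only be approximate, and a full morphable model would also optimize shape parameters.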

Cite this article

Jones, M.J., Poggio, T. Multidimensional Morphable Models: A Framework for Representing and Matching Object Classes. International Journal of Computer Vision 29, 107–131 (1998). https://doi.org/10.1023/A:1008074226832

  • DOI: https://doi.org/10.1023/A:1008074226832