Pictorial Structures for Object Recognition

International Journal of Computer Vision

Abstract

In this paper we present a computationally efficient framework for part-based modeling and recognition of objects. Our work is motivated by the pictorial structure models introduced by Fischler and Elschlager. The basic idea is to represent an object by a collection of parts arranged in a deformable configuration. The appearance of each part is modeled separately, and the deformable configuration is represented by spring-like connections between pairs of parts. These models allow for qualitative descriptions of visual appearance, and are suitable for generic recognition problems. We address the problem of using pictorial structure models to find instances of an object in an image as well as the problem of learning an object model from training examples, presenting efficient algorithms in both cases. We demonstrate the techniques by learning models that represent faces and human bodies and using the resulting models to locate the corresponding objects in novel images.
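The abstract does not spell out the matching procedure, so the following is only a minimal sketch of the idea of parts joined by spring-like connections, under assumptions that are not taken from the paper: part locations are 1-D integers, the connections form a tree, each connection is a quadratic spring, and the per-part appearance costs are supplied as a table. The function name `match_tree` and all of its parameters are hypothetical. The inner minimisation is a plain brute-force loop and is not a stand-in for the efficient algorithms the paper presents.

```python
def match_tree(unary, parent, offset, stiffness):
    """Place every part so that appearance cost plus spring cost is minimal.

    unary[i][l]   : assumed appearance mismatch of part i at location l
    parent[i]     : index of the part that part i is attached to (part 0 is
                    the root; parent[i] < i is required for i > 0)
    offset[i]     : assumed rest displacement of part i relative to its parent
    stiffness[i]  : assumed spring constant of that connection
    """
    n, num_locs = len(unary), len(unary[0])
    # cost[i][l]: best cost of the subtree rooted at part i with part i at l
    cost = [list(row) for row in unary]
    best_child = [[0] * num_locs for _ in range(n)]

    # Children have larger indices than their parents, so a reverse sweep
    # finishes every subtree before it is folded into its parent.
    for i in range(n - 1, 0, -1):
        p = parent[i]
        for lp in range(num_locs):
            best_l, best_c = 0, float("inf")
            for li in range(num_locs):
                spring = stiffness[i] * (li - lp - offset[i]) ** 2
                c = cost[i][li] + spring
                if c < best_c:
                    best_l, best_c = li, c
            best_child[i][lp] = best_l
            cost[p][lp] += best_c

    # Pick the best root location, then read off each child's best location.
    root = min(range(num_locs), key=lambda l: cost[0][l])
    placement = [root] + [0] * (n - 1)
    for i in range(1, n):
        placement[i] = best_child[i][placement[parent[i]]]
    return placement, cost[0][root]


# Toy star-shaped model: two parts attached to a root, 10 candidate locations.
unary = [[(l - t) ** 2 for l in range(10)] for t in (2, 5, 8)]
placement, energy = match_tree(
    unary, parent=[-1, 0, 0], offset=[0, 3, 6], stiffness=[0, 1, 1]
)
print(placement, energy)
```

With n parts and h candidate locations this sketch takes on the order of n·h² steps. Restricting the connections to a tree is what lets the sweep from leaves to root recover the exact minimum rather than an approximation.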

References

  • Amini, A.A., Weymouth, T.E., and Jain, R.C. 1990. Using dynamic programming for solving variational problems in vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(9):855-867.

  • Amit, Y. and Geman, D. 1999. A computational model for visual selection. Neural Computation, 11(7):1691-1715.

  • Ayache, N.J. and Faugeras, O.D. 1986. Hyper: A new approach for the recognition and positioning of two-dimensional objects. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(1):44-54.

  • Berger, J.O. 1985. Statistical Decision Theory and Bayesian Analysis. Springer-Verlag.

  • Borgefors, G. 1986. Distance transformations in digital images. Computer Vision, Graphics, and Image Processing, 34(3):344-371.

  • Borgefors, G. 1988. Hierarchical chamfer matching: A parametric edge matching algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10(6):849-865.

  • Boykov, Y., Veksler, O., and Zabih, R. 2001. Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11):1222-1239.

  • Bregler, C. and Malik, J. 1998. Tracking people with twists and exponential maps. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 8-15.

  • Burl, M.C. and Perona, P. 1996. Recognition of planar object classes. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 223-230.

  • Burl, M.C., Weber, M., and Perona, P. 1998. A probabilistic approach to object recognition using local photometry and global geometry. In European Conference on Computer Vision, pp. II:628-641.

  • Chow, C.K. and Liu, C.N. 1968. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 14(3):462-467.

  • Cormen, T.H., Leiserson, C.E., and Rivest, R.L. 1996. Introduction to Algorithms. MIT Press and McGraw-Hill.

  • Dickinson, S.J., Biederman, I., Pentland, A.P., Eklundh, J.O., Bergevin, R., and Munck-Fairwood, R.C. 1993. The use of geons for generic 3-d object recognition. In International Joint Conference on Artificial Intelligence, pp. 1693-1699.

  • Felzenszwalb, P.F. and Huttenlocher, D.P. 2000. Efficient matching of pictorial structures. In IEEE Conference on Computer Vision and Pattern Recognition, pp. II:66-73.

  • Fischler, M.A. and Bolles, R.C. 1981. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381-395.

  • Fischler, M.A. and Elschlager, R.A. 1973. The representation and matching of pictorial structures. IEEE Transactions on Computers, 22(1):67-92.

  • Freeman, W.T. and Adelson, E.H. 1991. The design and use of steerable filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(9):891-906.

  • Gdalyahu, Y. and Weinshall, D. 1999. Flexible syntactic matching of curves and its application to automatic hierarchical classification of silhouettes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(12):1312-1328.

  • Geman, S. and Geman, D. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6):721-741.

  • Grimson, W.E.L. and Lozano-Perez, T. 1987. Localizing overlapping parts by searching the interpretation tree. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9(4):469-482.

  • Gumbel, E.J., Greenwood, J.A., and Durand, D. 1953. The circular normal distribution: Theory and tables. Journal of the American Statistical Association, 48:131-152.

  • Huttenlocher, D.P., Klanderman, G.A., and Rucklidge, W.J. 1993. Comparing images using the Hausdorff distance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(9):850-863.

  • Huttenlocher, D.P. and Ullman, S. 1990. Recognizing solid objects by alignment with an image. International Journal of Computer Vision, 5(2):195-212.

  • Ioffe, S. and Forsyth, D.A. 2001. Probabilistic methods for finding people. International Journal of Computer Vision, 43(1):45-68.

  • Ishikawa, H. and Geiger, D. 1998. Segmentation by grouping junctions. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 125-131.

  • Ju, S.X., Black, M.J., and Yacoob, Y. 1996. Cardboard people: A parameterized model of articulated motion. In International Conference on Automatic Face and Gesture Recognition, pp. 38-44.

  • Karzanov, A.V. 1992. Quick algorithm for determining the distances from the points of the given subset of an integer lattice to the points of its complement. Cybernetics and System Analysis, pp. 177-181. Translation from the Russian by Julia Komissarchik.

  • Lamdan, Y., Schwartz, J.T., and Wolfson, H.J. 1990. Affine invariant model-based object recognition. IEEE Transactions on Robotics and Automation, 6(5):578-589.

  • Moghaddam, B. and Pentland, A.P. 1997. Probabilistic visual learning for object representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):696-710.

  • Murase, H. and Nayar, S.K. 1995. Visual learning and recognition of 3-d objects from appearance. International Journal of Computer Vision, 14(1):5-24.

  • Pearl, J. 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann.

  • Pentland, A.P. 1987. Recognition by parts. In IEEE International Conference on Computer Vision, pp. 612-620.

  • Rabiner, L. and Juang, B. 1993. Fundamentals of Speech Recognition. Prentice Hall.

  • Ramanan, D. and Forsyth, D.A. 2003. Finding and tracking people from the bottom up. In IEEE Conference on Computer Vision and Pattern Recognition, pp. II:467-474.

  • Rao, R.P.N. and Ballard, D.H. 1995. An active vision architecture based on iconic representations. Artificial Intelligence, 78(1/2):461-505.

  • Rivlin, E., Dickinson, S.J., and Rosenfeld, A. 1995. Recognition by functional parts. Computer Vision and Image Understanding, 62(2):164-176.

  • Roberts, L.G. 1965. Machine perception of 3-d solids. In Optical and Electro-optical Information Processing, pp. 159-197.

  • Rucklidge, W. 1996. Efficient Visual Recognition Using the Hausdorff Distance. Springer-Verlag, LNCS 1173.

  • Sebastian, T.B., Klein, P.N., and Kimia, B.B. 2001. Recognition of shapes by editing shock graphs. In IEEE International Conference on Computer Vision, pp. I:755-762.

  • Turk, M. and Pentland, A.P. 1991. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1):71-96.

  • Wells, W.M. III 1986. Efficient synthesis of Gaussian filters by cascaded uniform filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(2):234-239.

About this article

Cite this article

Felzenszwalb, P.F., Huttenlocher, D.P. Pictorial Structures for Object Recognition. International Journal of Computer Vision 61, 55–79 (2005). https://doi.org/10.1023/B:VISI.0000042934.15159.49

  • DOI: https://doi.org/10.1023/B:VISI.0000042934.15159.49
