Probabilistic Methods for Finding People

International Journal of Computer Vision

Abstract

Finding people in pictures presents a particularly difficult object recognition problem. We show how to find people by finding candidate body segments, and then constructing assemblies of segments that are consistent with the constraints on the appearance of a person that result from kinematic properties. Since a reasonable model of a person requires at least nine segments, it is not possible to inspect every group, due to the huge combinatorial complexity.
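To give a sense of the scale involved, the short calculation below counts ordered groups of nine segments. The figure of 100 candidate segments per image is an assumption made purely for illustration and does not come from the article.

```python
import math

# Assumption for illustration only: 100 candidate segments in one image.
n_candidates = 100
n_model_segments = 9  # the abstract's lower bound for a person model

# Number of ways to assign distinct candidates to the nine model segments.
n_groups = math.perm(n_candidates, n_model_segments)
print(f"{n_groups:.2e} ordered groups to inspect")
```

Even under this modest assumption the count is far beyond what exhaustive inspection could handle.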

We propose two approaches to this problem. In one, the search can be pruned by using projected versions of a classifier that accepts groups corresponding to people. We describe an efficient projection algorithm for one popular classifier, and demonstrate that our approach can be used to determine whether images of real scenes contain people.
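The pruning idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the classifier or projection algorithm from the article: segments are reduced to distinct 1-D positions, and the hypothetical classifier `accepts` requires consecutive segments to be spaced within a fixed interval. Restricting that test to a partial group gives a necessary condition, so a rejected partial group can never grow into an accepted full group.

```python
from itertools import permutations

def accepts(group, lo=1.0, hi=2.0):
    """Toy classifier (hypothetical): consecutive spacings must lie in [lo, hi].

    Applied to a partial group it acts as a projected test: it only checks
    the constraints that involve the segments chosen so far.
    """
    return all(lo <= b - a <= hi for a, b in zip(group, group[1:]))

def brute_force(candidates, k):
    """Inspect every ordered group of k candidates."""
    found, tested = [], 0
    for group in permutations(candidates, k):
        tested += 1
        if accepts(group):
            found.append(group)
    return found, tested

def pruned_search(candidates, k):
    """Grow groups one segment at a time, dropping rejected partial groups."""
    found, tested = [], 0

    def extend(partial):
        nonlocal tested
        if len(partial) == k:
            found.append(tuple(partial))
            return
        for c in candidates:
            if c in partial:  # candidates are assumed distinct
                continue
            tested += 1
            if accepts(partial + [c]):
                extend(partial + [c])

    extend([])
    return found, tested

candidates = [0.0, 1.5, 3.0, 4.2, 9.0, 10.0]
full, n_full = brute_force(candidates, 3)
fast, n_fast = pruned_search(candidates, 3)
print(sorted(full) == sorted(fast), n_full, n_fast)
```

Both searches return the same accepted groups, but the pruned search evaluates the classifier fewer times because most partial groups are discarded before they are extended.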

The second approach employs a probabilistic framework, so that we can draw samples of assemblies with probabilities proportional to their likelihood; human-like assemblies are therefore drawn more often than non-person ones. The main performance problem is in the segmentation of images, but the overall results of both approaches on real images of people are encouraging.
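Sampling in proportion to likelihood can be illustrated with a small sketch. It assumes a handful of already enumerated assemblies and a made-up likelihood table, which sidesteps the combinatorial problem the article addresses. The names `sample_assemblies` and `toy_likelihood` are hypothetical, and this is not the sampling procedure from the article.

```python
import random
from collections import Counter

def sample_assemblies(assemblies, likelihood, n, seed=0):
    """Draw n assemblies with probability proportional to likelihood(a)."""
    rng = random.Random(seed)
    weights = [likelihood(a) for a in assemblies]
    return rng.choices(assemblies, weights=weights, k=n)

# Hypothetical unnormalised likelihoods for three candidate assemblies.
toy_likelihood = {"person-like": 9.0, "ambiguous": 0.9, "clutter": 0.1}

samples = sample_assemblies(list(toy_likelihood), toy_likelihood.get, n=1000)
counts = Counter(samples)
print(counts.most_common())
```

With these weights the person-like assembly should dominate the samples, which is the behaviour the abstract describes.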




Cite this article

Ioffe, S., Forsyth, D. Probabilistic Methods for Finding People. International Journal of Computer Vision 43, 45–68 (2001). https://doi.org/10.1023/A:1011179004708


  • DOI: https://doi.org/10.1023/A:1011179004708
