Matching Trajectories of Anatomical Landmarks Under Viewpoint, Anthropometric and Temporal Transforms

International Journal of Computer Vision

Abstract

An approach is presented to match imaged trajectories of anatomical landmarks (e.g. hands, shoulders and feet) using semantic correspondences between human bodies. These correspondences provide geometric constraints for matching actions observed from different viewpoints and performed at different rates by actors of differing anthropometric proportions. The fact that human bodies have approximately constant anthropometric proportions allows an innovative use of the machinery of epipolar geometry to provide constraints for analyzing actions performed by people of different sizes, while ensuring that changes in viewpoint do not affect matching. In addition, for linear time warps, a novel measure, constructed only from image measurements of the locations of anatomical landmarks across time, is proposed to ensure that similar actions performed at different rates are accurately matched as well. An additional feature of this measure is that two actions captured by cameras moving at constant (and possibly different) velocities can also be matched. Finally, we describe how dynamic time warping can be used in conjunction with the proposed measure to match actions in the presence of nonlinear time warps. We demonstrate the versatility of our algorithm in a number of challenging sequences and applications, and report a quantitative evaluation of the matching approach presented.
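To make the pipeline described above concrete, the following is a minimal sketch of matching two landmark trajectory sets with an epipolar-geometry cost inside dynamic time warping. It is an illustration, not the paper's method: it assumes trajectories arrive as numpy arrays of shape (T, J, 2), pairs frames index-wise to estimate a single fundamental matrix with the standard normalized 8-point algorithm, and uses the ordinary symmetric epipolar distance as the per-frame cost; the paper's rate-invariant measure and anthropometric handling are not reproduced here, and all function names are hypothetical.

```python
import numpy as np

def normalize(pts):
    # Hartley normalization: translate points to their centroid and scale so the
    # mean distance from the origin is sqrt(2).
    c = pts.mean(axis=0)
    d = np.sqrt(((pts - c) ** 2).sum(axis=1)).mean()
    s = np.sqrt(2) / max(d, 1e-12)
    T = np.array([[s, 0, -s * c[0]],
                  [0, s, -s * c[1]],
                  [0, 0, 1.0]])
    ph = np.hstack([pts, np.ones((len(pts), 1))]) @ T.T
    return ph, T

def estimate_fundamental(x1, x2):
    # Normalized 8-point algorithm (Hartley & Zisserman); x1, x2 are (N, 2), N >= 8.
    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)
    A = np.stack([p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
                  p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
                  p1[:, 0], p1[:, 1], np.ones(len(p1))], axis=1)
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt      # enforce rank 2
    F = T2.T @ F @ T1                            # undo normalization
    return F / F[2, 2]

def sym_epipolar_dist(F, x1, x2):
    # Mean symmetric point-to-epipolar-line distance over corresponding landmarks.
    p1 = np.hstack([x1, np.ones((len(x1), 1))])
    p2 = np.hstack([x2, np.ones((len(x2), 1))])
    l2 = p1 @ F.T                                # epipolar lines in image 2
    l1 = p2 @ F                                  # epipolar lines in image 1
    num = np.abs(np.sum(p2 * l2, axis=1))        # |x2^T F x1|
    d = (num / np.sqrt(l2[:, 0] ** 2 + l2[:, 1] ** 2)
         + num / np.sqrt(l1[:, 0] ** 2 + l1[:, 1] ** 2))
    return d.mean()

def dtw_action_distance(traj_a, traj_b):
    # traj_*: (T, J, 2) landmark trajectories. A single F is fit from a coarse
    # index-wise frame pairing, then per-frame epipolar residuals drive standard DTW.
    Ta, Tb = len(traj_a), len(traj_b)
    n = min(Ta, Tb)
    F = estimate_fundamental(traj_a[:n].reshape(-1, 2), traj_b[:n].reshape(-1, 2))
    cost = np.array([[sym_epipolar_dist(F, traj_a[i], traj_b[j])
                      for j in range(Tb)] for i in range(Ta)])
    D = np.full((Ta + 1, Tb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            D[i, j] = cost[i - 1, j - 1] + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[Ta, Tb] / (Ta + Tb)
```

Given two such arrays, `dtw_action_distance(traj_a, traj_b)` returns a scalar alignment cost; under the stated assumptions, lower values indicate the same action observed from a different viewpoint and at a different rate.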



Author information

Corresponding author

Correspondence to Alexei Gritai.

About this article

Cite this article

Gritai, A., Sheikh, Y., Rao, C. et al. Matching Trajectories of Anatomical Landmarks Under Viewpoint, Anthropometric and Temporal Transforms. Int J Comput Vis 84, 325–343 (2009). https://doi.org/10.1007/s11263-009-0239-8
