Matching Trajectories of Anatomical Landmarks Under Viewpoint, Anthropometric and Temporal Transforms

International Journal of Computer Vision

Abstract

An approach is presented to match imaged trajectories of anatomical landmarks (e.g. hands, shoulders and feet) using semantic correspondences between human bodies. These correspondences provide geometric constraints for matching actions observed from different viewpoints and performed at different rates by actors of differing anthropometric proportions. The fact that human bodies have approximately constant anthropometric proportions allows an innovative use of the machinery of epipolar geometry to provide constraints for analyzing actions performed by people of different sizes, while ensuring that changes in viewpoint do not affect matching. In addition, for linear time warps, a novel measure, constructed only from image measurements of the locations of anatomical landmarks across time, is proposed to ensure that similar actions performed at different rates are accurately matched as well. An additional feature of this measure is that two actions captured by cameras moving at constant (and possibly different) velocities can also be matched. Finally, we describe how dynamic time warping can be used in conjunction with the proposed measure to match actions in the presence of nonlinear time warps. We demonstrate the versatility of our algorithm in a number of challenging sequences and applications, and report a quantitative evaluation of the matching approach presented.
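To make the pipeline described above concrete, the following is a minimal sketch of matching two landmark trajectory sets with an epipolar-geometry cost inside dynamic time warping. It is an illustration, not the paper's method: it assumes trajectories arrive as numpy arrays of shape (T, J, 2), pairs frames index-wise to estimate a single fundamental matrix with the standard normalized 8-point algorithm, and uses the ordinary symmetric epipolar distance as the per-frame cost; the paper's rate-invariant measure and anthropometric handling are not reproduced here, and all function names are hypothetical.

```python
import numpy as np

def normalize(pts):
    # Hartley normalization: translate points to their centroid and scale so the
    # mean distance from the origin is sqrt(2).
    c = pts.mean(axis=0)
    d = np.sqrt(((pts - c) ** 2).sum(axis=1)).mean()
    s = np.sqrt(2) / max(d, 1e-12)
    T = np.array([[s, 0, -s * c[0]],
                  [0, s, -s * c[1]],
                  [0, 0, 1.0]])
    ph = np.hstack([pts, np.ones((len(pts), 1))]) @ T.T
    return ph, T

def estimate_fundamental(x1, x2):
    # Normalized 8-point algorithm (Hartley & Zisserman); x1, x2 are (N, 2), N >= 8.
    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)
    A = np.stack([p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
                  p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
                  p1[:, 0], p1[:, 1], np.ones(len(p1))], axis=1)
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt      # enforce rank 2
    F = T2.T @ F @ T1                            # undo normalization
    return F / F[2, 2]

def sym_epipolar_dist(F, x1, x2):
    # Mean symmetric point-to-epipolar-line distance over corresponding landmarks.
    p1 = np.hstack([x1, np.ones((len(x1), 1))])
    p2 = np.hstack([x2, np.ones((len(x2), 1))])
    l2 = p1 @ F.T                                # epipolar lines in image 2
    l1 = p2 @ F                                  # epipolar lines in image 1
    num = np.abs(np.sum(p2 * l2, axis=1))        # |x2^T F x1|
    d = (num / np.sqrt(l2[:, 0] ** 2 + l2[:, 1] ** 2)
         + num / np.sqrt(l1[:, 0] ** 2 + l1[:, 1] ** 2))
    return d.mean()

def dtw_action_distance(traj_a, traj_b):
    # traj_*: (T, J, 2) landmark trajectories. A single F is fit from a coarse
    # index-wise frame pairing, then per-frame epipolar residuals drive standard DTW.
    Ta, Tb = len(traj_a), len(traj_b)
    n = min(Ta, Tb)
    F = estimate_fundamental(traj_a[:n].reshape(-1, 2), traj_b[:n].reshape(-1, 2))
    cost = np.array([[sym_epipolar_dist(F, traj_a[i], traj_b[j])
                      for j in range(Tb)] for i in range(Ta)])
    D = np.full((Ta + 1, Tb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            D[i, j] = cost[i - 1, j - 1] + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[Ta, Tb] / (Ta + Tb)
```

Given two such arrays, `dtw_action_distance(traj_a, traj_b)` returns a scalar alignment cost; under the stated assumptions, lower values indicate the same action observed from a different viewpoint and at a different rate.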



Author information

Corresponding author

Correspondence to Alexei Gritai.

About this article

Cite this article

Gritai, A., Sheikh, Y., Rao, C. et al. Matching Trajectories of Anatomical Landmarks Under Viewpoint, Anthropometric and Temporal Transforms. Int J Comput Vis 84, 325–343 (2009). https://doi.org/10.1007/s11263-009-0239-8
