Shape-From-Silhouette Across Time Part II: Applications to Human Modeling and Markerless Motion Tracking

Cheung, Kong-man (German); Baker, Simon; Kanade, Takeo

doi:10.1007/s11263-005-6879-4

Published: 01 April 2005

International Journal of Computer Vision, Volume 63, pages 225–245 (2005)

Kong-man (German) Cheung¹,
Simon Baker² &
Takeo Kanade²

Abstract

In Part I of this paper we developed the theory and algorithms for performing Shape-From-Silhouette (SFS) across time. In this second part, we show how our temporal SFS algorithms can be applied to human modeling and markerless motion tracking. First, we build a system to acquire human kinematic models consisting of precise shape (constructed using the temporal SFS algorithm for rigid objects), joint locations, and body part segmentation (estimated using the temporal SFS algorithm for articulated objects). Once the kinematic models have been built, we show how they can be used to track the motion of the person in new video sequences. This markerless tracking algorithm is based on the Visual Hull alignment algorithm used in both temporal SFS algorithms and utilizes both geometric (silhouette) and photometric (color) information.
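For readers unfamiliar with the silhouette constraint that underlies a Visual Hull, the following is a minimal sketch of single-frame voxel carving. It is not the temporal SFS or alignment algorithm of the paper. The function names (`project`, `carve_visual_hull`), the toy orthographic cameras, and the 4x4 silhouettes are all assumptions made for illustration.

```python
from itertools import product

def project(P, X):
    """Project 3D point X with a 3x4 camera matrix P (hypothetical helper).

    Returns pixel coordinates (u, v), or None if the point is not in front
    of the camera (non-positive homogeneous scale).
    """
    x, y, z = X
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    if h[2] <= 0:
        return None
    return h[0] / h[2], h[1] / h[2]

def carve_visual_hull(voxel_centers, cameras, silhouettes):
    """Keep only voxels whose projection lies inside every silhouette."""
    kept = []
    for X in voxel_centers:
        inside_all = True
        for P, sil in zip(cameras, silhouettes):
            uv = project(P, X)
            if uv is None:
                inside_all = False
                break
            u, v = int(round(uv[0])), int(round(uv[1]))
            in_image = 0 <= v < len(sil) and 0 <= u < len(sil[0])
            if not (in_image and sil[v][u]):
                inside_all = False
                break
        if inside_all:
            kept.append(X)
    return kept

# Toy setup (assumed): two orthographic cameras written as 3x4 matrices.
cam_along_z = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]]  # u = x, v = y
cam_along_x = [[0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]  # u = z, v = y

# Each silhouette is a 4x4 binary image with a 2x2 foreground square.
square = [[1 <= u <= 2 and 1 <= v <= 2 for u in range(4)] for v in range(4)]

# Candidate voxels: integer centers of a 4x4x4 grid.
voxels = list(product(range(4), repeat=3))
hull = carve_visual_hull(voxels, [cam_along_z, cam_along_x], [square, square])
print(len(hull))
```

In this toy case the surviving voxels form the intersection of the two silhouette cones. With real data the cameras would be calibrated perspective matrices and the silhouettes would come from background subtraction.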

Author information

Authors and Affiliations

Neven Vision, 2400 Broadway, Suite #240, Santa Monica, CA, 90404-3082, USA
Kong-man (German) Cheung
The Robotics Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA, 15213, USA
Simon Baker & Takeo Kanade

Corresponding author

Correspondence to Kong-man (German) Cheung.

Electronic supplementary material

Video (mpg 4.761 KB)

About this article

Cite this article

Cheung, Km., Baker, S. & Kanade, T. Shape-From-Silhouette Across Time Part II: Applications to Human Modeling and Markerless Motion Tracking. Int J Comput Vision 63, 225–245 (2005). https://doi.org/10.1007/s11263-005-6879-4

Received: 16 December 2003
Revised: 13 December 2004
Accepted: 13 December 2004
Published: 01 April 2005
Issue Date: July 2005
DOI: https://doi.org/10.1007/s11263-005-6879-4
