Recognizing Activities with Multiple Cues

Biswas, Rahul; Thrun, Sebastian; Fujimura, Kikuo

doi:10.1007/978-3-540-75703-0_18

Rahul Biswas¹,
Sebastian Thrun¹ &
Kikuo Fujimura²

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4814)

Included in the following conference series:

Workshop on Human Motion

Abstract

In this paper, we introduce a first-order probabilistic model that combines multiple cues to classify human activities from video data accurately and robustly. Our system works in a realistic office setting with background clutter, natural illumination, different people, and partial occlusion. The model we present is compact: it requires only fifteen sentences of first-order logic, grouped as a Dynamic Markov Logic Network (DMLN), to implement the probabilistic model, and it leverages existing state-of-the-art work in pose detection and object recognition.
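
To make the idea of combining cues through weighted logical sentences concrete, the following is a minimal sketch of standard Markov logic semantics, where the probability of a world is proportional to the exponential of the summed weights of the sentences it satisfies. It is not the authors' fifteen-sentence model: the atoms (`HandNearHead`, `PhoneVisible`, `Phoning`), the three rules, and their weights are hypothetical, and the temporal (dynamic) part of a DMLN is omitted.

```python
import math
from itertools import product

# Hypothetical ground atoms: two observed cues and one activity to infer.
ATOMS = ["HandNearHead", "PhoneVisible", "Phoning"]

# Hypothetical weighted sentences (weight, truth function over a world).
# These are illustrative only and are not taken from the paper.
RULES = [
    (1.5, lambda w: (not w["HandNearHead"]) or w["Phoning"]),  # pose cue => activity
    (2.0, lambda w: (not w["PhoneVisible"]) or w["Phoning"]),  # object cue => activity
    (1.0, lambda w: not w["Phoning"]),                         # weak prior against activity
]

def score(world):
    """Unnormalized probability: exp of the summed weights of satisfied rules."""
    return math.exp(sum(weight for weight, holds in RULES if holds(world)))

def posterior(query, evidence):
    """P(query is true | evidence), by enumerating all unobserved atoms."""
    hidden = [a for a in ATOMS if a not in evidence]
    numerator = 0.0
    denominator = 0.0
    for values in product([False, True], repeat=len(hidden)):
        world = dict(evidence)
        world.update(zip(hidden, values))
        s = score(world)
        denominator += s
        if world[query]:
            numerator += s
    return numerator / denominator

if __name__ == "__main__":
    both = {"HandNearHead": True, "PhoneVisible": True}
    pose_only = {"HandNearHead": True, "PhoneVisible": False}
    neither = {"HandNearHead": False, "PhoneVisible": False}
    for name, ev in [("both cues", both), ("pose only", pose_only), ("no cues", neither)]:
        print(name, round(posterior("Phoning", ev), 3))
```

In this toy setting each additional cue raises the posterior of the activity, which is the sense in which the cues are combined. Exhaustive enumeration is only feasible for a handful of atoms; practical Markov logic systems rely on approximate inference instead.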

Author information

Authors and Affiliations

Stanford University: Rahul Biswas & Sebastian Thrun
Honda Research Institute: Kikuo Fujimura

Editor information

Ahmed Elgammal, Bodo Rosenhahn, Reinhard Klette

Cite this paper

Biswas, R., Thrun, S., Fujimura, K. (2007). Recognizing Activities with Multiple Cues. In: Elgammal, A., Rosenhahn, B., Klette, R. (eds) Human Motion – Understanding, Modeling, Capture and Animation. HuMo 2007. Lecture Notes in Computer Science, vol 4814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75703-0_18

DOI: https://doi.org/10.1007/978-3-540-75703-0_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75702-3
Online ISBN: 978-3-540-75703-0
eBook Packages: Computer Science, Computer Science (R0)
