A hierarchical Bayesian network for event recognition of human actions and interactions

Park, Sangho; Aggarwal, J. K.

doi:10.1007/s00530-004-0148-1

Special Issue on Video Surveillance
Published: August 2004

Multimedia Systems, Volume 10, pages 164–179 (2004)

Sangho Park¹ & J. K. Aggarwal¹

Abstract.

Recognizing human interactions is a challenging task due to the multiple body parts of interacting persons and the concomitant occlusions. This paper presents a method for the recognition of two-person interactions using a hierarchical Bayesian network (BN). The poses of simultaneously tracked body parts are estimated at the low level of the BN, and the overall body pose is estimated at the high level of the BN. The evolution of the poses of the multiple body parts is processed by a dynamic Bayesian network (DBN). The recognition of two-person interactions is expressed in terms of semantic verbal descriptions at multiple levels: individual body-part motions at the low level, single-person actions at the middle level, and two-person interactions at the high level. Example sequences of interacting persons illustrate the success of the proposed framework.
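The two-level estimation described in the abstract can be illustrated with a toy example. The sketch below is not the authors' model: the node names, pose states, and every probability are hypothetical values chosen only to show how evidence about individual body-part poses (low level) can be combined into a posterior over an overall body pose (high level) in a small Bayesian network.

```python
# Toy two-level Bayesian network. All states and probabilities are invented
# for illustration and are NOT taken from the paper.
BODY_POSES = ["standing", "reaching"]
PRIOR = {"standing": 0.5, "reaching": 0.5}

# P(part pose | body pose) for two hypothetical body-part nodes.
CPT = {
    "arm": {
        "standing": {"down": 0.9, "stretched": 0.1},
        "reaching": {"down": 0.2, "stretched": 0.8},
    },
    "torso": {
        "standing": {"upright": 0.8, "leaning": 0.2},
        "reaching": {"upright": 0.4, "leaning": 0.6},
    },
}


def body_pose_posterior(likelihoods):
    """Return P(body pose | evidence).

    likelihoods[part][pose] is P(image evidence | part pose), i.e. the
    low-level estimate for each tracked body part.
    """
    scores = {}
    for body in BODY_POSES:
        score = PRIOR[body]
        for part, table in CPT.items():
            # Marginalize out the hidden pose of this body part.
            score *= sum(
                table[body][pose] * likelihoods[part][pose]
                for pose in table[body]
            )
        scores[body] = score
    total = sum(scores.values())
    return {body: s / total for body, s in scores.items()}


# Evidence strongly suggests a stretched arm; the torso is ambiguous.
evidence = {
    "arm": {"down": 0.1, "stretched": 0.9},
    "torso": {"upright": 0.5, "leaning": 0.5},
}
posterior = body_pose_posterior(evidence)
print(posterior)
```

With this evidence the posterior favors the hypothetical "reaching" pose. A dynamic Bayesian network, as mentioned in the abstract, would additionally link such pose variables across successive frames; that temporal part is not shown here.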

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712, USA
Sangho Park & J. K. Aggarwal

Corresponding author

Correspondence to Sangho Park.

Additional information

Published online: 25 October 2004

About this article

Cite this article

Park, S., Aggarwal, J.K. A hierarchical Bayesian network for event recognition of human actions and interactions. Multimedia Systems 10, 164–179 (2004). https://doi.org/10.1007/s00530-004-0148-1

Issue Date: August 2004
DOI: https://doi.org/10.1007/s00530-004-0148-1
