Stochastic Representation and Recognition of High-Level Group Activities

International Journal of Computer Vision

Abstract

This paper describes a stochastic methodology for recognizing various types of high-level group activities. Our system maintains a probabilistic representation of each group activity, describing how the individual activities of its group members must be organized temporally, spatially, and logically. To recognize a represented group activity, the system searches for the set of group members that has the maximum posterior probability of satisfying its representation. We designed a hierarchical recognition algorithm that uses Markov chain Monte Carlo (MCMC)-based sampling of the probability distribution, detecting group activities and finding the acting groups simultaneously. The system has been tested on complex activities such as ‘a group of thieves stealing an object from another group’ and ‘a group assaulting a person’, using videos downloaded from YouTube as well as videos that we recorded ourselves. Experimental results show that our system recognizes a wide range of group activities more reliably and accurately than previous approaches.
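The search for a maximum-posterior set of group members can be illustrated with a minimal sketch. This is not the authors' hierarchical algorithm, whose details are not given in the abstract. It is a generic Metropolis sampler over subsets of tracked people, and the scoring function `toy_log_posterior`, the `center` and `radius` parameters, and the sample positions are all hypothetical stand-ins for a real activity representation.

```python
import math
import random


def toy_log_posterior(group, positions, center=(0.0, 0.0), radius=2.0):
    """Hypothetical score (an assumption, not the paper's model):
    +1 for each member within `radius` of `center`, -2 for each member outside."""
    score = 0.0
    for i in group:
        x, y = positions[i]
        dist = math.hypot(x - center[0], y - center[1])
        score += 1.0 if dist <= radius else -2.0
    return score


def mcmc_group_search(positions, log_post, n_iters=2000, seed=0):
    """Metropolis sampling over subsets of people; returns the best subset visited."""
    rng = random.Random(seed)
    n = len(positions)
    current = frozenset()
    current_lp = log_post(current, positions)
    best, best_lp = current, current_lp
    for _ in range(n_iters):
        # Symmetric proposal: toggle the membership of one randomly chosen person.
        i = rng.randrange(n)
        proposal = current ^ {i}
        proposal_lp = log_post(proposal, positions)
        # Accept uphill moves always, downhill moves with probability exp(delta).
        delta = proposal_lp - current_lp
        if delta >= 0 or rng.random() < math.exp(delta):
            current, current_lp = proposal, proposal_lp
        # Track the highest-scoring subset seen so far.
        if current_lp > best_lp:
            best, best_lp = current, current_lp
    return best, best_lp


if __name__ == "__main__":
    # Three people close to the hypothetical activity center, two far away.
    people = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (-8.0, 5.0)]
    group, score = mcmc_group_search(people, toy_log_posterior)
    print(sorted(group), score)
```

Because the proposal toggles a single member, it is symmetric, so the acceptance test needs only the difference in log scores. A real system would replace the toy score with the posterior of the group satisfying the temporal, spatial, and logical constraints of the activity representation.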



Author information

Correspondence to M. S. Ryoo.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

(AVI 4.848 KB)

About this article

Cite this article

Ryoo, M.S., Aggarwal, J.K. Stochastic Representation and Recognition of High-Level Group Activities. Int J Comput Vis 93, 183–200 (2011). https://doi.org/10.1007/s11263-010-0355-5
