Abstract
This paper presents an efficient hybrid (top-down and bottom-up) framework for activity recognition based on analyzing group context in crowded scenes. The approach first discovers interacting groups of people using a graph-based clustering algorithm. Given the interacting groups, a novel group context activity descriptor is computed that captures not only the focal person’s activity but also the behaviors of that person's neighbors in the group. Finally, for a high-level understanding of human activities, we propose a bottom-up approach using a random field model to encode activity relationships between people in the scene. We evaluate our approach on two public benchmark datasets and compare the utility of our proposed descriptor with other descriptors using the same baseline recognition framework. The results of both steps show that our approach with the proposed descriptor achieves recognition rates comparable to state-of-the-art methods for activity recognition in crowded scenes.
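The first two stages described above (group discovery, then a descriptor that combines the focal person's activity with that of the group's other members) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm or the descriptor layout, so the sketch substitutes connected components of a distance-thresholded proximity graph for the clustering step and a simple label histogram for the descriptor. The function names, the `max_dist` threshold, and the integer activity labels are all assumptions made for illustration.

```python
from itertools import combinations
from math import hypot


def discover_groups(positions, max_dist=2.0):
    """Group people as connected components of a proximity graph.

    positions: list of (x, y) coordinates, one per person (assumed input).
    max_dist:  hypothetical distance threshold for linking two people.
    Returns a sorted list of groups, each a list of person indices.
    """
    n = len(positions)
    parent = list(range(n))  # union-find forest, one node per person

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of people closer than the threshold.
    for i, j in combinations(range(n), 2):
        (xi, yi), (xj, yj) = positions[i], positions[j]
        if hypot(xi - xj, yi - yj) <= max_dist:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def group_context_feature(focal, group, labels, n_classes):
    """Toy group-context feature (an assumption, not the paper's descriptor).

    Concatenates the focal person's one-hot activity label with a
    normalized histogram of the activity labels of the other group members.
    """
    own = [0.0] * n_classes
    own[labels[focal]] = 1.0
    others = [p for p in group if p != focal]
    hist = [0.0] * n_classes
    for p in others:
        hist[labels[p]] += 1.0 / len(others)
    return own + hist


if __name__ == "__main__":
    positions = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (10.5, 10.0), (30.0, 30.0)]
    labels = [1, 0, 0, 0, 1]  # hypothetical per-person activity class ids
    for group in discover_groups(positions):
        for person in group:
            print(person, group, group_context_feature(person, group, labels, 2))
```

In a fuller pipeline, per-person features of this kind would serve as inputs to a classifier, and a random field over the people in the scene would then refine the predicted labels jointly; that last stage is omitted here because the abstract gives no details of its potentials or inference procedure.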
Acknowledgements
This work was supported in part by the US Department of Justice 2009-MU-MU-K004. Any opinions, findings, conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of our sponsors.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Tran, K.N., Yan, X., Kakadiaris, I.A., Shah, S.K. (2016). A Hybrid Approach for Individual and Group Activity Analysis in Crowded Scene. In: Braz, J., et al. Computer Vision, Imaging and Computer Graphics Theory and Applications. VISIGRAPP 2015. Communications in Computer and Information Science, vol 598. Springer, Cham. https://doi.org/10.1007/978-3-319-29971-6_10
DOI: https://doi.org/10.1007/978-3-319-29971-6_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29970-9
Online ISBN: 978-3-319-29971-6
eBook Packages: Computer Science, Computer Science (R0)