Group Activity Recognition by Exploiting Position Distribution and Appearance Relation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12572)

Abstract

Group activity recognition in multi-person scene videos is a challenging task. Most previous approaches do not provide a practical way to describe the relations among persons and their distribution within the scene, both of which are important for understanding group activities. To this end, we propose a two-stream relation network that handles position distribution information and appearance relation information simultaneously. For the former, we build a Position Distribution Network (PDN) to obtain the spatial position distribution. For the latter, we propose an Appearance Relation Network (ARN) to explore the appearance relations among the individuals in the scene. We fuse the two cues, i.e. position distribution and appearance relation, to form the global representation for group activity recognition. Extensive experiments on two widely used group activity datasets demonstrate the effectiveness and superiority of the proposed framework.
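The two-stream idea in the abstract can be illustrated with a toy sketch. This is not the authors' PDN or ARN, whose internals the abstract does not specify. It assumes a simple stand-in for each stream: an occupancy histogram over person centers for the position cue, and a cosine-similarity-weighted pooling of per-person feature vectors for the appearance cue, fused by concatenation. All function names, the grid size, and the input values are hypothetical.

```python
import math

def position_histogram(centers, grid=(2, 3)):
    """Stand-in position cue: normalized occupancy grid of person centers.

    `centers` holds (x, y) pairs scaled to [0, 1]; `grid` is (rows, cols).
    """
    rows, cols = grid
    hist = [0.0] * (rows * cols)
    for x, y in centers:
        r = min(int(y * rows), rows - 1)  # clamp so that 1.0 falls in the last cell
        c = min(int(x * cols), cols - 1)
        hist[r * cols + c] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def appearance_relation(features):
    """Stand-in appearance cue: relation-weighted pooling.

    Each person aggregates all features with softmax weights over cosine
    similarity; the aggregated vectors are then mean-pooled over persons.
    """
    n = len(features)
    dim = len(features[0])
    pooled = [0.0] * dim
    for i in range(n):
        scores = [math.exp(cosine(features[i], features[j])) for j in range(n)]
        z = sum(scores)
        for j in range(n):
            w = scores[j] / z
            for d in range(dim):
                pooled[d] += w * features[j][d] / n
    return pooled

def fuse(position_vec, appearance_vec):
    """Assumed fusion: plain concatenation of the two cue vectors."""
    return position_vec + appearance_vec

# Hypothetical inputs: three persons with normalized centers and 2-D features.
centers = [(0.1, 0.2), (0.5, 0.5), (0.9, 0.8)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
group_repr = fuse(position_histogram(centers), appearance_relation(features))
```

In a real system, `group_repr` would feed a classifier over group activity labels, and both streams would be learned networks rather than the fixed functions assumed here.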

Acknowledgments

This work was supported by the Foundation for Innovative Research Groups through the National Natural Science Foundation of China (Grant No. 61421003) and by the CCF-Tencent Rhino-Bird Research Fund.

Author information

Corresponding author

Correspondence to Annan Li.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Pei, D., Li, A., Wang, Y. (2021). Group Activity Recognition by Exploiting Position Distribution and Appearance Relation. In: Lokoč, J., et al. MultiMedia Modeling. MMM 2021. Lecture Notes in Computer Science, vol 12572. Springer, Cham. https://doi.org/10.1007/978-3-030-67832-6_11

  • DOI: https://doi.org/10.1007/978-3-030-67832-6_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67831-9

  • Online ISBN: 978-3-030-67832-6

  • eBook Packages: Computer Science, Computer Science (R0)
