Relation-Guided Actor Attention for Group Activity Recognition

Wu, Lifang; Wang, Qi; Li, Zeyu; Xiang, Ye; Lang, Xianglong

doi:10.1007/978-3-030-88004-0_11


Conference paper
First Online: 22 October 2021

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 13019)

Abstract

Group activity recognition has received significant interest due to its wide range of practical applications in sports analysis, intelligent surveillance, and abnormal behavior detection. In a complex multi-person scenario, only a few key actors participate in the overall group activity, and the others may introduce information that is irrelevant for recognition. However, most previous approaches treat all the actors’ actions in the scene as equally important. To address this, we propose a relation-guided actor attention (RGAA) module that learns reinforced feature representations for effective group activity recognition. First, a location-aware relation module (LARM) is designed to explore the relations among pairwise actors’ features, taking both appearance and position information into account. Second, we stack all the pairwise relation features together with an actor’s own features to learn actor attention, which determines each actor’s degree of importance from both local and global information. Extensive experiments on two public benchmarks demonstrate the effectiveness of our method, which achieves state-of-the-art performance.
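
The two steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the abstract does not give the formulas, so everything concrete here is an assumption: the function names pairwise_relations and actor_attention, the scaled dot-product appearance affinity, the negative Euclidean distance as the position term, the single linear scoring layer, the random weights, and the max pooling at the end. "Local" information is read here as an actor's own embedded features and "global" information as its relations to every other actor.

```python
import numpy as np

def pairwise_relations(feats, centers, w_q, w_k):
    """Relation score for every ordered actor pair (i, j).

    Assumption: appearance affinity is a scaled dot product of two linear
    embeddings, and position enters as a negative Euclidean distance
    between bounding-box centers.
    """
    q = feats @ w_q                                    # (N, d)
    k = feats @ w_k                                    # (N, d)
    appearance = (q @ k.T) / np.sqrt(q.shape[1])       # (N, N)
    diff = centers[:, None, :] - centers[None, :, :]   # (N, N, 2)
    distance = np.linalg.norm(diff, axis=-1)           # (N, N)
    return appearance - distance

def actor_attention(feats, relations, w_f, w_a):
    """One attention weight per actor, squashed by a sigmoid."""
    local = feats @ w_f                                # (N, d) own features
    # Stack outgoing (row i) and incoming (column i) relations of actor i.
    global_ctx = np.concatenate([relations, relations.T], axis=1)  # (N, 2N)
    stacked = np.concatenate([local, global_ctx], axis=1)          # (N, d + 2N)
    logits = stacked @ w_a                             # (N,)
    return 1.0 / (1.0 + np.exp(-logits))

# Hypothetical toy input: 6 actors, 16-dim features, normalized box centers.
rng = np.random.default_rng(0)
n_actors, feat_dim, embed_dim = 6, 16, 8
feats = rng.standard_normal((n_actors, feat_dim))
centers = rng.uniform(0.0, 1.0, size=(n_actors, 2))

# Random stand-ins for weights that would be learned in practice.
w_q = rng.standard_normal((feat_dim, embed_dim)) * 0.1
w_k = rng.standard_normal((feat_dim, embed_dim)) * 0.1
w_f = rng.standard_normal((feat_dim, embed_dim)) * 0.1
w_a = rng.standard_normal(embed_dim + 2 * n_actors) * 0.1

relations = pairwise_relations(feats, centers, w_q, w_k)
attn = actor_attention(feats, relations, w_f, w_a)

# Reweight each actor's features, then pool to a group-level descriptor.
reinforced = feats * attn[:, None]
group_feature = reinforced.max(axis=0)
```

With trained weights, actors whose attention is close to zero would contribute little to the pooled group feature, which is the effect the abstract attributes to focusing on key actors.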

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China (61976010, 61802011), Beijing Municipal Education Committee Science Foundation (KM201910005024) and Postdoctoral Research Foundation (Q6042001202101).

Author information

Authors and Affiliations

Faculty of Information Technology, Beijing University of Technology, Beijing, China
Lifang Wu, Qi Wang, Zeyu Li, Ye Xiang & Xianglong Lang

Corresponding author

Correspondence to Ye Xiang.

Editor information

Editors and Affiliations

University of Science and Technology Beijing, Beijing, China
Huimin Ma
Chinese Academy of Sciences, Beijing, China
Liang Wang
Tsinghua University, Beijing, China
Changshui Zhang
Zhejiang University, Hangzhou, China
Fei Wu
Chinese Academy of Sciences, Beijing, China
Tieniu Tan
Hunan University, Changsha, China
Yaonan Wang
Sun Yat-Sen University, Guangzhou, Guangdong, China
Jianhuang Lai
Beijing Jiaotong University, Beijing, China
Yao Zhao

About this paper

Cite this paper

Wu, L., Wang, Q., Li, Z., Xiang, Y., Lang, X. (2021). Relation-Guided Actor Attention for Group Activity Recognition. In: Ma, H., et al. Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science, vol 13019. Springer, Cham. https://doi.org/10.1007/978-3-030-88004-0_11

DOI: https://doi.org/10.1007/978-3-030-88004-0_11
Published: 22 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88003-3
Online ISBN: 978-3-030-88004-0
eBook Packages: Computer Science, Computer Science (R0)
