Abstract
Action assessment evaluates how well an action is performed. It is widely applicable to real-world scenarios such as medical treatment and sporting events. However, existing methods for action assessment are mostly limited to individual actions and, in particular, lack modeling of the asymmetric relations among agents (e.g., between persons and objects). This limitation undermines their ability to assess actions containing asymmetrically interactive motion patterns, since subordination between agents commonly exists in interactive actions. In this work, we model the asymmetric interactions among agents for action assessment. Specifically, we propose an asymmetric interaction module (AIM) that explicitly models asymmetric interactions between agents within an action, grouping the agents into a primary one (e.g., a human) and secondary ones (e.g., objects). We perform experiments on the JIGSAWS dataset of surgical actions and additionally collect a new dataset, TASD-2, of interactive sporting actions. The results on these two interactive-action datasets show the effectiveness of our model, which achieves state-of-the-art performance. An extended experiment on the AQA-7 dataset further demonstrates that our framework generalizes to conventional action assessment.
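The abstract describes the AIM only at a high level: a primary agent's features are enhanced by one-directional interaction with secondary agents. As a minimal sketch of what such an asymmetric interaction could look like (all names, dimensions, and the attention formulation here are illustrative assumptions, not the paper's actual implementation), the primary agent can act as the sole query over secondary-agent features, so that information flows from secondaries to the primary but not back:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def asymmetric_interaction(primary, secondaries, W_q, W_k, W_v):
    """One-directional attention: the primary agent queries the secondary
    agents; secondaries never attend back, which is the asymmetry."""
    q = W_q @ primary                         # query from the primary agent, shape (d,)
    keys = secondaries @ W_k.T                # (num_secondary, d)
    vals = secondaries @ W_v.T                # (num_secondary, d)
    weights = softmax(keys @ q / np.sqrt(len(q)))  # attention over secondaries
    context = weights @ vals                  # aggregated secondary information
    return np.concatenate([primary, context])  # interaction-enhanced primary feature

# Toy example: one primary agent (e.g., a surgeon) and three secondary
# agents (e.g., instruments), each described by an 8-d feature vector.
rng = np.random.default_rng(0)
d = 8
primary = rng.normal(size=d)
secondaries = rng.normal(size=(3, d))
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
out = asymmetric_interaction(primary, secondaries, W_q, W_k, W_v)
print(out.shape)  # (16,)
```

The enhanced feature could then feed a downstream score regressor; grouping agents this way is what distinguishes the asymmetric design from symmetric, fully pairwise relation graphs.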
Notes
1.
2. Details of data preprocessing can be found in the supplementary materials.
3. Videos can be found in the supplementary materials.
Acknowledgement
This work was supported partially by the National Key Research and Development Program of China (2018YFB1004903), NSFC (U1911401, U1811461), Guangdong Province Science and Technology Innovation Leading Talents (2016TX03X157), Guangdong NSF Project (No. 2018B030312002), Guangzhou Research Project (201902010037), and Research Projects of Zhejiang Lab (No. 2019KD0AB03).
© 2020 Springer Nature Switzerland AG
Cite this paper
Gao, J. et al. (2020). An Asymmetric Modeling for Action Assessment. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12375. Springer, Cham. https://doi.org/10.1007/978-3-030-58577-8_14
Print ISBN: 978-3-030-58576-1
Online ISBN: 978-3-030-58577-8