Assessing action quality with semantic-sequence performance regression and densely distributed sample weighting

Huang, Feng; Li, Jianjun

doi:10.1007/s10489-024-05349-6

Published: 29 February 2024

Volume 54, pages 3245–3259 (2024)

Applied Intelligence

Abstract

Action Quality Assessment (AQA) is a critical branch of video understanding that offers impartial evaluations for competitive sports. Existing paradigms tend to assess action quality using equal-length clips that lack sufficient semantics, which leads to suboptimal predictions. To address this issue, we propose to conduct AQA with Semantic-Sequence Performance Regression (SSPR). SSPR first divides an action into a series of unequal-length segments according to the semantic continuity of the video, such as jumping, dropping, and entering the water in diving. Specifically, a recent Temporal Convolutional Network (TCN) is adopted for semantic-sequence segmentation. To better achieve SSPR, we design a feature fusion module that integrates the semantics of each segment using cascaded 1D convolutions. Furthermore, the imbalanced distribution of samples is usually ignored in AQA, and we propose a new loss called positive-weighting MSE (PW-MSE) to deal with it. PW-MSE encourages the network to focus more on densely distributed samples during training, which further improves the network's ranking performance. Experimental results on the benchmark datasets (i.e., UNLV-Dive and AQA-7) demonstrate that our proposed method outperforms current state-of-the-art methods.
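The abstract does not give the PW-MSE formula, so the following is only a minimal sketch of the general idea it describes: a mean squared error in which samples from densely populated score ranges receive larger weights. The histogram-based weighting, the `bin_width` parameter, and the function names `density_weights` and `weighted_mse` are assumptions made for illustration and should not be read as the authors' definition.

```python
from collections import Counter


def density_weights(scores, bin_width=10.0):
    """Hypothetical weighting: weight each sample by how crowded its score bin is.

    Samples whose scores fall in densely populated bins get larger weights.
    The weights are normalised so that their mean is 1.
    """
    bins = [int(s // bin_width) for s in scores]
    counts = Counter(bins)
    raw = [counts[b] / len(scores) for b in bins]
    mean_raw = sum(raw) / len(raw)
    return [r / mean_raw for r in raw]


def weighted_mse(predictions, targets, weights):
    """Average of per-sample squared errors, each scaled by its weight."""
    total = sum(w * (p - t) ** 2 for p, t, w in zip(predictions, targets, weights))
    return total / len(targets)


# Toy scores (made up): four samples cluster in the 80s, one is an outlier.
targets = [82.0, 85.5, 88.0, 84.0, 30.0]
predictions = [80.0, 86.0, 87.0, 85.0, 40.0]

weights = density_weights(targets)
loss = weighted_mse(predictions, targets, weights)
plain_mse = weighted_mse(predictions, targets, [1.0] * len(targets))
```

In this toy example the isolated low score contributes less to `loss` than to `plain_mse`, so the optimisation signal is dominated by the crowded score range. In a training pipeline the weights would more likely be computed once over the training labels and applied per batch; that choice is also an assumption here.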

Data Availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work was supported in part by the Key Research and Development Plan of Zhejiang under Grant 2021C03131 and in part by the National Natural Science Foundation of China under Grant 61871170. We would like to thank Xiao-Diao Chen and Wen Wu for collaborating with us in proofreading and refining the paper.

Author information

Authors and Affiliations

School of Computer Science and Engineering, Hangzhou Dianzi University, Hangzhou, 310018, China
Feng Huang
School of Information Science and Technology, Hangzhou Normal University, Hangzhou, 311121, China
Jianjun Li

Contributions

Feng Huang: Data acquisition, Experiments, Validation, Investigation, Writing & editing, Resources. Jianjun Li (corresponding author): Conceptualization, Methodology, Funding acquisition, Supervision, Writing - review & editing.

Corresponding author

Correspondence to Jianjun Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical and informed consent for data used

The data used in this paper are from publicly available datasets and do not violate any ethical guidelines.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huang, F., Li, J. Assessing action quality with semantic-sequence performance regression and densely distributed sample weighting. Appl Intell 54, 3245–3259 (2024). https://doi.org/10.1007/s10489-024-05349-6

Accepted: 17 February 2024
Published: 29 February 2024
Issue Date: February 2024
DOI: https://doi.org/10.1007/s10489-024-05349-6
