Abstract
Action quality assessment has attracted growing attention from computer vision researchers. In this paper, we propose an end-to-end framework based on a fragment-based 3D convolutional neural network to assess action quality in videos. The loss function combines a ranking loss with the mean squared error (MSE), so that optimization accounts for both the score values and their relative ordering: training narrows the gap between the predictions and the ground-truth scores while also making the predictions satisfy the ranking constraint. The proposed network learns the evaluation criteria of actions and works well with limited training data. Extensive experiments on three public datasets show that our method achieves state-of-the-art results.
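The combined objective described above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the pairwise hinge form of the ranking term, the weight `alpha`, the `margin` parameter, and all function names below are assumptions made for illustration, not the paper's definition.

```python
from itertools import combinations


def mse_loss(pred, target):
    """Mean squared error between predicted and ground-truth scores."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def pairwise_ranking_loss(pred, target, margin=0.0):
    """Hinge penalty on pairs whose predicted order contradicts the ground truth.

    Assumed form: for each pair (i, j) with different ground-truth scores,
    penalize max(0, margin - sign(t_i - t_j) * (p_i - p_j)), averaged over
    all pairs in the batch.
    """
    pairs = list(combinations(range(len(pred)), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        if target[i] == target[j]:
            continue  # tied ground-truth scores impose no ordering
        sign = 1.0 if target[i] > target[j] else -1.0
        total += max(0.0, margin - sign * (pred[i] - pred[j]))
    return total / len(pairs)


def combined_loss(pred, target, alpha=1.0, margin=0.0):
    """MSE plus a weighted ranking term (alpha is a hypothetical weight)."""
    return mse_loss(pred, target) + alpha * pairwise_ranking_loss(pred, target, margin)


if __name__ == "__main__":
    # Two videos whose predicted order is the reverse of the ground truth.
    print(combined_loss([2.0, 1.0], [1.0, 2.0]))
```

Under this sketch the MSE term pulls each prediction toward its ground-truth score, while the ranking term adds a penalty only when two predictions are ordered differently from their ground-truth scores.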
This work was partially supported by the 973 Program under contract No. 2015CB351802 and by the Natural Science Foundation of China under contracts Nos. 61390511, 61472398, and 61532018.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, Y., Chai, X., Chen, X. (2018). End-To-End Learning for Action Quality Assessment. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11165. Springer, Cham. https://doi.org/10.1007/978-3-030-00767-6_12
DOI: https://doi.org/10.1007/978-3-030-00767-6_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00766-9
Online ISBN: 978-3-030-00767-6
eBook Packages: Computer Science (R0)