Automatic generation of Labanotation based on human pose estimation in folk dance videos

Cai, Xingquan; Wang, Tong; Lu, Rui; Jia, Sichen; Sun, Haiyan

doi:10.1007/s00521-023-08206-8

S.I.: Applications and Techniques in Cyber Intelligence (ATCI2022)
Published: 27 January 2023

Volume 35, pages 24755–24771, (2023)

Neural Computing and Applications

Xingquan Cai ORCID: orcid.org/0000-0002-5996-2728¹,
Tong Wang¹,
Rui Lu¹,
Sichen Jia¹ &
…
Haiyan Sun¹

Abstract

Existing Labanotation generation methods suffer from low efficiency, cannot process existing videos, and are sensitive to the quality of the hardware equipment. To address these issues, we propose a new Labanotation generation method for folk dance videos based on pose estimation. Specifically, our method first extracts key frame images from the folk dance video using temporal differences. It then detects the 2D joint points of the dancer in the key frame images using the multi-scale fusion of the high-resolution net (HRNet), and maps the dancer's 2D joint point sequence to 3D with a pose projection generative adversarial network (pose projection GAN) to predict the 3D joint point coordinates. Finally, the corresponding Labanotation is generated by analyzing the estimated posture. Experimental results show that the method can convert dance movements in folk dance videos into digital Labanotation, and that automatic generation is much more efficient than manual recording. This method can quickly record endangered folk dances and contribute to the preservation and transmission of movement-based intangible cultural heritage.
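
The first and last stages of the pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the thresholds, the comparison against the last kept key frame, and the coordinate convention (x to the dancer's right, y up, z forward) are all assumptions made for illustration. The HRNet and pose projection GAN stages are omitted, so the second function simply takes 3D joint positions as given and quantizes a limb vector into one of nine horizontal directions (including "place") and three levels, as Labanotation direction symbols do.

```python
import math
import numpy as np

def extract_key_frames(frames, threshold=10.0):
    """Temporal-difference key frame selection (illustrative sketch).

    A frame becomes a key frame when its mean absolute pixel difference
    from the last kept key frame exceeds `threshold` (hypothetical value).
    `frames` is an array of grayscale images with shape (N, H, W).
    """
    frames = np.asarray(frames, dtype=np.float32)
    if len(frames) == 0:
        return []
    key_indices = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        diff = float(np.abs(frames[i] - frames[key_indices[-1]]).mean())
        if diff > threshold:
            key_indices.append(i)
    return key_indices

# Horizontal sectors of 45 degrees, counter-clockwise starting at +x (right).
HORIZONTAL = ["right", "right-forward", "forward", "left-forward",
              "left", "left-backward", "backward", "right-backward"]

def limb_direction(parent, child, level_deg=30.0, place_deg=67.5):
    """Quantize the limb vector parent -> child into (direction, level).

    Assumed axes: x = right, y = up, z = forward. The angle thresholds
    `level_deg` and `place_deg` are hypothetical, not taken from the paper.
    """
    v = np.asarray(child, dtype=float) - np.asarray(parent, dtype=float)
    norm = float(np.linalg.norm(v))
    if norm == 0.0:
        raise ValueError("parent and child joints coincide")
    # Elevation above the horizontal plane, in degrees.
    ratio = max(-1.0, min(1.0, float(v[1]) / norm))
    elevation = math.degrees(math.asin(ratio))
    if elevation > level_deg:
        level = "high"
    elif elevation < -level_deg:
        level = "low"
    else:
        level = "middle"
    # A nearly vertical limb has no meaningful horizontal direction.
    if abs(elevation) > place_deg:
        return "place", level
    azimuth = math.degrees(math.atan2(float(v[2]), float(v[0])))
    sector = int(round(azimuth / 45.0)) % 8
    return HORIZONTAL[sector], level
```

For example, under these assumptions a lower arm whose elbow is at the origin and whose wrist is at (0, 0, 1) would be labeled forward at middle level. A real system would also need temporal segmentation to decide how long each symbol lasts on the staff, which this sketch does not attempt.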

Data availability

The datasets generated and analyzed during the current study will be made available by the corresponding author upon reasonable academic request, within the limitations of informed consent, upon acceptance.

Acknowledgements

This work was supported by the Funding Project of Humanities and Social Sciences of the Ministry of Education in China (22YJAZH002) and the Funding Project of Beijing Social Science Foundation (Nos. 19YTC043, 20YTB011). We would like to thank those who care about this paper and our projects. We would also like to thank everyone who spent time reading early versions of this paper, including the anonymous reviewers.

Author information

Authors and Affiliations

School of Information Science and Technology, North China University of Technology, Beijing, 100144, China
Xingquan Cai, Tong Wang, Rui Lu, Sichen Jia & Haiyan Sun

Corresponding author

Correspondence to Xingquan Cai.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cai, X., Wang, T., Lu, R. et al. Automatic generation of Labanotation based on human pose estimation in folk dance videos. Neural Comput & Applic 35, 24755–24771 (2023). https://doi.org/10.1007/s00521-023-08206-8

Received: 20 September 2022
Accepted: 06 January 2023
Published: 27 January 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s00521-023-08206-8
