DOI: 10.1145/3652583.3657998
ICMR '24 Conference Proceedings · Research Article

CoDancers: Music-Driven Coherent Group Dance Generation with Choreographic Unit

Published: 07 June 2024

Abstract

Dance and music are intimately interconnected, and group dance is a crucial part of dance artistry. Consequently, music-driven group dance generation is a fundamental and challenging task in fields such as education, art, and sports. However, existing methods fail to fully explore group dance coherence. We therefore propose CoDancers, a novel and efficient retrieval-based music-driven group dance generation framework. CoDancers improves performance by decomposing group dance coherence into individual movement coherence and group interaction coherence for specialized design, incorporating a Spatial-Temporal Group Dance Blender block, an Acoustic-Semantic Music Miner block, and a Stereotype-Reducing Dance Generator block. Experimental results on a public dataset demonstrate that our method outperforms existing baselines and achieves state-of-the-art performance. The code is available at https://github.com/XulongT/CoDancers.




    Published In

ICMR '24: Proceedings of the 2024 International Conference on Multimedia Retrieval
May 2024, 1379 pages
ISBN: 9798400706196
DOI: 10.1145/3652583

Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. group dance generation
    2. multimodal learning
    3. music-driven dance generation
    4. retrieval-based dance generation

    Acceptance Rates

Overall Acceptance Rate: 254 of 830 submissions (31%)

