DOI: 10.1145/3652583.3657998
ICMR '24 Conference Proceedings · Research Article

CoDancers: Music-Driven Coherent Group Dance Generation with Choreographic Unit

Published: 07 June 2024

Abstract

Dance and music are intimately interconnected, and group dance is a crucial part of dance artistry. Consequently, music-driven group dance generation is a fundamental and challenging task in fields such as education, art, and sports. However, existing methods fail to fully explore group dance coherence. We therefore propose CoDancers, a novel and efficient retrieval-based music-driven group dance generation framework. CoDancers improves performance by decomposing group dance coherence into individual movement coherence and group interaction coherence for specialized design, incorporating a Spatial-Temporal Group Dance Blender block, an Acoustic-Semantic Music Miner block, and a Stereotype-Reducing Dance Generator block. Experimental results on a public dataset demonstrate that our method outperforms existing baselines and achieves state-of-the-art performance. The code is available at https://github.com/XulongT/CoDancers.




    Published In

ICMR '24: Proceedings of the 2024 International Conference on Multimedia Retrieval
May 2024, 1379 pages
ISBN: 9798400706196
DOI: 10.1145/3652583

Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. group dance generation
    2. multimodal learning
    3. music-driven dance generation
    4. retrieval-based dance generation

    Acceptance Rates

Overall Acceptance Rate: 254 of 830 submissions (31%)

