Editorial Notes
The authors have requested minor, non-substantive changes to the Version of Record (VoR) and, in accordance with ACM policies, a Corrected Version of Record was published on December 29, 2023. For reference, the original VoR may still be accessed via the Supplemental Material section on this page.
ABSTRACT
Music plays an essential role in games and animation, particularly in dance content, where it creates immersive and entertaining experiences. Although recent studies have made strides in generating dance music from videos, their practicality for integrating music into games and animation remains limited. In this context, we present a method capable of generating plausible dance music from 3D motion data and genre labels. Our approach combines a U-Net-based latent diffusion model with a pre-trained VAE. To evaluate the proposed model, we employ metrics that assess several audio properties: beat alignment, audio quality, motion-music correlation, and genre score. The quantitative results show that our approach outperforms previous methods. Furthermore, we demonstrate that our model can generate audio that seamlessly fits in-the-wild motion data. This capability enables us to create plausible dance music that complements the dynamic movements of characters and enhances the overall audiovisual experience in interactive media. Examples from our proposed model are available at this link: https://dmdproject.github.io/.
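
The abstract names the main components (a U-Net-based latent diffusion model operating in the latent space of a pre-trained VAE, conditioned on 3D motion data and a genre label) but not how they are wired together. The following is a minimal sketch of one training step under stated assumptions: the module names, feature dimensions, pooled conditioning scheme, and the toy MLP standing in for the U-Net are all illustrative, not the paper's implementation.

# Hedged sketch of a motion- and genre-conditioned latent diffusion training step.
# Only the pairing of a U-Net-style denoiser with a pre-trained VAE latent space
# comes from the abstract; everything else here is an illustrative assumption.
import torch
import torch.nn as nn

class MotionConditionedDenoiser(nn.Module):
    """Stand-in for the U-Net denoiser, conditioned on motion, genre, and timestep."""
    def __init__(self, latent_dim=64, motion_dim=147, num_genres=10, hidden=256):
        super().__init__()
        self.motion_proj = nn.Linear(motion_dim, hidden)   # per-frame 3D motion features
        self.genre_emb = nn.Embedding(num_genres, hidden)  # genre label conditioning
        self.time_emb = nn.Embedding(1000, hidden)         # diffusion timestep embedding
        self.net = nn.Sequential(                          # toy MLP in place of the U-Net
            nn.Linear(latent_dim + hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, z_t, t, motion, genre):
        # motion: (B, T, motion_dim); pooled over time for a single conditioning vector
        cond = self.motion_proj(motion).mean(dim=1) + self.genre_emb(genre) + self.time_emb(t)
        return self.net(torch.cat([z_t, cond], dim=-1))    # predicted noise epsilon

def diffusion_loss(model, z0, motion, genre, alphas_cumprod):
    # One DDPM step (Ho et al. 2020): noise the VAE latent z0 of the ground-truth
    # audio, then regress the added noise given the motion and genre conditions.
    t = torch.randint(0, len(alphas_cumprod), (z0.size(0),))
    a = alphas_cumprod[t].unsqueeze(-1)
    eps = torch.randn_like(z0)
    z_t = a.sqrt() * z0 + (1 - a).sqrt() * eps             # forward noising q(z_t | z_0)
    return nn.functional.mse_loss(model(z_t, t, motion, genre), eps)

At inference time, one would start from Gaussian noise, run the reverse denoising process conditioned on the input motion clip and genre label, and decode the final latent with the VAE decoder; the abstract does not specify the waveform reconstruction stage (e.g., a vocoder or Griffin-Lim).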
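
Among the evaluation metrics listed, beat alignment is the most self-contained to illustrate. The sketch below follows a common form from the dance-generation literature (e.g., AIST++): kinematic beats are taken as local minima of mean joint speed, and each is scored by a Gaussian reward on its distance to the nearest musical beat detected with librosa. The kinematic-beat heuristic, the exponential form, and sigma=0.1 are assumptions; the paper's exact definition is not given in the abstract.

# Hedged sketch of a beat-alignment score; the specific formula is borrowed from
# prior dance work and may differ from the metric the paper actually reports.
import numpy as np
import librosa

def music_beats(audio_path, sr=22050):
    y, sr = librosa.load(audio_path, sr=sr)
    _, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
    return librosa.frames_to_time(beat_frames, sr=sr)       # beat times in seconds

def kinematic_beats(joint_positions, fps=60):
    # joint_positions: (T, J, 3) array; beats ~ local minima of mean joint speed
    vel = np.linalg.norm(np.diff(joint_positions, axis=0), axis=-1).mean(axis=1)
    minima = np.where((vel[1:-1] < vel[:-2]) & (vel[1:-1] < vel[2:]))[0] + 1
    return minima / fps                                      # beat times in seconds

def beat_align_score(kin_beats, mus_beats, sigma=0.1):
    # Mean Gaussian reward for the distance from each kinematic beat to its
    # nearest musical beat; 1.0 indicates perfect alignment.
    dists = np.abs(kin_beats[:, None] - mus_beats[None, :]).min(axis=1)
    return float(np.exp(-(dists ** 2) / (2 * sigma ** 2)).mean())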