A Transfer Learning Approach for Music-driven 3D Conducting Motion Generation with Limited Data

Published: 09 October 2024

Abstract

Generating motion from audio with deep learning has been studied steadily. However, previous research has focused mainly on speech-driven 3D gesture generation and music-driven 3D dance motion generation. We aim to generate 3D motions for specific scenarios, such as conducting. To address the lack of existing training datasets, we constructed a multi-modal 3D conducting motion dataset containing 1.43 hours of data, which is, to our knowledge, small-scale. Furthermore, we propose a novel approach that uses transfer learning from a model pre-trained on a speech gesture dataset to generate 3D conducting motions. We evaluate the generated motions both with and without transfer learning, using quantitative and qualitative metrics. Our results show that the proposed method improves performance on both compared to the baseline without transfer learning.
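The transfer-learning recipe the abstract describes, pre-train on a large related dataset and then fine-tune on the small target dataset, can be sketched in miniature. This is a toy linear-regression illustration of the general technique, not the authors' actual model; all data, dimensions, and step counts here are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Source task (stand-in for the large speech-gesture dataset): plentiful data.
w_src = rng.normal(size=5)
X_src = rng.normal(size=(1000, 5))
y_src = X_src @ w_src

# Target task (stand-in for the small conducting dataset): related weights, few samples.
w_tgt = w_src + 0.1 * rng.normal(size=5)
X_tgt = rng.normal(size=(20, 5))
y_tgt = X_tgt @ w_tgt

def fit(X, y, w0, steps, lr=0.01):
    """Plain gradient descent on mean-squared error, starting from w0."""
    w = w0.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

# Pre-train on the large source dataset, then fine-tune briefly on the target.
w_pre = fit(X_src, y_src, np.zeros(5), steps=500)
w_ft = fit(X_tgt, y_tgt, w_pre, steps=20)

# Baseline: same budget of target-only training from scratch.
w_scratch = fit(X_tgt, y_tgt, np.zeros(5), steps=20)

print("fine-tuned MSE:", mse(X_tgt, y_tgt, w_ft))
print("from-scratch MSE:", mse(X_tgt, y_tgt, w_scratch))
```

Because the target task is close to the source task, the fine-tuned model starts near a good solution and reaches a lower error than training from scratch under the same small data budget, which mirrors the paper's motivation for transfer learning with limited conducting data.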


Published In

VRST '24: Proceedings of the 30th ACM Symposium on Virtual Reality Software and Technology
October 2024
633 pages
ISBN:9798400705359
DOI:10.1145/3641825

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. 3D Conducting Motion
  2. Multi-modal Dataset
  3. Music-driven Motion Generation
  4. Transfer Learning

Qualifiers

  • Abstract
  • Research
  • Refereed limited

Conference

VRST '24

Acceptance Rates

Overall Acceptance Rate 66 of 254 submissions, 26%
