
The Application of Neural Networks in Guitar Style Transformation

Published: 28 December 2024

Abstract

This paper explores audio style transfer using deep learning models and spectral analysis. Raw audio signals are transformed into the frequency domain with the Short-Time Fourier Transform (STFT), from which spatial features crucial for audio synthesis are extracted. Central to the methodology are two neural networks, RandomCNN and EfficientNetV2-S, used as feature extractors on spectrograms; both are adapted to preserve essential spectral characteristics while improving computational efficiency. A content loss, the mean squared error between feature maps, keeps the generated audio faithful to the source, while style is transferred via Gram matrices, which capture temporal correlations between feature channels and imbue the generated audio with the stylistic attributes of a reference recording. To reconstruct audio from the generated magnitude spectrograms, we employ the Fast Griffin-Lim Algorithm (FGLA), which iteratively estimates phase information from magnitude data to produce high-quality audio outputs. Experimental validation on the GuitarSet dataset across diverse musical genres, including Funk and Bossa Nova, demonstrates the efficacy of the approach: spectrographic analysis of the generated outputs confirms that content integrity is preserved and stylistic elements of the reference audio are faithfully adopted. This research investigates the capabilities of neural networks in audio synthesis, with promising applications in music production and artistic expression through nuanced style fusion and high-fidelity audio reproduction.
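The loss formulation the abstract describes — a mean-squared-error content loss in feature space plus a Gram-matrix style loss computed over features from a random, untrained convolutional layer (the RandomCNN idea) — can be sketched as follows. This is an illustrative NumPy reconstruction, not the authors' implementation; the window length, hop size, filter count, and filter width are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def stft_mag(x, n_fft=512, hop=128):
    """Magnitude STFT: frame, window, real FFT -> array of shape (freq, time)."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1)).T

def make_random_filters(n_freq, n_filters=16, width=3):
    """Untrained 2-D conv filters spanning all frequency bins (RandomCNN idea)."""
    return rng.standard_normal((n_filters, n_freq, width)) * 0.01

def features(spec, w):
    """One conv layer + ReLU over the spectrogram -> (n_filters, time')."""
    n_filters, _, width = w.shape
    t = spec.shape[1]
    out = np.array([[(w[k] * spec[:, i:i + width]).sum()
                     for i in range(t - width + 1)]
                    for k in range(n_filters)])
    return np.maximum(out, 0.0)

def content_loss(f_gen, f_content):
    """Mean squared error between feature maps (fidelity to the source)."""
    return float(np.mean((f_gen - f_content) ** 2))

def gram(f):
    """Temporal correlations between feature channels."""
    return f @ f.T / f.shape[1]

def style_loss(f_gen, f_style):
    """Mean squared error between Gram matrices (stylistic similarity)."""
    return float(np.mean((gram(f_gen) - gram(f_style)) ** 2))

# Toy usage: two short tones stand in for content and style clips.
# The same random filters must be applied to both so the losses are comparable.
t = np.arange(4096) / 22050.0
content = np.sin(2 * np.pi * 440 * t)
style = np.sin(2 * np.pi * 220 * t)
spec_c, spec_s = stft_mag(content), stft_mag(style)
w = make_random_filters(spec_c.shape[0])
fc, fs = features(spec_c, w), features(spec_s, w)
print("content loss:", content_loss(fc, fc), "style loss:", style_loss(fc, fs))
```

In the full method, a generated spectrogram would be optimized to minimize a weighted sum of these two losses, and the result passed to FGLA for phase recovery; here the losses are only evaluated, not optimized.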



Published In

HPCCT '24: Proceedings of the 2024 8th High Performance Computing and Cluster Technologies Conference
July 2024
55 pages
ISBN:9798400716881
DOI:10.1145/3705956

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. EfficientNetV2-S
  2. RandomCNN
  3. style transform

Qualifiers

  • Research-article

Conference

HPCCT 2024
