Singer Identity Representation Learning Using Self-Supervised Techniques

doi:10.5281/zenodo.10265323

Published November 4, 2023 | Version v1

Conference paper | Open access

Significant strides have been made in creating voice identity representations using speech data. However, the same level of progress has not been achieved for singing voices. To bridge this gap, we suggest a framework for training singer identity encoders to extract representations suitable for various singing-related tasks, such as singing voice similarity and synthesis. We explore different self-supervised learning techniques on a large collection of isolated vocal tracks and apply data augmentations during training to ensure that the representations are invariant to pitch and content variations. We evaluate the quality of the resulting representations on singer similarity and identification tasks across multiple datasets, with a particular emphasis on out-of-domain generalization. Our proposed framework produces high-quality embeddings that outperform both speaker verification and wav2vec 2.0 pre-trained baselines on singing voice while operating at 44.1 kHz. We release our code and trained models to facilitate further research on singing voice and related areas.

Files

000053.pdf (149.8 kB), md5:a138721adf3382c1a67d905c3b4e6d4b

Resource type: Conference paper
Publisher: ISMIR
Imprint: Proceedings of the 24th International Society for Music Information Retrieval Conference, 448-456. Milan, Italy.
Conference: International Society for Music Information Retrieval Conference (ISMIR 2023), Milan, Italy, November 5-9, 2023

Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: December 5, 2023
Modified: December 5, 2023
