Monaural Music Source Separation using Convolutional Sparse Coding

Jao, Ping-Keng; Su, Li; Yang, Yi-Hsuan; Wohlberg, Brendt Egon

doi:10.1109/TASLP.2016.2598323

Journal Article · Thu Aug 04 00:00:00 EDT 2016 · IEEE/ACM Transactions on Audio, Speech, and Language Processing

DOI: https://doi.org/10.1109/TASLP.2016.2598323 · OSTI ID: 1495128

Jao, Ping-Keng [1]; Su, Li [1]; Yang, Yi-Hsuan [1]; Wohlberg, Brendt Egon [2]

[1] Academia Sinica, Taipei (Taiwan). Research Center for Information Technology Innovation
[2] Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

We present a comprehensive performance study of a new time-domain approach for estimating the components of an observed monaural audio mixture. Existing time-frequency approaches approximate the spectrogram of the mixture by the product of a set of spectral templates and their corresponding activation patterns. The proposed approach instead approximates the audio mixture directly in the time domain, as the sum of a set of convolutions of estimated activations with prelearned dictionary filters. The approximation problem can be solved by an efficient convolutional sparse coding algorithm. Our prior work demonstrated the effectiveness of this approach for source separation of musical audio, but only under rather restricted and controlled conditions: the musical score of the mixture had to be known a priori, and there had to be little mismatch between the dictionary filters and the source signals. In this paper, we report an evaluation that considers wider and more practical experimental settings. These include using an audio-based multipitch estimation algorithm to replace the musical score, and using an external dataset of single-note audio recordings to construct the dictionary filters. Our results show that the proposed approach remains effective with a larger dictionary and compares favorably with the state-of-the-art nonnegative matrix factorization approach. However, in the absence of the score and in the case of a small dictionary, our approach may not be better.
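The time-domain model described in the abstract can be illustrated with a small sketch. It approximates a mixture s as the sum over m of d_m * x_m (dictionary filter convolved with a sparse activation) by minimizing 0.5·||Σ d_m * x_m − s||² + λ·Σ||x_m||₁. Everything below is an assumption made for illustration: the function names (`synthesize`, `csc_ista`, `separate`), the toy windowed-sinusoid filters, and the value of λ are hypothetical, and the solver is a plain proximal-gradient (ISTA) loop swapped in for brevity, not the efficient convolutional sparse coding algorithm the paper refers to.

```python
import numpy as np

def synthesize(filters, activations):
    """Time-domain model: sum over m of the full convolution d_m * x_m."""
    return sum(np.convolve(x, d) for d, x in zip(filters, activations))

def objective(signal, filters, activations, lam):
    """0.5 * squared reconstruction error plus the l1 sparsity penalty."""
    residual = synthesize(filters, activations) - signal
    return 0.5 * residual @ residual + lam * np.abs(activations).sum()

def csc_ista(signal, filters, lam=0.05, n_iter=300):
    """ISTA stand-in solver (an assumption, not the paper's algorithm)."""
    n_filters, filt_len = filters.shape
    act_len = signal.size - filt_len + 1   # full convolution then matches the signal length
    x = np.zeros((n_filters, act_len))
    # Conservative step size: the operator norm squared is at most sum_m ||d_m||_1^2.
    step = 1.0 / np.sum(np.abs(filters).sum(axis=1) ** 2)
    for _ in range(n_iter):
        residual = synthesize(filters, x) - signal
        # The gradient w.r.t. x_m is the correlation of the residual with d_m.
        grad = np.stack([np.correlate(residual, d, mode="valid") for d in filters])
        z = x - step * grad
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x

def separate(filters, activations, labels):
    """Reconstruct each source from the filters assigned to it (hypothetical labels)."""
    return {
        int(src): synthesize(filters[labels == src], activations[labels == src])
        for src in np.unique(labels)
    }

# Toy data: two windowed sinusoids stand in for prelearned note filters.
t = np.arange(64)
filters = np.stack([
    np.sin(2 * np.pi * 0.05 * t) * np.hanning(64),
    np.sin(2 * np.pi * 0.13 * t) * np.hanning(64),
])
labels = np.array([0, 1])                  # filter 0 -> source 0, filter 1 -> source 1
true_x = np.zeros((2, 400))
true_x[0, [20, 250]] = 1.0
true_x[1, [120, 300]] = 0.8
mixture = synthesize(filters, true_x)      # monaural mixture of both sources

est_x = csc_ista(mixture, filters)
sources = separate(filters, est_x, labels)
```

Here `sources[0]` and `sources[1]` are the per-source time-domain estimates, and they sum to the model's approximation of the mixture. Grouping filters by a source label is one simple way to read source estimates off the activations; a dictionary with many filters per source would work the same way.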

Research Organization: Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

Sponsoring Organization: USDOE Laboratory Directed Research and Development (LDRD) Program

Grant/Contract Number: 89233218CNA000001

OSTI ID: 1495128

Report Number(s): LA-UR-15-27928

Journal Information: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 24, Issue 11; ISSN 2329-9290

Publisher: IEEE - ACM

Country of Publication: United States

Language: English

Citation Metrics:

Cited by: 10 works (citation information provided by Web of Science)

Cited By (1)

Synthesizing the note-specific atoms based on their fundamental frequency, used for single-channel musical source separation. Azamian, Mohammadali; Kabir, Ehsanollah. Multimedia Tools and Applications, Vol. 78, Issue 13, January 2019 (journal). https://doi.org/10.1007/s11042-018-7060-8

Similar Records

Context-Dependent Piano Music Transcription With Convolutional Sparse Coding

Journal Article · Thu Aug 04 00:00:00 EDT 2016 · IEEE/ACM Transactions on Audio, Speech, and Language Processing

Cogliati, Andrea; Duan, Zhiyao; Wohlberg, Brendt

Piano Transcription with Convolutional Sparse Lateral Inhibition

Journal Article · Wed Feb 08 00:00:00 EST 2017 · IEEE Signal Processing Letters

Cogliati, Andrea; Duan, Zhiyao; Wohlberg, Brendt Egon

Context-dependent piano music transcription with convolutional sparse coding

Patent · Tue Oct 03 00:00:00 EDT 2017

Cogliati, Andrea; Duan, Zhiyao; Wohlberg, Brendt Egon

Related Subjects

97 MATHEMATICS AND COMPUTING
Computer Science
Information Science
Mathematics
Monaural music source separation
non-negative matrix factorization
convolutional sparse coding
