Predominant Melody Extraction from Vocal Polyphonic Music Signal by Time-Domain Adaptive Filtering-Based Method

Gurunath Reddy, M.; Sreenivasa Rao, K.

doi:10.1007/s00034-017-0696-1

Published: 07 November 2017

Circuits, Systems, and Signal Processing, Volume 37, pages 2911–2933 (2018)

M. Gurunath Reddy¹ &
K. Sreenivasa Rao¹

Abstract

In this paper, a time-domain adaptive filtering-based melody extraction method is proposed. The method works in multiple stages to extract the vocal melody (the singer’s fundamental frequency) from vocal polyphonic music signals. Vocal and non-vocal regions of the music signal are identified from the strength of excitation of the source signal. The vocal regions are further segmented into a sequence of notes by detecting note onsets in the frequency representation of the composite signal. The melody contour in each vocal note segment is obtained by adaptive zero-frequency filtering in the time domain. The performance of the proposed method is compared with that of the current state-of-the-art melody extraction method in terms of voicing recall rate, voicing false alarm rate, raw pitch accuracy, and overall accuracy.
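
The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It follows the frame-level definitions commonly used in melody extraction evaluation and is not taken from the paper itself. The function name `melody_metrics`, the convention that 0 Hz marks an unvoiced frame, and the 50-cent pitch tolerance are assumptions made for illustration only.

```python
import numpy as np

def melody_metrics(ref_hz, est_hz, tol_cents=50.0):
    """Frame-level melody extraction measures (illustrative sketch).

    ref_hz, est_hz: per-frame F0 values in Hz on the same time grid.
    Assumed convention: a value of 0 marks an unvoiced (non-vocal) frame.
    tol_cents: assumed pitch tolerance; 50 cents is a common choice.
    """
    ref = np.asarray(ref_hz, dtype=float)
    est = np.asarray(est_hz, dtype=float)
    if ref.shape != est.shape or ref.size == 0:
        raise ValueError("ref_hz and est_hz must be non-empty and equally long")

    ref_voiced = ref > 0
    est_voiced = est > 0
    both_voiced = ref_voiced & est_voiced

    # Pitch deviation in cents, computed only where both tracks are voiced.
    cents = np.zeros_like(ref)
    cents[both_voiced] = 1200.0 * np.abs(np.log2(est[both_voiced] / ref[both_voiced]))
    pitch_ok = both_voiced & (cents <= tol_cents)

    n_voiced = int(ref_voiced.sum())
    n_unvoiced = int((~ref_voiced).sum())
    correct_unvoiced = int((~ref_voiced & ~est_voiced).sum())

    return {
        # Share of truly voiced frames that were detected as voiced.
        "voicing_recall": float(both_voiced.sum() / n_voiced) if n_voiced else 0.0,
        # Share of truly unvoiced frames wrongly detected as voiced.
        "voicing_false_alarm": float((est_voiced & ~ref_voiced).sum() / n_unvoiced) if n_unvoiced else 0.0,
        # Share of truly voiced frames whose pitch lies within the tolerance.
        # Simplification: a voiced frame estimated as unvoiced counts as a pitch miss.
        "raw_pitch_accuracy": float(pitch_ok.sum() / n_voiced) if n_voiced else 0.0,
        # Share of all frames with correct voicing and, if voiced, correct pitch.
        "overall_accuracy": float((pitch_ok.sum() + correct_unvoiced) / ref.size),
    }

# Toy example: five frames, with one semitone error, one missed voiced frame,
# and one false alarm.
reference = [0.0, 220.0, 220.0, 440.0, 0.0]
estimate = [0.0, 220.0, 233.08, 0.0, 100.0]
scores = melody_metrics(reference, estimate)
```

In this toy example two of the three voiced reference frames are detected as voiced, one of the two unvoiced frames triggers a false alarm, and only one voiced frame falls within the pitch tolerance. Evaluation toolkits may treat pitch estimates in frames judged unvoiced differently from this simplified version.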

Notes

http://www.mtg.upf.edu/technologies/melodia.

Acknowledgements

The present work was carried out under the project entitled “Scientific Approach to Networking and Designing of Heritage Interfaces (SANDHI)”, sponsored by the Ministry of Human Resource Development (MHRD), Govt. of India (project reference IIT/SRIC/R/ITA/2014/40, dated March 24, 2014). We would like to thank Google (Google PhD Fellowship) and the Department of Information Technology (DIT), Govt. of India, for financial support. We would also like to thank Prof. Pallab Das Gupta (Dept. of Computer Science and Engineering, IIT Kharagpur), Prof. Priyadarshi Patnaik (Dept. of Humanities, IIT Kharagpur), and Ms. Gowri (professional Hindustani music vocalist) for providing us with deeper theoretical insight into Hindustani music.

Author information

Authors and Affiliations

Indian Institute of Technology Kharagpur, Kharagpur, India
M. Gurunath Reddy & K. Sreenivasa Rao

Corresponding author

Correspondence to M. Gurunath Reddy.

About this article

Cite this article

Gurunath Reddy, M., Sreenivasa Rao, K. Predominant Melody Extraction from Vocal Polyphonic Music Signal by Time-Domain Adaptive Filtering-Based Method. Circuits Syst Signal Process 37, 2911–2933 (2018). https://doi.org/10.1007/s00034-017-0696-1

Received: 20 September 2016
Revised: 16 October 2017
Accepted: 19 October 2017
Published: 07 November 2017
Issue Date: July 2018
DOI: https://doi.org/10.1007/s00034-017-0696-1
