Predominant Melody Extraction from Vocal Polyphonic Music Signal by Time-Domain Adaptive Filtering-Based Method

Circuits, Systems, and Signal Processing

Abstract

This paper proposes a time-domain adaptive filtering-based method for melody extraction. The method works in multiple stages to extract the vocal melody (the singer’s fundamental frequency) from vocal polyphonic music signals. First, the vocal and non-vocal regions of the music signal are identified from the strength of excitation of the source signal. Next, the vocal regions are segmented into a sequence of notes by detecting note onsets in the frequency representation of the composite signal. Finally, the melody contour in each vocal note segment is obtained by adaptive zero-frequency filtering in the time domain. The performance of the proposed method is compared with that of the current state-of-the-art melody extraction method in terms of voicing recall rate, voicing false alarm rate, raw pitch accuracy, and overall accuracy.
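The abstract names zero-frequency filtering but does not give its equations. As a rough illustration only, the sketch below follows the general zero-frequency filtering idea from the epoch-extraction work of Murty and Yegnanarayana (2008): difference the signal, pass it through a cascade of two ideal 0 Hz resonators, remove the resulting trend by local mean subtraction, and read pitch periods from the spacing of positive-going zero crossings. It is not the authors' adaptive variant. The function names, the `expected_f0_hz` parameter, the fixed window of about 1.5 pitch periods, and the synthetic impulse-train input are all assumptions made for this example.

```python
from itertools import accumulate
from statistics import median


def zero_frequency_filter(signal, fs, expected_f0_hz, passes=3):
    """Zero-frequency filter a signal; returns (filtered, sample_offset)."""
    # Step 1: the first difference removes any DC offset in the recording.
    x = [signal[0]] + [signal[n] - signal[n - 1] for n in range(1, len(signal))]

    # Step 2: a cascade of two ideal 0 Hz resonators, each with a double
    # pole at z = 1, is equivalent to four successive running sums.
    y = x
    for _ in range(4):
        y = list(accumulate(y))

    # Step 3: the running sums grow polynomially, so subtract a local mean
    # over about 1.5 expected pitch periods, repeated a few times.
    # ASSUMPTION: the window is fixed from expected_f0_hz (hypothetical
    # parameter); the paper's adaptation rule is not stated in the abstract.
    half = max(1, round(0.75 * fs / expected_f0_hz))
    width = 2 * half + 1
    for _ in range(passes):
        # Only samples with a full window are kept, so each pass trims
        # `half` samples from both ends of the signal.
        y = [y[n] - sum(y[n - half:n + half + 1]) / width
             for n in range(half, len(y) - half)]
    return y, passes * half


def epochs_to_f0(filtered, offset, fs):
    """Find negative-to-positive zero crossings and convert their spacing to Hz."""
    epochs = [n + offset for n in range(1, len(filtered))
              if filtered[n - 1] < 0 <= filtered[n]]
    f0 = [fs / (b - a) for a, b in zip(epochs, epochs[1:])]
    return epochs, f0


if __name__ == "__main__":
    fs = 8000
    # Synthetic excitation: an impulse train with a 40-sample period (200 Hz).
    test_signal = [1 if n % 40 == 0 else 0 for n in range(4000)]
    filtered, offset = zero_frequency_filter(test_signal, fs, expected_f0_hz=200)
    epochs, f0 = epochs_to_f0(filtered, offset, fs)
    print(f"{len(epochs)} epochs, median F0 = {median(f0):.1f} Hz")
```

On this synthetic input the median estimate should come out near 200 Hz. A real vocal segment would need the window length chosen per note segment, which is presumably where the adaptive part of the proposed method comes in.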

Notes

  1. http://www.mtg.upf.edu/technologies/melodia.

Acknowledgements

The present work was carried out under the project entitled “Scientific Approach to Networking and Designing of Heritage Interfaces (SANDHI)”, sponsored by the Ministry of Human Resource Development (MHRD), Govt. of India (project reference IIT/SRIC/R/ITA/2014/40, dated March 24, 2014). We would like to thank Google (Google PhD Fellowship) and the Department of Information Technology (DIT), Govt. of India, for financial support. We would also like to thank Prof. Pallab Das Gupta (Dept. of Computer Science and Engineering, IIT Kharagpur), Prof. Priyadarshi Patnaik (Dept. of Humanities, IIT Kharagpur), and Ms. Gowri (professional Hindustani music vocalist) for providing us with further theoretical insight into Hindustani music.

Author information

Corresponding author

Correspondence to M. Gurunath Reddy.

About this article

Cite this article

Gurunath Reddy, M., Sreenivasa Rao, K. Predominant Melody Extraction from Vocal Polyphonic Music Signal by Time-Domain Adaptive Filtering-Based Method. Circuits Syst Signal Process 37, 2911–2933 (2018). https://doi.org/10.1007/s00034-017-0696-1

  • DOI: https://doi.org/10.1007/s00034-017-0696-1
