DOI: 10.1145/1291233.1291299
Article

Singing voice detection using perceptually-motivated features

Published: 29 September 2007

ABSTRACT

Perceptual features are motivated by how humans perceive sound. This paper studies several perceptually motivated features, such as harmonic content, vibrato, and timbre, for detecting singing voice segments in a song. The singing formant and the attack-decay envelope of the sound are also studied for acoustic feature formulation. Cepstral coefficients, which reflect timbre characteristics, are formulated by combining information from the harmonic content, vibrato, singing formant, and attack-decay envelope of the sound. Bandpass filters spaced according to the octave frequency scale are used to extract vibrato and harmonic information. Experiments are conducted on a database of 84 popular songs from commercially available CD recordings, and they show that the proposed feature formulation methods are effective.
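
The abstract does not give the filter layout or the cepstral formula, so the following is only a rough sketch of the two ideas it names: band edges laid out on an octave scale, and cepstral coefficients taken from per-band energies. The function names, the frequency range, and the use of a plain DCT-II over log energies are assumptions made for illustration and are not taken from the paper.

```python
import math


def octave_band_edges(f_low, f_high):
    """Return (lower, upper) edges in Hz for bands that each span one octave.

    Hypothetical helper: the last band is truncated at f_high if the range
    is not a whole number of octaves.
    """
    if f_low <= 0 or f_high <= f_low:
        raise ValueError("need 0 < f_low < f_high")
    edges = []
    lower = float(f_low)
    while lower < f_high:
        upper = min(lower * 2.0, float(f_high))  # one octave = doubling in frequency
        edges.append((lower, upper))
        lower = upper
    return edges


def cepstral_coefficients(band_energies, n_coeffs):
    """Unnormalized DCT-II of log band energies.

    This is a generic cepstrum over subband energies, assumed here for
    illustration; the paper's exact formulation may differ.
    """
    logs = [math.log(e) for e in band_energies]  # energies must be positive
    n = len(logs)
    return [
        sum(x * math.cos(math.pi * k * (i + 0.5) / n) for i, x in enumerate(logs))
        for k in range(n_coeffs)
    ]


if __name__ == "__main__":
    bands = octave_band_edges(100, 1600)
    print(bands)
    # Placeholder energies, one per band; real values would come from filtering audio.
    print(cepstral_coefficients([1.0, 2.0, 4.0, 8.0], 3))
```

In practice the band energies would come from passing each audio frame through bandpass filters with these edges; that filtering step is left out here because the abstract does not describe it.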

          Published in

          MM '07: Proceedings of the 15th ACM international conference on Multimedia
          September 2007
          1115 pages
          ISBN: 9781595937025
          DOI: 10.1145/1291233

          Copyright © 2007 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 995 of 4,171 submissions, 24%
