DOI: 10.1145/1032604.1032620
Article

Automatic classification of speech and music using neural networks

Published: 13 November 2004

ABSTRACT

Automatic discrimination between speech and music signals has grown in importance as a research topic in recent years. Classifying audio into categories such as speech or music is an important component of many multimedia document retrieval systems. Several approaches have previously been used to discriminate between speech and music data. In this paper, we propose using the mean and variance of the discrete wavelet transform in addition to other features that have previously been used for audio classification. We use a Multi-Layer Perceptron (MLP) neural network as the classifier. Our initial tests show encouraging results that indicate the viability of our approach.
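
As a rough illustration of the kind of feature the abstract describes, the sketch below computes the mean and variance of discrete wavelet transform coefficients for one audio frame and passes a feature vector through a one-hidden-layer MLP. It is not the authors' implementation: the abstract does not state the wavelet family, the decomposition depth, the frame length, or the network architecture, so the Haar wavelet, the default `levels=3`, the use of raw (signed) coefficients, and the function names `haar_dwt_step`, `dwt_mean_var_features`, and `mlp_forward` are all assumptions made for this example.

```python
import math
from statistics import fmean, pvariance


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT.

    Returns (approximation, detail) coefficient lists, each half the
    length of the input. The Haar wavelet is an assumption; the
    abstract does not name a wavelet family.
    """
    if len(signal) < 2 or len(signal) % 2 != 0:
        raise ValueError("signal length must be even and at least 2")
    scale = math.sqrt(2.0)
    approx, detail = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        approx.append((a + b) / scale)  # low-pass (local average)
        detail.append((a - b) / scale)  # high-pass (local difference)
    return approx, detail


def dwt_mean_var_features(frame, levels=3):
    """Mean and variance of the DWT coefficients of one frame.

    Output layout (an assumption): [mean, variance] of the detail
    coefficients at each level, followed by [mean, variance] of the
    final approximation coefficients.
    """
    if len(frame) % (2 ** levels) != 0 or len(frame) == 0:
        raise ValueError("frame length must be a positive multiple of 2**levels")
    features = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        features += [fmean(detail), pvariance(detail)]
    features += [fmean(approx), pvariance(approx)]
    return features


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer MLP with a sigmoid output.

    The output can be read as a score in (0, 1) for one of the two
    classes. The weights here are hypothetical; training is not shown.
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    z = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Synthetic frame used only to exercise the functions.
    frame = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
    feats = dwt_mean_var_features(frame, levels=3)
    # Untrained, all-zero weights: placeholder values, not a real classifier.
    n_in, n_hidden = len(feats), 4
    score = mlp_forward(
        feats,
        [[0.0] * n_in for _ in range(n_hidden)],
        [0.0] * n_hidden,
        [0.0] * n_hidden,
        0.0,
    )
    print(feats, score)
```

In practice these wavelet statistics would be concatenated with the other audio features the abstract mentions before being fed to the classifier, and the MLP weights would be learned from labeled speech and music frames.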

Published in

MMDB '04: Proceedings of the 2nd ACM international workshop on Multimedia databases
November 2004
118 pages
ISBN: 1581139756
DOI: 10.1145/1032604

Copyright © 2004 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States
