ABSTRACT
Automatic discrimination between speech and music signals has become an active research topic in recent years. Classifying audio into categories such as speech or music is an important component of many multimedia document retrieval systems. Several approaches have previously been used to discriminate between speech and music data. In this paper, we propose using the mean and variance of the discrete wavelet transform coefficients in addition to features previously used for audio classification. We use a Multi-Layer Perceptron (MLP) neural network as the classifier. Our initial tests show encouraging results that indicate the viability of our approach.
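The paper does not give implementation details for the wavelet features. As a minimal illustrative sketch, the mean and variance of the detail coefficients at each decomposition level could be computed as below. The Haar wavelet, the three-level depth, and the function names are assumptions for illustration, not the paper's actual configuration.

```python
import math

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform.
    Returns (approximation, detail) coefficient lists."""
    approx, detail = [], []
    for i in range(0, len(signal) - 1, 2):
        a, b = signal[i], signal[i + 1]
        approx.append((a + b) / math.sqrt(2))   # low-pass (average) part
        detail.append((a - b) / math.sqrt(2))   # high-pass (difference) part
    return approx, detail

def mean_var(xs):
    """Mean and population variance of a coefficient list."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v

def dwt_features(frame, levels=3):
    """Mean/variance of the detail coefficients at each level,
    giving a compact 2*levels feature vector per audio frame."""
    features = []
    for _ in range(levels):
        frame, detail = haar_dwt(frame)
        features.extend(mean_var(detail))
    return features
```

Such a feature vector (possibly concatenated with the other features mentioned above) would then be fed to the MLP classifier.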
Index Terms
- Automatic classification of speech and music using neural networks