Audio Classification for Radio Broadcast Indexing: Feature Normalization and Multiple Classifiers Decision

  • Conference paper
Advances in Multimedia Information Processing - PCM 2004 (PCM 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3332)

Abstract

This paper presents a system that detects the two basic components, speech and music, in radio broadcasts for indexing purposes. The approach is original in three respects: (1) a differentiated modelling based on Gaussian Mixture Models (GMMs), which allows the speech and music components to be extracted separately; (2) the normalization of commonly used features; and (3) an efficient fusion of classifiers for speech classification, which substantially improves results in the presence of strong background music: the accuracy of the indexing system rises from [69.2%, 94.2%] for the best single classifier to [90.25%, 98.56%] for the fusion. The evaluation was performed on 12 hours of radio broadcast recorded under various noise conditions and channels, and containing diverse mixtures of speech and music.
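The abstract gives no implementation details, so the following is only a minimal sketch of two of the ideas it names: differentiated modelling (one speech/non-speech detector and one music/non-music detector, so that a frame can carry both labels) and score-level fusion of several classifiers. All model parameters, the single scalar feature, the zero decision threshold and the weighted-mean fusion rule are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

def gmm_loglik(x, components):
    """Log-likelihood of scalar x under a 1-D Gaussian mixture.

    components: list of (weight, mean, variance) tuples.
    """
    terms = [
        math.log(w) - 0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        for w, mu, var in components
    ]
    m = max(terms)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(t - m) for t in terms))

def llr(x, class_gmm, nonclass_gmm):
    """Log-likelihood ratio of the class model against its non-class model."""
    return gmm_loglik(x, class_gmm) - gmm_loglik(x, nonclass_gmm)

def fuse(scores, weights=None):
    """Weighted mean of per-classifier scores (illustrative fusion rule only)."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# Hypothetical toy models, NOT trained on any data.
SPEECH = [(0.6, 1.0, 0.5), (0.4, 2.0, 1.0)]
NON_SPEECH = [(1.0, -1.0, 1.0)]
MUSIC = [(1.0, 3.0, 1.0)]
NON_MUSIC = [(1.0, 0.0, 1.0)]

def label_frame(x):
    """Run the two detectors independently; both labels may be True."""
    return {
        "speech": llr(x, SPEECH, NON_SPEECH) > 0,
        "music": llr(x, MUSIC, NON_MUSIC) > 0,
    }

if __name__ == "__main__":
    for x in (-1.0, 0.5, 2.0):
        print(x, label_frame(x))
    # Fuse three hypothetical speech scores, trusting the third classifier more.
    print(fuse([1.0, -0.5, 2.0], weights=[1, 1, 2]))
```

Because the two detectors are decided independently, a frame of speech over background music can be labelled as both, which a single three-way speech/music/other classifier could not express.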

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sénac, C., Ambikairajh, E. (2004). Audio Classification for Radio Broadcast Indexing: Feature Normalization and Multiple Classifiers Decision. In: Aizawa, K., Nakamura, Y., Satoh, S. (eds) Advances in Multimedia Information Processing - PCM 2004. PCM 2004. Lecture Notes in Computer Science, vol 3332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30542-2_109

  • DOI: https://doi.org/10.1007/978-3-540-30542-2_109

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23977-2

  • Online ISBN: 978-3-540-30542-2

  • eBook Packages: Computer Science, Computer Science (R0)
