Abstract
This paper presents a system that detects the two basic audio components, speech and music, in the context of radio broadcast indexing. The originality of the approach covers three points: a differentiated modelling based on Gaussian Mixture Models (GMMs), which extracts the speech and music components separately; the normalization of commonly used features; and an efficient fusion of classifiers for speech classification, which yields a substantial improvement in the presence of strong background music: the accuracy of the indexing system rises from [69.2%, 94.2%] for the best single classifier to [90.25%, 98.56%] for the fusion. Evaluation was performed on 12 hours of radio broadcast recorded under various noise conditions and channels, containing diverse speech and music mixtures.
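The differentiated modelling described above can be sketched as follows: one GMM is trained per component (speech, music) on normalized features, and each frame is assigned to the model with the higher log-likelihood. This is a minimal illustration, not the paper's implementation: the synthetic features, the number of mixture components, and the plain mean/variance normalization (a simple stand-in for the feature warping the paper uses) are all assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-ins for frame-level acoustic features (hypothetical data;
# the paper would use real features such as cepstral coefficients).
speech_train = rng.normal(0.0, 1.0, size=(500, 4))
music_train = rng.normal(3.0, 1.0, size=(500, 4))

# Plain mean/variance normalization -- a simplified stand-in for the
# feature normalization step described in the paper.
all_train = np.vstack([speech_train, music_train])
mean, std = all_train.mean(axis=0), all_train.std(axis=0)

def normalize(frames):
    return (frames - mean) / std

# Differentiated modelling: one GMM per component, trained separately.
speech_gmm = GaussianMixture(n_components=4, random_state=0).fit(normalize(speech_train))
music_gmm = GaussianMixture(n_components=4, random_state=0).fit(normalize(music_train))

def classify(frames):
    z = normalize(frames)
    # Frame-level log-likelihood under each model; pick the higher one.
    speech_ll = speech_gmm.score_samples(z)
    music_ll = music_gmm.score_samples(z)
    return np.where(speech_ll > music_ll, "speech", "music")

test_frames = rng.normal(0.0, 1.0, size=(20, 4))  # drawn from the "speech" distribution
labels = classify(test_frames)
```

In the paper's fusion stage, several such classifiers' decisions would be combined (e.g. by weighted voting) rather than relying on a single likelihood comparison; the sketch above shows only the per-component GMM scoring.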
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Sénac, C., Ambikairajh, E. (2004). Audio Classification for Radio Broadcast Indexing: Feature Normalization and Multiple Classifiers Decision. In: Aizawa, K., Nakamura, Y., Satoh, S. (eds) Advances in Multimedia Information Processing - PCM 2004. PCM 2004. Lecture Notes in Computer Science, vol 3332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30542-2_109
Print ISBN: 978-3-540-23977-2
Online ISBN: 978-3-540-30542-2