Abstract
This paper considers the task of recognizing the category of the context surrounding an audio sensor. Because auditory contexts and their constituent environmental sounds are unstructured and diverse, unlike structured audio such as speech or music, recognizing an auditory context is difficult, and relatively few studies have addressed it. We propose an ensemble recognition scheme for unstructured auditory contexts, based on the Hough forest framework, that combines discriminative and generative modeling of the context. We learn an effective audio feature representation for the environmental sounds in a context with the local discriminant bases (LDB) algorithm, and recognize the context with a Hough forest based ensemble classifier, which aggregates the segmental and contextual probabilistic votes that the segments of the auditory context cast for the context category. Experimental results demonstrate the effectiveness of the proposed approach for auditory context recognition.
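The vote aggregation described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the segmental and contextual votes are combined, so the weighted average below, the mixing weight `alpha`, and the function name `aggregate_votes` are assumptions made for illustration only, not the authors' method. Each segment is assumed to supply two dictionaries that map a context label to a probability.

```python
from collections import defaultdict


def aggregate_votes(segment_votes, contextual_votes, alpha=0.5):
    """Combine per-segment votes into one context decision.

    segment_votes and contextual_votes are lists of dicts (one dict per
    audio segment) mapping a context label to a probability.
    alpha is a hypothetical mixing weight, not a value from the paper.
    """
    if not segment_votes or len(segment_votes) != len(contextual_votes):
        raise ValueError("need one segmental and one contextual vote per segment")

    totals = defaultdict(float)
    for seg, ctx in zip(segment_votes, contextual_votes):
        # A label missing from a vote contributes probability 0.
        for label in set(seg) | set(ctx):
            totals[label] += (alpha * seg.get(label, 0.0)
                              + (1.0 - alpha) * ctx.get(label, 0.0))

    # Average over segments so scores stay comparable across clip lengths.
    n = len(segment_votes)
    scores = {label: total / n for label, total in totals.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Example with two segments and two made-up context labels.
segmental = [{"street": 0.7, "cafe": 0.3}, {"street": 0.4, "cafe": 0.6}]
contextual = [{"street": 0.6, "cafe": 0.4}, {"street": 0.8, "cafe": 0.2}]
label, scores = aggregate_votes(segmental, contextual)
print(label, scores)
```

Averaging rather than summing is a design choice of this sketch: it keeps the scores in the range of probabilities regardless of how many segments a recording contains.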
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Su, F., Yang, L. (2013). Auditory Context Recognition Combining Discriminative and Generative Models. In: Huet, B., Ngo, CW., Tang, J., Zhou, ZH., Hauptmann, A.G., Yan, S. (eds) Advances in Multimedia Information Processing – PCM 2013. PCM 2013. Lecture Notes in Computer Science, vol 8294. Springer, Cham. https://doi.org/10.1007/978-3-319-03731-8_56
DOI: https://doi.org/10.1007/978-3-319-03731-8_56
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03730-1
Online ISBN: 978-3-319-03731-8
eBook Packages: Computer Science (R0)