Abstract
Audio sensor networks have received considerable attention in recent years, and audio scene analysis is one of their most important applications. In this paper, we present a neural-network-based framework for analyzing the audio scene in audio sensor networks. In the proposed framework, basic audio events are modeled and detected by Hidden Markov Models (HMMs) in the audio sensor nodes. The cluster head collects the sensory information from the nodes in its cluster, and a neural-network-based approach then infers the high-level semantic content of the audio context. This approach combines human knowledge with machine learning in the semantic inference: the model parameters are first learned statistically and then adjusted manually based on prior knowledge. We deploy the proposed framework on an audio sensor network and conduct a series of experiments to evaluate its performance. The experimental results show that our method works well in complex real-world situations.
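The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the event and scene labels, the averaging rule at the cluster head, the single-layer softmax network, and all function names are assumptions made for illustration. The per-node event scores stand in for the output of the HMM detectors, which are not implemented here.

```python
import math

# Hypothetical labels; the abstract does not name the events or scenes.
EVENTS = ["speech", "footsteps", "glass_break"]
SCENES = ["normal", "intrusion"]

def aggregate(node_reports):
    """Cluster head: average each event's detection score over the nodes."""
    n = len(node_reports)
    return [sum(r[e] for r in node_reports) / n for e in EVENTS]

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]

def infer(weights, bias, x):
    """Single-layer network: scene probabilities from aggregated scores."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b
         for row, b in zip(weights, bias)]
    return softmax(z)

def train(samples, epochs=200, lr=0.5):
    """Learn weights by stochastic gradient descent on cross-entropy loss."""
    weights = [[0.0] * len(EVENTS) for _ in SCENES]
    bias = [0.0] * len(SCENES)
    for _ in range(epochs):
        for x, label in samples:
            p = infer(weights, bias, x)
            for k in range(len(SCENES)):
                grad = p[k] - (1.0 if k == label else 0.0)
                for j in range(len(EVENTS)):
                    weights[k][j] -= lr * grad * x[j]
                bias[k] -= lr * grad
    return weights, bias

def apply_prior(weights, scene, event, value):
    """Return a copy of the weights with one entry set by hand (prior knowledge)."""
    adjusted = [row[:] for row in weights]
    adjusted[SCENES.index(scene)][EVENTS.index(event)] = value
    return adjusted

# Toy training data (made up): aggregated event scores and a scene index.
samples = [
    ([0.9, 0.2, 0.0], 0),
    ([0.8, 0.1, 0.0], 0),
    ([0.1, 0.9, 0.8], 1),
    ([0.0, 0.8, 0.9], 1),
]
weights, bias = train(samples)

# Scores reported by two sensor nodes (assumed to come from HMM detectors).
reports = [
    {"speech": 0.1, "footsteps": 0.7, "glass_break": 0.9},
    {"speech": 0.0, "footsteps": 0.9, "glass_break": 0.7},
]
x = aggregate(reports)

# Manual adjustment: an expert decides glass breaking strongly implies intrusion.
adjusted = apply_prior(weights, "intrusion", "glass_break", 5.0)
probs = infer(adjusted, bias, x)
print(dict(zip(SCENES, probs)))
```

The manual override is kept as a separate step that copies the learned weights, so the statistically learned parameters and the expert-adjusted ones can be compared side by side.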
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, Q., Ma, H., Zhao, D. (2009). A Neural Network Based Framework for Audio Scene Analysis in Audio Sensor Networks. In: Muneesawang, P., Wu, F., Kumazawa, I., Roeksabutr, A., Liao, M., Tang, X. (eds) Advances in Multimedia Information Processing - PCM 2009. PCM 2009. Lecture Notes in Computer Science, vol 5879. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10467-1_42
DOI: https://doi.org/10.1007/978-3-642-10467-1_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10466-4
Online ISBN: 978-3-642-10467-1
eBook Packages: Computer Science, Computer Science (R0)