A Neural Network Based Framework for Audio Scene Analysis in Audio Sensor Networks

Li, Qi; Ma, Huadong; Zhao, Dong

doi:10.1007/978-3-642-10467-1_42

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5879)

Included in the following conference series:

Pacific-Rim Conference on Multimedia

Abstract

Audio sensor networks have attracted considerable attention in recent years, and audio scene analysis is one of their most important applications. In this paper, we present a neural network based framework for analyzing audio scenes in audio sensor networks. In the proposed framework, basic audio events are modeled and detected by Hidden Markov Models (HMMs) on the audio sensor nodes. The cluster head collects the sensory information from its cluster, and a neural network based approach then infers the high-level semantic content of the audio context. This approach combines human knowledge and machine learning in the semantic inference: the model parameters are first learned statistically and then modified manually based on prior knowledge. We deploy the proposed framework on an audio sensor network and conduct a series of experiments to evaluate its performance. The experimental results show that our method works well in complex real-world situations.
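
The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the HMM topology, or the network architecture, so the sketch assumes discrete-observation HMMs scored with the forward algorithm on each node and a single layer of sigmoid units at the cluster head. All names (`forward_log_likelihood`, `detect_event`, `scene_scores`), event and scene labels, and numeric parameters are hypothetical.

```python
import math
from collections import Counter


def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for a non-empty symbol sequence."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate state probabilities, then weight by the emission of obs[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_like += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_like


def detect_event(obs, models):
    """Sensor node: pick the event whose HMM best explains the observations."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


def event_fractions(reports):
    """Cluster head: turn the collected event labels into relative frequencies."""
    counts = Counter(reports)
    return {event: c / len(reports) for event, c in counts.items()}


def scene_scores(fractions, weights, biases):
    """Cluster head: one sigmoid unit per scene over the event frequencies."""
    scores = {}
    for scene, scene_weights in weights.items():
        z = biases[scene] + sum(
            w * fractions.get(event, 0.0) for event, w in scene_weights.items()
        )
        scores[scene] = 1.0 / (1.0 + math.exp(-z))
    return scores


# Toy event models (assumed values): (start, transition, emission) per event,
# over a two-symbol quantized feature alphabet {0, 1}.
MODELS = {
    "speech": ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.1, 0.9], [0.2, 0.8]]),
    "glass_break": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.9, 0.1], [0.8, 0.2]]),
}

# Weights standing in for statistically learned parameters (assumed values).
WEIGHTS = {
    "meeting": {"speech": 4.0, "glass_break": -2.0},
    "intrusion": {"speech": -1.0, "glass_break": 1.0},
}
BIASES = {"meeting": -2.0, "intrusion": -1.0}

if __name__ == "__main__":
    node_observations = [[1, 1, 1, 0, 1], [1, 1, 0, 1, 1], [0, 0, 0, 0, 1], [1, 1, 1, 1, 1]]
    reports = [detect_event(obs, MODELS) for obs in node_observations]
    fractions = event_fractions(reports)
    print("learned :", scene_scores(fractions, WEIGHTS, BIASES))

    # Manual modification from prior knowledge: breaking glass is strong
    # evidence for an intrusion, so that single weight is raised by hand.
    WEIGHTS["intrusion"]["glass_break"] = 6.0
    print("adjusted:", scene_scores(fractions, WEIGHTS, BIASES))
```

Keeping one weight per (scene, event) pair makes the manual step straightforward: each weight has a direct reading as "how strongly this event supports this scene", so a domain expert can override a single entry without retraining.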

Author information

Authors and Affiliations

Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing, 100876, China
Qi Li, Huadong Ma & Dong Zhao

Editor information

Editors and Affiliations

Naresuan University, 65000, Phitsanulok, Thailand
Paisarn Muneesawang
Microsoft Research Asia, 100109, Beijing, China
Feng Wu
Tokyo Institute of Technology, 226-8503, Yokohama, Japan
Itsuo Kumazawa
Mahanakorn University of Technology, 10530, Bangkok, Thailand
Athikom Roeksabutr
Institute of Information Science, Academia Sinica, Taipei, Taiwan
Mark Liao
Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Xiaoou Tang

About this paper

Cite this paper

Li, Q., Ma, H., Zhao, D. (2009). A Neural Network Based Framework for Audio Scene Analysis in Audio Sensor Networks. In: Muneesawang, P., Wu, F., Kumazawa, I., Roeksabutr, A., Liao, M., Tang, X. (eds) Advances in Multimedia Information Processing - PCM 2009. PCM 2009. Lecture Notes in Computer Science, vol 5879. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10467-1_42

DOI: https://doi.org/10.1007/978-3-642-10467-1_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10466-4
Online ISBN: 978-3-642-10467-1
eBook Packages: Computer Science, Computer Science (R0)
