Abstract
In this paper, we propose a system that categorizes audio into seven classes. As classification features, we use the mean and variance of the root mean square (RMS) energy, zero-crossing rate (ZCR), fundamental frequency, and frequency peak, each extracted from every 25 ms frame. In addition to audio content classification, we perform speaker identification on voice sequences that are extracted automatically by our proposed method. The proposed scheme reaches an accuracy of 93.8% in categorizing audio signals and 80% in speaker identification.
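The feature extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the non-overlapping framing, the Hann window, the sign-change definition of ZCR, and all function names (`frame_signal`, `rms`, `zcr`, `frequency_peak`, `clip_features`) are assumptions made for illustration. Fundamental frequency is left out because the abstract does not say which pitch estimator is used.

```python
import numpy as np

def frame_signal(x, sr, frame_ms=25.0):
    """Split a 1-D signal into non-overlapping frames of frame_ms milliseconds.
    Non-overlapping framing is an assumption; the abstract only gives the length."""
    n = int(round(sr * frame_ms / 1000.0))
    count = len(x) // n
    return x[:count * n].reshape(count, n)

def rms(frames):
    """Root mean square energy of each frame."""
    return np.sqrt(np.mean(frames ** 2, axis=1))

def zcr(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def frequency_peak(frames, sr):
    """Frequency in Hz of the largest magnitude-spectrum bin, ignoring DC."""
    window = np.hanning(frames.shape[1])
    spectrum = np.abs(np.fft.rfft(frames * window, axis=1))
    spectrum[:, 0] = 0.0
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    return freqs[np.argmax(spectrum, axis=1)]

def clip_features(x, sr):
    """Mean and variance of each per-frame feature, ordered as
    [rms_mean, rms_var, zcr_mean, zcr_var, peak_mean, peak_var]."""
    frames = frame_signal(np.asarray(x, dtype=float), sr)
    per_frame = [rms(frames), zcr(frames), frequency_peak(frames, sr)]
    return np.array([stat for f in per_frame for stat in (f.mean(), f.var())])
```

Calling `clip_features(signal, sample_rate)` on a mono clip yields a six-element vector that could be passed to any classifier; the abstract does not state which classifier the paper uses.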
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kang, CM., Baek, JH. (2006). Audio Content Analysis for Understanding Structures of Scene in Video. In: Huang, DS., Li, K., Irwin, G.W. (eds) Intelligent Computing. ICIC 2006. Lecture Notes in Computer Science, vol 4113. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11816157_151
DOI: https://doi.org/10.1007/11816157_151
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37271-4
Online ISBN: 978-3-540-37273-8
eBook Packages: Computer Science, Computer Science (R0)