Environmental Sounds Recognition Based on Image Processing Methods

Maka, Tomasz; Forczmański, Paweł

doi:10.1007/978-3-319-26227-7_68

Conference paper
First Online: 05 March 2016

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 403)

Abstract

The article presents an approach to environmental sound recognition that uses selected methods from digital image processing and recognition. The proposed technique assumes that an audio signal can be converted into a visual representation and then processed as an image. In the first stage, the audio data are converted into rectangular matrices called feature maps. A two-step approach follows: construction of a representative database of reference samples, then identification of test samples. The database is built using two-dimensional linear discriminant analysis. Recognition is then carried out in a reduced feature space obtained by two-dimensional Karhunen–Loève projection. At the classification stage, a minimum distance classifier is applied to different features. As shown, the results are very encouraging and can serve as a basis for many practical audio applications.
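The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made for illustration only: the feature map is taken to be a log-magnitude spectrogram, the reduction step uses only a two-dimensional Karhunen–Loève (2D PCA style) projection with the 2DLDA stage omitted, and the names `feature_map`, `fit_2d_klt`, `project`, `classify` and `tone`, along with all parameter values, are hypothetical.

```python
import numpy as np

def feature_map(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram (frequency x time).
    Assumed stand-in for the paper's feature maps."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1))).T

def fit_2d_klt(maps, rows_kept, cols_kept):
    """Two-dimensional KLT: separate eigenbases for rows and columns."""
    stack = np.stack(maps)                       # (n, h, w)
    mean = stack.mean(axis=0)
    centered = stack - mean
    # Row covariance: average of X X^T; column covariance: average of X^T X.
    row_cov = np.einsum('nij,nkj->ik', centered, centered) / len(maps)
    col_cov = np.einsum('nji,njk->ik', centered, centered) / len(maps)
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    row_vecs = np.linalg.eigh(row_cov)[1][:, ::-1][:, :rows_kept]
    col_vecs = np.linalg.eigh(col_cov)[1][:, ::-1][:, :cols_kept]
    return mean, row_vecs, col_vecs

def project(fmap, mean, row_vecs, col_vecs):
    """Map a feature map into the reduced (rows_kept x cols_kept) space."""
    return row_vecs.T @ (fmap - mean) @ col_vecs

def classify(fmap, model, references):
    """Minimum distance classifier: label of the nearest reference."""
    reduced = project(fmap, *model)
    distances = [np.linalg.norm(reduced - ref) for ref, _ in references]
    return references[int(np.argmin(distances))][1]

# Synthetic demo data (hypothetical): noisy low and high tones at 8 kHz.
rng = np.random.default_rng(0)

def tone(freq, sample_rate=8000, n_samples=4096):
    t = np.arange(n_samples) / sample_rate
    return np.sin(2 * np.pi * freq * t) + 0.1 * rng.normal(size=n_samples)

train = [(feature_map(tone(500.0)), "low") for _ in range(3)] + \
        [(feature_map(tone(3000.0)), "high") for _ in range(3)]
model = fit_2d_klt([m for m, _ in train], rows_kept=4, cols_kept=4)
references = [(project(m, *model), label) for m, label in train]

print(classify(feature_map(tone(500.0)), model, references))
```

In this sketch each reference sample is stored as a small reduced matrix, and a test sample receives the label of the reference at the smallest Frobenius distance. A supervised stage such as 2DLDA would additionally use the class labels when building the projection, which this example does not attempt.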

Author information

Authors and Affiliations

Faculty of Computer Science and Information Technology, West Pomeranian University of Technology, Szczecin, Żołnierska Str. 52, 71–210, Szczecin, Poland
Tomasz Maka & Paweł Forczmański

Corresponding author

Correspondence to Tomasz Maka.

Editor information

Editors and Affiliations

Department of Systems, Wrocław University of Technology, Wroclaw, Poland
Robert Burduk
Department of Systems and Computer, Wrocław University of Technology, Wroclaw, Poland
Konrad Jackowski
Department of Systems and Computer, Wrocław University of Technology, Wroclaw, Poland
Marek Kurzyński
Dept. of Systems and Computer Networks, Wrocław University of Technology, Wroclaw, Poland
Michał Woźniak
Department of Systems, Wrocław University of Technology, Wroclaw, Poland
Andrzej Żołnierek

About this paper

Cite this paper

Maka, T., Forczmański, P. (2016). Environmental Sounds Recognition Based on Image Processing Methods. In: Burduk, R., Jackowski, K., Kurzyński, M., Woźniak, M., Żołnierek, A. (eds) Proceedings of the 9th International Conference on Computer Recognition Systems CORES 2015. Advances in Intelligent Systems and Computing, vol 403. Springer, Cham. https://doi.org/10.1007/978-3-319-26227-7_68

DOI: https://doi.org/10.1007/978-3-319-26227-7_68
Published: 05 March 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26225-3
Online ISBN: 978-3-319-26227-7
eBook Packages: Engineering, Engineering (R0)
