Environmental Sounds Recognition Based on Image Processing Methods

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 403)

Abstract

This article presents an approach to environmental sound recognition that uses selected methods from digital image processing and recognition. The proposed technique assumes that an audio signal can be converted into a visual representation and then processed as an image. In the first stage, the audio data are converted into rectangular matrices called feature maps. A two-step approach follows: construction of a representative database of reference samples, and identification of test samples. The database is built using two-dimensional linear discriminant analysis. Recognition is then carried out in a reduced feature space obtained by two-dimensional Karhunen–Loève projection. At the classification stage, a minimum distance classifier is applied to different features. The reported results are very encouraging and can serve as a basis for many practical audio applications.
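The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which feature maps are used, so a log-magnitude spectrogram is assumed here; the two-dimensional linear discriminant analysis stage is omitted, and only a two-sided Karhunen–Loève (PCA-style) projection is shown. All function names, the frame parameters, and the synthetic tone data are hypothetical and chosen only for illustration.

```python
import numpy as np

def make_tone(freq, rng, sr=8000, n=4096):
    """Hypothetical test signal: a noisy sine tone with random phase."""
    t = np.arange(n) / sr
    phase = rng.uniform(0, 2 * np.pi)
    return np.sin(2 * np.pi * freq * t + phase) + 0.05 * rng.standard_normal(n)

def feature_map(signal, frame_len=256, hop=128):
    """Assumed feature map: log-magnitude spectrogram (freq bins x frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(spectrum).T

def fit_2d_klt(maps, n_rows, n_cols):
    """Two-sided 2D KLT: leading eigenvectors of row and column covariances."""
    X = np.asarray(maps)                      # shape (N, M, K)
    mean = X.mean(axis=0)
    C = X - mean
    row_cov = np.einsum('nij,nkj->ik', C, C) / len(X)   # M x M
    col_cov = np.einsum('nji,njk->ik', C, C) / len(X)   # K x K
    _, row_vecs = np.linalg.eigh(row_cov)     # eigenvalues in ascending order
    _, col_vecs = np.linalg.eigh(col_cov)
    W_r = row_vecs[:, ::-1][:, :n_rows]       # keep the largest components
    W_c = col_vecs[:, ::-1][:, :n_cols]
    return mean, W_r, W_c

def project(fmap, mean, W_r, W_c):
    """Reduce an M x K feature map to an n_rows x n_cols matrix."""
    return W_r.T @ (fmap - mean) @ W_c

def classify(test_feat, ref_feats, ref_labels):
    """Minimum distance classifier (Frobenius norm) over reference samples."""
    dists = [np.linalg.norm(test_feat - r) for r in ref_feats]
    return ref_labels[int(np.argmin(dists))]

# Build a tiny reference database from synthetic tones, then identify a test sample.
rng = np.random.default_rng(0)
ref_labels = ["low", "low", "high", "high"]
ref_maps = [feature_map(make_tone(f, rng)) for f in (440, 440, 2000, 2000)]
mean, W_r, W_c = fit_2d_klt(ref_maps, n_rows=4, n_cols=4)
ref_feats = [project(m, mean, W_r, W_c) for m in ref_maps]

test_feat = project(feature_map(make_tone(2000, rng)), mean, W_r, W_c)
predicted = classify(test_feat, ref_feats, ref_labels)
```

The two-sided projection keeps each sample as a small matrix rather than flattening it into a long vector, which is the usual motivation for two-dimensional variants of PCA and LDA: the covariance matrices stay at the size of one map dimension instead of the product of both.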

Corresponding author

Correspondence to Tomasz Maka.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Maka, T., Forczmański, P. (2016). Environmental Sounds Recognition Based on Image Processing Methods. In: Burduk, R., Jackowski, K., Kurzyński, M., Woźniak, M., Żołnierek, A. (eds) Proceedings of the 9th International Conference on Computer Recognition Systems CORES 2015. Advances in Intelligent Systems and Computing, vol 403. Springer, Cham. https://doi.org/10.1007/978-3-319-26227-7_68

  • DOI: https://doi.org/10.1007/978-3-319-26227-7_68

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26225-3

  • Online ISBN: 978-3-319-26227-7

  • eBook Packages: Engineering, Engineering (R0)
