
Device Robust Acoustic Scene Classification Using Adaptive Noise Reduction and Convolutional Recurrent Attention Neural Network

  • Conference paper
Speech and Computer (SPECOM 2022)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 13721))


Abstract

Acoustic Scene Classification (ASC) is the task of identifying a scene from sound cues and assigning it a label. Over the past two years, the datasets released for ASC have included audio samples recorded with multiple devices, bringing the problem closer to real-world scenarios. We therefore aim to develop a device-robust ASC model for audio samples recorded with three different devices. The dataset considered is DCASE 2019 ASC task 1a, which contains recordings from the primary recording device (Device A) and two mobile devices (Devices B and C). This work introduces the Adaptive Noise Reduction (ANR) technique to reduce the device distortion present in the audio samples from Devices B and C. Spectrograms are extracted from all audio samples and normalized to remove biased values in the input signal. The normalized features are fed to a lightweight Convolutional Recurrent Attention Neural Network (CRANN) to perform ASC. The key contributions of this work are the reduction of device distortion in mismatched devices and the introduction of an attention layer in the Convolutional Recurrent Neural Network. The proposed method yields a considerable improvement in accuracy for mismatched-device ASC.
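The abstract gives no implementation details, so the following is only a minimal sketch of two of the steps it names: spectrogram normalization and attention over time frames. It assumes per-frequency-bin zero-mean, unit-variance normalization and a simple dot-product scoring vector `w`. Both are illustrative assumptions rather than the authors' stated method, and the function names `normalize_per_bin` and `attention_pool` are hypothetical.

```python
import math


def normalize_per_bin(spec):
    """Zero-mean, unit-variance normalization of each frequency bin.

    spec: list of frames (time x frequency), each frame a list of floats.
    Assumption: per-bin statistics; the paper may normalize differently.
    """
    n_frames, n_bins = len(spec), len(spec[0])
    out = [[0.0] * n_bins for _ in range(n_frames)]
    for b in range(n_bins):
        col = [frame[b] for frame in spec]
        mean = sum(col) / n_frames
        var = sum((v - mean) ** 2 for v in col) / n_frames
        std = math.sqrt(var) or 1.0  # constant bins have std 0; avoid dividing by it
        for t in range(n_frames):
            out[t][b] = (spec[t][b] - mean) / std
    return out


def attention_pool(frames, w):
    """Softmax-weighted average of frames.

    w is a hypothetical scoring vector (learned in a real model); the score
    of a frame is its dot product with w, not the paper's attention layer.
    """
    scores = [sum(x * wi for x, wi in zip(f, w)) for f in frames]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    n_bins = len(frames[0])
    context = [sum(a * f[b] for a, f in zip(alphas, frames)) for b in range(n_bins)]
    return context, alphas


# Toy 3-frame, 2-bin "spectrogram"; the second bin is constant.
spec = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
norm = normalize_per_bin(spec)
context, alphas = attention_pool(norm, w=[1.0, 0.0])
```

In a convolutional recurrent model, pooling of this kind would typically be applied to the recurrent layer's per-frame outputs rather than to the raw spectrogram, so that frames carrying more scene-relevant information receive larger weights before classification.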



Author information


Corresponding author

Correspondence to Spoorthy Venkatesh.


Copyright information

© 2022 Springer Nature Switzerland AG

About this paper


Cite this paper

Venkatesh, S., Koolagudi, S.G. (2022). Device Robust Acoustic Scene Classification Using Adaptive Noise Reduction and Convolutional Recurrent Attention Neural Network. In: Prasanna, S.R.M., Karpov, A., Samudravijaya, K., Agrawal, S.S. (eds) Speech and Computer. SPECOM 2022. Lecture Notes in Computer Science, vol 13721. Springer, Cham. https://doi.org/10.1007/978-3-031-20980-2_58


  • DOI: https://doi.org/10.1007/978-3-031-20980-2_58


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20979-6

  • Online ISBN: 978-3-031-20980-2

  • eBook Packages: Computer Science, Computer Science (R0)
