Acoustic Scene Classification Using Kervolution-Based SubSpectralNet

Nandi, Ritika; Shekhar, Shashank; Mulimani, Manjunath

doi:10.21437/Interspeech.2021-656

This paper proposes a kervolution-based SubSpectralNet model for Acoustic Scene Classification (ASC). SubSpectralNet is a competitive model that divides the mel spectrogram into horizontal slices, termed sub-spectrograms, which serve as input to a Convolutional Neural Network (CNN). In this work, the linear convolution operation of SubSpectralNet is replaced with a non-linear operation based on the kernel trick, known as kervolution (kernel convolution). The proposed method is evaluated on the DCASE (Detection and Classification of Acoustic Scenes and Events) 2018 development dataset, where it achieves 73.52% and 75.76% accuracy with polynomial and Gaussian kernels, respectively.
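The two ideas in the abstract, slicing the mel spectrogram into sub-spectrograms and replacing the inner product of convolution with a kernel function, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the band size and hop, and the kernel hyperparameters (`c`, `degree`, `gamma`) are all assumptions chosen for illustration; the abstract does not state the values used in the paper. The polynomial and Gaussian forms follow the commonly used definitions of kervolution: (x·w + c)^d and exp(-gamma * ||x - w||^2).

```python
import numpy as np


def split_sub_spectrograms(mel, band_size, hop):
    """Slice a (n_mels, n_frames) mel spectrogram into horizontal bands.

    band_size and hop are hypothetical parameters; bands overlap when
    hop < band_size. Returns an array of shape (n_bands, band_size, n_frames).
    """
    n_mels = mel.shape[0]
    starts = range(0, n_mels - band_size + 1, hop)
    return np.stack([mel[s:s + band_size] for s in starts])


def extract_patches(x, kh, kw):
    """Return every valid (kh, kw) patch of a 2-D array, flattened.

    Output shape: (H - kh + 1, W - kw + 1, kh * kw).
    """
    windows = np.lib.stride_tricks.sliding_window_view(x, (kh, kw))
    return windows.reshape(windows.shape[0], windows.shape[1], kh * kw)


def kervolve2d(x, w, kernel="linear", c=1.0, degree=2, gamma=1.0):
    """Apply a single 2-D filter w to x, using a kernel in place of the dot product.

    kernel="linear" reproduces ordinary (valid, stride-1) cross-correlation,
    which is what CNN layers usually call convolution.
    """
    patches = extract_patches(x, *w.shape)
    w_flat = w.ravel()
    if kernel == "linear":
        return patches @ w_flat
    if kernel == "polynomial":
        # (x . w + c)^degree
        return (patches @ w_flat + c) ** degree
    if kernel == "gaussian":
        # exp(-gamma * ||x - w||^2)
        sq_dist = np.sum((patches - w_flat) ** 2, axis=-1)
        return np.exp(-gamma * sq_dist)
    raise ValueError(f"unknown kernel: {kernel!r}")
```

As a usage example, a 40-band mel spectrogram split with `band_size=20` and `hop=10` yields three overlapping sub-spectrograms, and each one can be passed to `kervolve2d` with `kernel="polynomial"` or `kernel="gaussian"`. In a trainable model the filter `w` would be learned, and the same substitution would be made inside each convolutional layer.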

Cite as: Nandi, R., Shekhar, S., Mulimani, M. (2021) Acoustic Scene Classification Using Kervolution-Based SubSpectralNet. Proc. Interspeech 2021, 561-565, doi: 10.21437/Interspeech.2021-656

@inproceedings{nandi21_interspeech,
  author={Ritika Nandi and Shashank Shekhar and Manjunath Mulimani},
  title={{Acoustic Scene Classification Using Kervolution-Based SubSpectralNet}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={561--565},
  doi={10.21437/Interspeech.2021-656}
}