This paper proposes a kervolution-based SubSpectralNet model for Acoustic Scene Classification (ASC). SubSpectralNet is a competitive model that divides the mel spectrogram into horizontal slices, called sub-spectrograms, which are used as input to a Convolutional Neural Network (CNN). In this work, the linear convolution operation of SubSpectralNet is replaced with a non-linear operation using the kernel trick; the resulting model is referred to as kervolution (kernel convolution)-based SubSpectralNet. The performance of the proposed method is evaluated on the DCASE (Detection and Classification of Acoustic Scenes and Events) 2018 development dataset. The proposed method achieves 73.52% and 75.76% accuracy with polynomial and Gaussian kernels, respectively.
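The two ideas in the abstract, slicing the mel spectrogram into sub-spectrograms and replacing the linear convolution with a kernel function, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`split_sub_spectrograms`, `kervolution2d`), the band size and hop, and the kernel hyperparameters `c`, `degree`, and `gamma` are assumptions chosen for illustration, and the polynomial and Gaussian forms used here are the standard textbook definitions of those kernels.

```python
import numpy as np


def split_sub_spectrograms(mel, band_size, hop):
    """Slice a (n_mels, n_frames) mel spectrogram into horizontal bands.

    band_size and hop are illustrative parameters (assumed, not from the paper).
    """
    n_mels = mel.shape[0]
    starts = range(0, n_mels - band_size + 1, hop)
    return [mel[s:s + band_size, :] for s in starts]


def extract_patches(x, kh, kw):
    """Return every kh-by-kw patch of a 2-D array as a flat vector.

    Uses stride 1 and no padding; output shape is (H-kh+1, W-kw+1, kh*kw).
    """
    windows = np.lib.stride_tricks.sliding_window_view(x, (kh, kw))
    return windows.reshape(windows.shape[0], windows.shape[1], kh * kw)


def kervolution2d(x, w, kernel="polynomial", c=1.0, degree=2, gamma=1.0):
    """Apply one filter w to a 2-D input x using a kernel function.

    "linear" reproduces ordinary (cross-correlation style) convolution;
    "polynomial" and "gaussian" replace the inner product with a kernel.
    """
    kh, kw = w.shape
    patches = extract_patches(x, kh, kw)
    w_flat = w.reshape(-1)
    if kernel == "linear":
        return patches @ w_flat
    if kernel == "polynomial":
        # (x^T w + c)^degree
        return (patches @ w_flat + c) ** degree
    if kernel == "gaussian":
        # exp(-gamma * ||x - w||^2)
        sq_dist = np.sum((patches - w_flat) ** 2, axis=-1)
        return np.exp(-gamma * sq_dist)
    raise ValueError(f"unknown kernel: {kernel}")


# Toy input: 40 mel bins by 10 frames (sizes assumed for illustration).
mel = np.arange(40 * 10, dtype=float).reshape(40, 10)
bands = split_sub_spectrograms(mel, band_size=20, hop=10)
w = np.ones((3, 3))
responses = [kervolution2d(b, w, kernel="polynomial") for b in bands]
```

With `kernel="linear"` the function reduces to an ordinary sliding inner product, which makes it easy to see that the kernel variants change only how each patch is compared with the filter, not how patches are extracted.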
Cite as: Nandi, R., Shekhar, S., Mulimani, M. (2021) Acoustic Scene Classification Using Kervolution-Based SubSpectralNet. Proc. Interspeech 2021, 561-565, doi: 10.21437/Interspeech.2021-656
@inproceedings{nandi21_interspeech,
  author={Ritika Nandi and Shashank Shekhar and Manjunath Mulimani},
  title={{Acoustic Scene Classification Using Kervolution-Based SubSpectralNet}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={561--565},
  doi={10.21437/Interspeech.2021-656}
}