Abstract:
In recent years, deep convolutional neural networks have become a standard for the development of state-of-the-art audio classification systems, taking the lead over traditional approaches based on feature engineering. While they are capable of achieving human performance under certain scenarios, it has been shown that their accuracy is severely degraded when the systems are tested on noisy or weakly segmented events. Although better generalization could be obtained by increasing the size of the training dataset, e.g. by applying data augmentation techniques, this also leads to longer and more complex training procedures. In this article, we propose a new type of pooling layer aimed at compensating for non-relevant information in audio events by applying an adaptive transformation of the convolutional feature maps along the temporal axis. The proposed layer performs a non-linear temporal transformation that follows a uniform distance subsampling criterion on the learned feature space. The experiments conducted over different datasets show significant performance improvements when the proposed layer is added to baseline models, resulting in systems that generalize better to mismatched test conditions and learn more robustly from weakly labeled data.
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing (Volume 28)
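To make the idea of a "uniform distance subsampling criterion on the learned feature space" more concrete, the sketch below shows one possible reading of it: instead of keeping frames spaced uniformly in time, the layer keeps frames spaced uniformly in cumulative feature-space distance, so segments where the feature map changes little are compressed and segments with more activity are preserved. This is a minimal, hypothetical PyTorch illustration; the module name, the choice of Euclidean frame distance, and the nearest-frame selection are assumptions for clarity, not the authors' reference implementation.

```python
import torch
import torch.nn as nn


class UniformDistanceTemporalPooling(nn.Module):
    """Illustrative pooling layer that resamples a convolutional feature map
    at time positions spaced uniformly in feature-space distance (an
    arc-length-style criterion) rather than uniformly in time.
    NOTE: hypothetical sketch, not the paper's exact layer."""

    def __init__(self, out_frames: int):
        super().__init__()
        self.out_frames = out_frames  # number of frames kept after pooling

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time)
        b, c, t = x.shape
        # Euclidean distance in feature space between consecutive frames.
        step = torch.linalg.norm(x[:, :, 1:] - x[:, :, :-1], dim=1)  # (b, t-1)
        # Cumulative distance along the temporal axis, starting at 0,
        # normalised to the range [0, 1].
        zeros = torch.zeros(b, 1, device=x.device, dtype=x.dtype)
        cum = torch.cat([zeros, step.cumsum(dim=1)], dim=1)  # (b, t)
        cum = cum / cum[:, -1:].clamp_min(1e-8)
        # Target positions spaced uniformly in cumulative distance.
        targets = torch.linspace(0.0, 1.0, self.out_frames, device=x.device)
        # For each target, select the frame whose cumulative distance is closest.
        idx = torch.argmin((cum.unsqueeze(2) - targets.view(1, 1, -1)).abs(), dim=1)  # (b, out_frames)
        # Gather the selected frames from the feature map.
        idx = idx.unsqueeze(1).expand(-1, c, -1)  # (b, c, out_frames)
        return torch.gather(x, dim=2, index=idx)


if __name__ == "__main__":
    layer = UniformDistanceTemporalPooling(out_frames=32)
    feats = torch.randn(4, 128, 250)  # e.g. 128-channel feature maps over 250 frames
    pooled = layer(feats)
    print(pooled.shape)  # torch.Size([4, 128, 32])
```

Under this reading, the layer is differentiable with respect to the selected frames and adds no trainable parameters; it simply re-indexes the temporal axis according to how much the learned representation actually changes.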