Pyramidal Temporal Pooling With Discriminative Mapping for Audio Classification


Abstract:

Audio signals are temporally structured data, and learning discriminative representations that retain temporal information is crucial for audio classification. In this article, we propose an audio representation learning method with a hierarchical pyramid structure, called pyramidal temporal pooling (PTP), which aims to capture the temporal information of an entire audio sample. By stacking a global temporal pooling layer on multiple local temporal pooling layers, PTP can capture the high-level temporal dynamics of the input feature sequence in an unsupervised way. Furthermore, in the top global temporal pooling layer, we jointly optimize a learnable discriminative mapping (DM) and a softmax classifier, which yields a joint learning method for the discriminative audio representations and the classifier, called DM-PTP. By treating the temporal encoding as the lower-level constraint of a bi-level optimization problem, DM-PTP can produce a discriminative representation while maintaining the temporal information of the whole sequence. For an audio sample of arbitrary duration, both PTP and DM-PTP encode the input feature sequence, whatever its length, into a fixed-length representation. Without using any data augmentation or ensemble learning methods, both PTP and DM-PTP outperform the state-of-the-art CNNs on the audio event recognition (AER) dataset, and achieve performance comparable to the best models in the challenge on the DCASE 2018 acoustic scene classification (ASC) dataset.
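
The pyramid structure described in the abstract (local temporal pooling layers topped by one global temporal pooling layer, mapping a sequence of any length to a fixed-length vector) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the pooling operator, so `temporal_pool` below is a stand-in that uses the per-dimension least-squares slope over time, and the function names, window sizes, and strides are all hypothetical choices made for illustration. The discriminative mapping and softmax classifier of DM-PTP are not modeled here.

```python
import numpy as np


def temporal_pool(frames):
    """Encode a (T, D) sequence as one D-vector.

    ASSUMPTION: the least-squares slope of each feature dimension
    against the time index is used as a simple order-aware encoder.
    It is a placeholder, not the operator used in the paper.
    """
    frames = np.asarray(frames, dtype=float)
    n_frames = frames.shape[0]
    if n_frames == 1:
        # A slope is undefined for a single frame; fall back to the frame.
        return frames[0].copy()
    t = np.arange(n_frames, dtype=float)
    t -= t.mean()
    return (t @ frames) / (t @ t)


def local_pool(frames, window, stride):
    """Apply temporal_pool to sliding windows, giving a shorter sequence."""
    n_frames = frames.shape[0]
    # Sequences shorter than the window are pooled as a single window.
    starts = range(0, max(n_frames - window, 0) + 1, stride)
    return np.stack([temporal_pool(frames[s:s + window]) for s in starts])


def pyramidal_temporal_pool(frames, layers=((8, 4), (4, 2))):
    """Stack local pooling layers, then one global pooling layer.

    `layers` holds hypothetical (window, stride) pairs, one per local layer.
    The output length depends only on D, not on the input length T.
    """
    x = np.asarray(frames, dtype=float)
    for window, stride in layers:
        x = local_pool(x, window, stride)
    return temporal_pool(x)
```

Under these assumptions, inputs of shape `(50, 40)` and `(123, 40)` both yield a vector of shape `(40,)`, which is the fixed-length property the abstract describes; a classifier can then be trained on these vectors regardless of clip duration.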
Page(s): 770 - 784
Date of Publication: 15 January 2020
