Abstract:
Filter banks in the frequency domain have been widely applied and studied, mainly for MFCC and its variants in speech recognition tasks. In recent years, other types of acoustic features derived from the image classification literature have attracted attention for tasks involving environmental sounds. For those features, filter banks can also be employed, mainly to reduce feature dimensionality along the frequency axis. Such filter banks have been designed according to the human auditory process and are not necessarily optimal for discriminating actual acoustic data. We propose a method to build a filter bank from scratch in a data-driven manner, based on the natural properties of filter banks and without parametrically modeling them, which thereby describes the intrinsic characteristics of the data more flexibly. The filters are optimized by incorporating a discriminative criterion so as to provide effective, high-performance features even with a smaller filter bank. In experiments on acoustic scene classification, the proposed method exhibits favorable performance, especially on lower-dimensional features.
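To make the dimensionality-reduction role of a filter bank concrete, the following is a minimal sketch of a conventional, hand-designed bank applied to a single spectrum. It is not the data-driven method proposed here: the triangular shape, the linearly spaced centers, and the names `triangular_filter_bank` and `apply_filter_bank` are assumptions chosen only for illustration.

```python
def triangular_filter_bank(num_filters, num_bins):
    """Return num_filters triangular filters over num_bins frequency bins.

    Each filter is a list of non-negative weights. Centers are spaced
    uniformly on a linear axis (an illustrative assumption; auditory
    scales such as mel space them non-uniformly).
    """
    # num_filters + 2 edge points: left edge, the centers, right edge.
    step = (num_bins - 1) / (num_filters + 1)
    edges = [i * step for i in range(num_filters + 2)]
    bank = []
    for m in range(1, num_filters + 1):
        left, center, right = edges[m - 1], edges[m], edges[m + 1]
        weights = []
        for k in range(num_bins):
            if left <= k <= center:
                w = (k - left) / (center - left)    # rising slope
            elif center < k <= right:
                w = (right - k) / (right - center)  # falling slope
            else:
                w = 0.0                             # outside the filter
            weights.append(w)
        bank.append(weights)
    return bank


def apply_filter_bank(bank, spectrum):
    """Project a spectrum of length num_bins onto each filter."""
    return [sum(w * s for w, s in zip(weights, spectrum)) for weights in bank]


# Toy example: 33 frequency bins reduced to 4 filter-bank features.
bank = triangular_filter_bank(num_filters=4, num_bins=33)
spectrum = [1.0] * 33  # flat toy spectrum, not real acoustic data
features = apply_filter_bank(bank, spectrum)
```

Here each filter is a fixed weighted sum over neighboring frequency bins, so 33 spectral values collapse to 4 features. A data-driven approach would instead learn the weights in `bank` from labeled data.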
Published in: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 20-25 March 2016
Date Added to IEEE Xplore: 19 May 2016
ISBN Information:
Electronic ISSN: 2379-190X