Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition

Abstract:

Convolutional neural networks have proved very successful in image recognition, thanks to their tolerance to small translations. They have recently been applied to speech recognition as well, using a spectral representation as input. However, in this case the translations along the two axes - time and frequency - should be handled quite differently. So far, most authors have focused on convolution along the frequency axis, which offers invariance to speaker and speaking style variations. Other researchers have developed a different network architecture that applies time-domain convolution in order to process a longer time-span of input in a hierarchical manner. These two approaches have different motivations, and both offer significant gains over a standard fully connected network. Here we show that the two network architectures can be readily combined, along with their advantages. With the combined model we report an error rate of 16.7% on the TIMIT phone recognition task, a new record on this dataset.

Published in: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Date of Conference: 04-09 May 2014

Date Added to IEEE Xplore: 14 July 2014

Electronic ISBN: 978-1-4799-2893-4

DOI: 10.1109/ICASSP.2014.6853584

Conference Location: Florence, Italy
