Abstract:
Zero-resource spoken term discovery in continuous speech is the discovery of repeated patterns in acoustic signals without any higher-level linguistic information. These patterns are then combined to define the compositional units of that speech. We describe and implement an algorithm that tags similar subsequences among sequences of acoustic features. We then discuss the use of this algorithm as part of a complete spoken term discovery system. Our implementation leverages parallelization via modern GPUs, allowing many independent comparisons to be executed concurrently. This parallelization enables the described system to analyze large data sets in tractable time frames. The accuracy and performance of our approach are compared to existing approaches as well as human transcriptions on two corpora of continuous natural speech. Our system improved on published results for multiple metrics.
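The idea of tagging similar subsequences among sequences of acoustic features can be illustrated with a minimal sketch. This is not the algorithm from the paper: the abstract does not specify the features, the similarity measure, or any thresholds, so the cosine similarity, the `threshold` and `min_len` parameters, and the function names below are all assumptions chosen for illustration. The sketch scans each diagonal of the frame-by-frame comparison between two sequences and reports runs of consecutive similar frames. Each diagonal is processed independently, which is the kind of comparison that could in principle be executed concurrently.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def similar_subsequences(a, b, threshold=0.9, min_len=3):
    """Return (start_a, start_b, length) for aligned runs of similar frames.

    Hypothetical helper: `threshold` and `min_len` are illustrative values,
    not parameters taken from the paper.
    """
    matches = []
    # offset = j - i identifies one diagonal; diagonals do not depend on each other.
    for offset in range(-(len(a) - 1), len(b)):
        i, j = max(0, -offset), max(0, offset)
        run = 0
        while i < len(a) and j < len(b):
            if cosine(a[i], b[j]) >= threshold:
                run += 1
            else:
                if run >= min_len:
                    matches.append((i - run, j - run, run))
                run = 0
            i += 1
            j += 1
        # Close a run that reaches the end of the diagonal.
        if run >= min_len:
            matches.append((i - run, j - run, run))
    return matches
```

Calling `similar_subsequences(a, b)` on two lists of equal-dimension feature vectors returns one tuple per matched run. A real system would likely use a more tolerant alignment than strict diagonals, since repeated words are rarely spoken at the same rate.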
Published in: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 05-09 March 2017
Date Added to IEEE Xplore: 19 June 2017
ISBN Information:
Electronic ISSN: 2379-190X