IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Online ISSN : 1745-1337
Print ISSN : 0916-8508
Special Section on Smart Multimedia & Communication Systems
Harmonic-Based Robust Voice Activity Detection for Enhanced Low SNR Noisy Speech Recognition System
Po-Yi SHIH, Po-Chuan LIN, Jhing-Fa WANG

2016 Volume E99.A Issue 11 Pages 1928-1936

Abstract

This paper describes a novel harmonic-based robust voice activity detection (H-RVAD) method that uses a harmonic spectral local peak (HSLP) feature. HSLP is extracted by analyzing the spectral amplitude between adjacent formants, and this characteristic can be used to accurately identify and verify audio streams containing meaningful human speech in low-SNR environments. In addition, an enhanced low-SNR noisy speech recognition system framework is proposed, consisting of a wakeup module, a speech recognition module, and a confirmation module. Within this framework, users can confirm or reject the system feedback whenever a recognition result is given, which prevents voiced noise from misleading the recognition result. The H-RVAD method is evaluated on the AURORA2 corpus with eight types of noise at three SNR levels, and it increases overall average performance from 4% to 20%. In home noise, the H-RVAD method improves the sentence recognition rate from 4% to 14% on average.

© 2016 The Institute of Electronics, Information and Communication Engineers