Abstract:
This paper presents a robust, small-footprint, far-field keyword spotting (KWS) algorithm inspired by the human auditory system's ability to achieve the so-called cocktail party effect in adverse acoustic environments. It introduces the idea of combining microphone-array speech enhancement with machine learning by incorporating a feedback path from the neural network (NN) KWS classifier to its signal-preprocessing frontend, so that frontend noise reduction can benefit from, and in turn better serve, backend machine intelligence. We find that the new system can significantly improve KWS performance for Google Home when there is strong music or TV noise in the background. While this strategy of combining signal processing and machine learning was developed and validated for KWS, it is presumably extensible to many other applications, including noise-robust speaker identification and automatic speech recognition.
Published in: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 15-20 April 2018
Date Added to IEEE Xplore: 13 September 2018
ISBN Information:
Electronic ISSN: 2379-190X