Abstract:
In this paper, we propose improving automatic speech recognition (ASR) accuracy for local points of interest (POI) by leveraging a geo-specific language model (Geo-LM). Geographic regions are defined according to U.S. Census Bureau Combined Statistical Areas. For each user, a class-based Geo-LM is constructed dynamically, according to the user's associated geographic region, within a difference-LM-based weighted finite state transducer (WFST) system. The benefits of this approach include improved accuracy for local POI name recognition, flexibility in training, and efficient LM construction at runtime. Our experiments show that, compared to the baseline nationwide general LM, the proposed Geo-LM achieves an average relative word error rate (WER) reduction of over 18% on local POI search tasks, with no degradation in general accuracy and a very limited increase in latency. In addition to accuracy improvement, we also discuss optimization of runtime efficiency.
Published in: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 15-20 April 2018
Date Added to IEEE Xplore: 13 September 2018
Electronic ISSN: 2379-190X