ABSTRACT
The accuracy of a power dispatching speech recognition system depends on the quality of its language model. To improve the accuracy of power dispatching speech recognition, this paper proposes a class label language model based on double dictionaries (a general dictionary and a dictionary of power dispatching professional words). The model augments the n-gram language model with class label information. In addition, a joint word segmentation and part-of-speech tagging system, also based on the double dictionaries, is used to preprocess the corpus, which improves the adaptability of the class label language model to power dispatching language. Finally, the class label language model is trained on a collected corpus of power dispatching instructions. On the test set, the word error rate of the power dispatching speech recognition system using the class label language model based on double dictionaries is only 4.14%.
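The idea of adding class labels to an n-gram model can be illustrated with a minimal sketch. It assumes the common class-based bigram factorization P(w_i | w_{i-1}) ≈ P(c_i | c_{i-1}) · P(w_i | c_i); the abstract does not state the paper's exact formulation, so this is an illustration and not the authors' model. The two toy dictionaries, the class names, the English example corpus, and the function names `label`, `train`, and `bigram_prob` are all hypothetical, and no smoothing is applied.

```python
from collections import Counter

# Hypothetical toy dictionaries (assumptions, not the paper's data).
GENERAL_DICT = {"open": "VERB", "close": "VERB", "the": "DET", "a": "DET"}
POWER_DICT = {"breaker": "DEVICE", "transformer": "DEVICE"}


def label(word):
    """Return the class label of a word, checking the professional dictionary first."""
    if word in POWER_DICT:
        return POWER_DICT[word]
    return GENERAL_DICT.get(word, "UNK")


def train(sentences):
    """Collect the counts needed for an unsmoothed class-based bigram model."""
    class_bigrams = Counter()  # (previous class, class) -> count
    class_context = Counter()  # previous class -> count
    word_in_class = Counter()  # (class, word) -> count
    class_totals = Counter()   # class -> count
    for sentence in sentences:
        classes = ["<s>"] + [label(w) for w in sentence]
        for prev, cur in zip(classes, classes[1:]):
            class_bigrams[(prev, cur)] += 1
            class_context[prev] += 1
        for w in sentence:
            c = label(w)
            word_in_class[(c, w)] += 1
            class_totals[c] += 1
    return class_bigrams, class_context, word_in_class, class_totals


def bigram_prob(model, prev_word, word):
    """Estimate P(word | prev_word) as P(class | prev class) * P(word | class)."""
    class_bigrams, class_context, word_in_class, class_totals = model
    prev_c = "<s>" if prev_word is None else label(prev_word)
    c = label(word)
    if class_context[prev_c] == 0 or class_totals[c] == 0:
        return 0.0
    p_class = class_bigrams[(prev_c, c)] / class_context[prev_c]
    p_word = word_in_class[(c, word)] / class_totals[c]
    return p_class * p_word


corpus = [
    ["open", "the", "breaker"],
    ["close", "the", "breaker"],
    ["open", "a", "transformer"],
]
model = train(corpus)
# "the transformer" never occurs as a word bigram in the toy corpus, yet the
# class-based estimate is non-zero because DET -> DEVICE was observed.
p_unseen = bigram_prob(model, "the", "transformer")
```

In this sketch the word pair "the transformer" receives probability mass through its classes even though it was never observed, which is the usual motivation for class information when domain-specific training text is scarce.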
Index Terms
- A Language Model for Intelligent Speech Recognition of Power Dispatching