Abstract
Voice search is the technology that enables users to access information using spoken queries. The automatic speech recognizer (ASR) is one of the key modules of a voice search system. However, the high error rate of state-of-the-art large vocabulary continuous speech recognition (LVCSR) is the bottleneck for most voice search systems. In this paper, we first build a baseline system using a language model (LM) with domain-specific information. To improve this system, we propose a forward-backward LVCSR system combination method that reduces search errors in speech recognition, which also helps improve spoken language understanding (SLU) performance. Experimental results show that the proposed method yields a 5.7% relative reduction in character error rate (CER) and increases the SLU F1-measure by 1.5% absolute on our test set.
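The abstract reports a relative CER reduction and an absolute F1 gain. These are different kinds of quantities and are easy to confuse. The sketch below shows how such figures are commonly computed. It is a generic illustration, not the authors' evaluation code: the function names and the sample values are hypothetical, and it assumes CER is the character-level edit distance divided by the reference length.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (here, characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance over reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline, improved):
    """Relative reduction of an error rate with respect to the baseline."""
    return (baseline - improved) / baseline


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the arithmetic.
    print(cer("abcd", "abxd"))             # one substitution in four characters
    print(relative_reduction(0.20, 0.10))  # error rate halved
    print(f1(0.5, 0.5))
```

A relative reduction is measured against the baseline error rate, whereas an absolute gain is a plain difference in percentage points, so the two figures in the abstract are not directly comparable.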
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Li, T., Bao, C., Xu, W., Pan, J., Yan, Y. (2009). Improving Voice Search Using Forward-Backward LVCSR System Combination. In: Wang, H., Shen, Y., Huang, T., Zeng, Z. (eds) The Sixth International Symposium on Neural Networks (ISNN 2009). Advances in Intelligent and Soft Computing, vol 56. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01216-7_82
DOI: https://doi.org/10.1007/978-3-642-01216-7_82
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01215-0
Online ISBN: 978-3-642-01216-7
eBook Packages: Engineering, Engineering (R0)