Improving Voice Search Using Forward-Backward LVCSR System Combination

Li, Ta; Bao, Changchun; Xu, Weiqun; Pan, Jielin; Yan, Yonghong

doi:10.1007/978-3-642-01216-7_82

Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 56)

Abstract

Voice search is a technology that enables users to access information using spoken queries. An automatic speech recognizer (ASR) is one of the key modules of a voice search system. However, the high error rate of state-of-the-art large vocabulary continuous speech recognition (LVCSR) is the bottleneck for most voice search systems. In this paper, we first build a baseline system using a language model (LM) with domain-specific information. To improve this system, we propose a forward-backward LVCSR system combination method that decreases search errors in speech recognition, which in turn improves spoken language understanding (SLU) performance. Experimental results show that the proposed method improves speech recognition performance by a 5.7% relative reduction in character error rate (CER) and increases the SLU F1-measure by 1.5% absolute on our test set.
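
The abstract reports its gains in two different ways: a relative reduction in CER and an absolute increase in F1-measure. The following is a minimal Python sketch of how such metrics are commonly computed, assuming the standard definitions (CER as character-level edit distance divided by reference length, F1 as the harmonic mean of precision and recall). The function names are hypothetical, and nothing here is taken from the authors' evaluation scripts or data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters, in this sketch)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance normalised by reference length."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline, improved):
    """Relative error reduction, e.g. 0.057 corresponds to 5.7% relative."""
    return (baseline - improved) / baseline


def f1_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Under these definitions, a relative reduction is measured against the baseline error rate, whereas an absolute F1 gain is a plain difference between two F1 scores, so the two figures in the abstract are not directly comparable.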

Author information

Authors and Affiliations

ThinkIT Speech Lab, Institute of Acoustics, Chinese Academy of Sciences, Beijing, P.R. China
Ta Li, Changchun Bao, Weiqun Xu, Jielin Pan & Yonghong Yan

Editor information

Editors and Affiliations

Department of Control Science and Engineering, Huazhong University of Science and Technology, No. 1037, Luoyu Road, 430074, Wuhan, Hubei, China
Hongwei Wang, Yi Shen & Zhigang Zeng
Texas A&M University at Qatar, PO Box 23874, Doha, Qatar
Tingwen Huang

About this chapter

Cite this chapter

Li, T., Bao, C., Xu, W., Pan, J., Yan, Y. (2009). Improving Voice Search Using Forward-Backward LVCSR System Combination. In: Wang, H., Shen, Y., Huang, T., Zeng, Z. (eds) The Sixth International Symposium on Neural Networks (ISNN 2009). Advances in Intelligent and Soft Computing, vol 56. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01216-7_82

DOI: https://doi.org/10.1007/978-3-642-01216-7_82
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01215-0
Online ISBN: 978-3-642-01216-7
eBook Packages: Engineering, Engineering (R0)
