Abstract
This paper describes a method for correcting errors in continuous speech recognition output using Web documents, for the purpose of indexing spoken documents. We conducted an error correction experiment on automatically transcribed news speech, focusing in particular on proper nouns. Two large vocabulary continuous speech recognition (LVCSR) systems were used to distinguish correctly recognized words from incorrectly recognized ones. Keywords for an Internet search engine were selected from among the correctly transcribed words, and correction candidates for the misrecognized words were then extracted from the retrieved documents. A dynamic programming (DP) technique with a confusion matrix was used to compare the candidates with the misrecognized words. In the experiment, using Web documents improved the recognition rate of proper nouns by about 10%.
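The DP comparison step can be illustrated with a minimal sketch. The abstract does not give the cost function, the symbol unit, or the confusion matrix values, so everything below is an assumption made for illustration: words are represented as sequences of sub-word symbols, the hypothetical `confusion` table holds the probability that one symbol is recognized as another, and a substitution costs one minus that probability. The function names `confusion_dp_distance` and `rank_candidates` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical confusion table: (recognized symbol, true symbol) -> probability.
Confusion = Dict[Tuple[str, str], float]


def confusion_dp_distance(
    hyp: Sequence[str],
    cand: Sequence[str],
    confusion: Confusion,
    ins_del_cost: float = 1.0,
) -> float:
    """Weighted edit distance between a misrecognized word and a candidate.

    Assumption: substituting a frequently confused symbol pair is cheap
    (cost = 1 - confusion probability); unlisted pairs cost 1.0.
    """
    n, m = len(hyp), len(cand)
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i * ins_del_cost
    for j in range(1, m + 1):
        dist[0][j] = j * ins_del_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            a, b = hyp[i - 1], cand[j - 1]
            sub = 0.0 if a == b else 1.0 - confusion.get((a, b), 0.0)
            dist[i][j] = min(
                dist[i - 1][j - 1] + sub,        # match or substitution
                dist[i - 1][j] + ins_del_cost,   # symbol in hyp has no partner
                dist[i][j - 1] + ins_del_cost,   # symbol in cand has no partner
            )
    return dist[n][m]


def rank_candidates(
    hyp: Sequence[str],
    candidates: List[Sequence[str]],
    confusion: Confusion,
) -> List[Sequence[str]]:
    """Order candidates from most to least similar to the misrecognized word."""
    return sorted(candidates, key=lambda c: confusion_dp_distance(hyp, c, confusion))
```

Under these assumptions, a candidate that differs from the misrecognized word only in symbols the recognizer often confuses is ranked ahead of one that differs in rarely confused symbols, which is the intuition behind combining DP matching with a confusion matrix.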
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nishizaki, H., Sekiguchi, Y. (2006). Word Error Correction of Continuous Speech Recognition Using WEB Documents for Spoken Document Indexing. In: Matsumoto, Y., Sproat, R.W., Wong, KF., Zhang, M. (eds) Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead. ICCPOL 2006. Lecture Notes in Computer Science, vol 4285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940098_23
DOI: https://doi.org/10.1007/11940098_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49667-0
Online ISBN: 978-3-540-49668-7
eBook Packages: Computer Science, Computer Science (R0)