Improving the Robustness to Recognition Errors in Speech Input Question Answering

Tsutsui, Hideki; Manabe, Toshihiko; Fukui, Mika; Sakai, Tetsuya; Fujii, Hiroko; Urata, Koji

doi:10.1007/11880592_23

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4182)

Included in the following conference series:

Asia Information Retrieval Symposium

Abstract

In our previous work, we developed a prototype of a speech-input help system for home appliances such as digital cameras and microwave ovens. Given a factoid question, the system performs textual question answering using the manuals as the knowledge source; given a HOW question, it retrieves and plays a demonstration video. However, our first prototype suffered from speech recognition errors, especially when the Japanese interrogative phrases in factoid questions were misrecognized. We therefore propose a method that addresses this problem by complementing the speech query transcript with an interrogative phrase selected from a predetermined list. The selection process first narrows down the candidate phrases based on co-occurrences within the manual text, and then computes the pronunciation similarity between each candidate and the query transcript. Our method improves the Mean Reciprocal Rank of the top three answers from 0.429 to 0.597 for factoid questions.
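
The two-stage selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how co-occurrence or pronunciation similarity is measured, so the sketch assumes a precomputed table mapping each interrogative phrase to the manual terms it co-occurs with, romanized readings as plain strings, and difflib's SequenceMatcher ratio as a stand-in for the pronunciation measure. The function names, the table, and the sample query are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical co-occurrence table: interrogative phrase (romanized reading)
# -> manual terms it co-occurs with. Assumed to be built offline from the manual.
COOCCURRENCE = {
    "nanmai": {"toreru", "satsuei"},   # "how many shots"
    "nanpun": {"toreru", "rokuga"},    # "how many minutes"
    "doko": {"botan"},                 # "where"
}


def narrow_by_cooccurrence(keywords, cooccurrence):
    """Stage 1: keep phrases that share at least one term with the query keywords."""
    query_terms = set(keywords)
    return [phrase for phrase, terms in cooccurrence.items() if terms & query_terms]


def pronunciation_similarity(phrase, transcript):
    """Stage 2 score: best string-match ratio of the phrase against any
    phrase-length window of the transcript (a stand-in for a phonetic measure)."""
    n = len(phrase)
    last_start = max(1, len(transcript) - n + 1)
    windows = [transcript[i:i + n] for i in range(last_start)]
    return max(SequenceMatcher(None, phrase, w).ratio() for w in windows)


def select_interrogative(transcript, keywords, cooccurrence):
    """Return the surviving candidate that sounds most like part of the
    transcript, or None when stage 1 leaves no candidates."""
    candidates = narrow_by_cooccurrence(keywords, cooccurrence)
    if not candidates:
        return None
    return max(candidates, key=lambda p: pronunciation_similarity(p, transcript))


# Hypothetical misrecognized query: "nanmai toreru" heard as "nanbai toreru".
transcript = "nanbaitoreru"
selected = select_interrogative(transcript, ["toreru"], COOCCURRENCE)
print(selected)
```

In this sketch the selected phrase would then be added to the transcript before it is passed to the question answering component, which is the complementing step the abstract describes.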

Author information

Authors and Affiliations

Knowledge Media Laboratory, Corporate R&D Center, TOSHIBA Corp., Kawasaki, 212-8582, Japan
Hideki Tsutsui, Toshihiko Manabe, Mika Fukui, Tetsuya Sakai, Hiroko Fujii & Koji Urata

Editor information

Editors and Affiliations

Department of Computer Science, National University of Singapore, 3 Science Drive 2, 117543, Singapore
Hwee Tou Ng
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Mun-Kew Leong
Department of Computer Science, School of Computing, National University of Singapore, 117543, Singapore
Min-Yen Kan
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, P.O. Box, 119613, Singapore
Donghong Ji

About this paper

Cite this paper

Tsutsui, H., Manabe, T., Fukui, M., Sakai, T., Fujii, H., Urata, K. (2006). Improving the Robustness to Recognition Errors in Speech Input Question Answering. In: Ng, H.T., Leong, MK., Kan, MY., Ji, D. (eds) Information Retrieval Technology. AIRS 2006. Lecture Notes in Computer Science, vol 4182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880592_23

DOI: https://doi.org/10.1007/11880592_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45780-0
Online ISBN: 978-3-540-46237-8
eBook Packages: Computer Science, Computer Science (R0)