Abstract
In this paper, we present a method for automatically acquiring semantic-based reformulations from natural language questions. Our goal is to find useful, generic reformulation patterns that our question answering system can use to find better candidate answers. As a training set, we used 1343 examples of different types of questions and their corresponding answers from the TREC-8, TREC-9 and TREC-10 collections. The system automatically extracts patterns from sentences retrieved from the Web, based on syntactic tags and on the semantic relations, as defined in WordNet, that hold between the main arguments of the question and the answer. Each extracted pattern is then assigned a weight according to its length, the distance between keywords, the answer sub-phrase score, and the level of semantic similarity between the extracted sentence and the question. The system differs from most other reformulation learning systems in its emphasis on semantic features. To evaluate the generated patterns, we ran our own Web QA system with manually created patterns and with the automatically generated ones, and compared the results. The evaluation on about 500 questions from TREC-11 shows comparable precision and MRR scores. Hence, the automatically generated patterns cause no loss of quality while removing the need for manual work.
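For readers unfamiliar with the MRR score mentioned above, the following is a minimal sketch of how mean reciprocal rank is commonly computed over ranked candidate answers. The function names, the case-insensitive exact-match criterion, the cutoff of 5 and the sample data are all illustrative assumptions; none of them is taken from the paper, whose evaluation details are not given in the abstract.

```python
from typing import List, Sequence, Tuple


def reciprocal_rank(candidates: Sequence[str], gold: str, cutoff: int = 5) -> float:
    """Return 1/rank of the first correct candidate within the cutoff, else 0.

    Assumption: a candidate counts as correct when it matches the gold
    answer exactly, ignoring case and surrounding whitespace.
    """
    target = gold.strip().lower()
    for rank, candidate in enumerate(candidates[:cutoff], start=1):
        if candidate.strip().lower() == target:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], str]], cutoff: int = 5) -> float:
    """Average the reciprocal rank over all (ranked candidates, gold answer) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(c, g, cutoff) for c, g in runs) / len(runs)


# Hypothetical ranked outputs for three questions (not data from the paper).
sample_runs = [
    (["Paris", "Lyon"], "paris"),         # correct at rank 1
    (["1912", "1905", "1921"], "1921"),   # correct at rank 3
    (["Everest"], "K2"),                  # no correct candidate
]
mrr = mean_reciprocal_rank(sample_runs)
```

With this sample, the three reciprocal ranks are 1, 1/3 and 0, so `mrr` is their average, 4/9. A question with no correct candidate inside the cutoff contributes zero, which is why MRR rewards systems that place the right answer near the top of the list.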
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yousefi, J., Kosseim, L. (2006). Automatic Acquisition of Semantic-Based Question Reformulations for Question Answering. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2006. Lecture Notes in Computer Science, vol 3878. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11671299_46
DOI: https://doi.org/10.1007/11671299_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32205-4
Online ISBN: 978-3-540-32206-1
eBook Packages: Computer Science, Computer Science (R0)