Abstract
Digital reference services normally rely on human experts to provide quality answers to user requests via online communication tools. As these services gain popularity, more experts are needed to keep up with growing demand. Alternatively, an automated question-answering module can help shorten the question-answering cycle. When the system receives a new user-submitted question, it can compare the request with the existing questions in the archive. If an appropriate match is found, the system uses the associated answer to respond to the request. Because questions are relatively short and two questions may have very few words in common, the challenge is to identify question similarity effectively. In this paper, we focus on the problem of identifying questions that convey a similar information need. That is, our goal is to find paraphrases of the original questions. To achieve this, we propose a hybrid approach that combines semantic, syntactic, and question category information to judge question similarity. Semantic and syntactic information is measured by taking into account word similarity, word order, and part-of-speech information. Information about question types is derived from a Support Vector Machine classifier. The experimental results demonstrate that our combined measures are highly effective in distinguishing original questions and their paraphrases, thus improving the effectiveness of the question-matching task.
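As a rough illustration of how such a hybrid score might be assembled, the following Python sketch combines three signals into one weighted similarity value. It is not the authors' implementation: exact-match word overlap stands in for the semantic word similarity measure, the word-order score uses a simplified position-vector formulation, a wh-word lookup stands in for the Support Vector Machine question classifier, and part-of-speech information is omitted. The function names and the weights are hypothetical.

```python
import math
import re

# Hypothetical weights for (word overlap, word order, question type); not from the paper.
DEFAULT_WEIGHTS = (0.6, 0.2, 0.2)
WH_WORDS = ("who", "what", "when", "where", "why", "which", "how")


def tokenize(question):
    """Lowercase the question and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", question.lower())


def word_overlap(tokens_a, tokens_b):
    """Jaccard overlap of the two word sets (stand-in for semantic similarity)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def word_order_similarity(tokens_a, tokens_b):
    """Compute 1 - |r_a - r_b| / |r_a + r_b| over first-occurrence positions."""
    joint = sorted(set(tokens_a) | set(tokens_b))

    def order_vector(tokens):
        first_pos = {}
        for position, token in enumerate(tokens, start=1):
            first_pos.setdefault(token, position)
        # A word missing from this question gets position 0.
        return [first_pos.get(word, 0) for word in joint]

    r_a, r_b = order_vector(tokens_a), order_vector(tokens_b)
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(r_a, r_b)))
    total = math.sqrt(sum((x + y) ** 2 for x, y in zip(r_a, r_b)))
    return 1.0 - diff / total if total else 0.0


def question_type(tokens):
    """Toy category: the first wh-word in the question, else 'other'."""
    for token in tokens:
        if token in WH_WORDS:
            return token
    return "other"


def combined_similarity(question_a, question_b, weights=DEFAULT_WEIGHTS):
    """Weighted sum of word overlap, word order, and question-type agreement."""
    tokens_a, tokens_b = tokenize(question_a), tokenize(question_b)
    w_overlap, w_order, w_type = weights
    same_type = 1.0 if question_type(tokens_a) == question_type(tokens_b) else 0.0
    return (
        w_overlap * word_overlap(tokens_a, tokens_b)
        + w_order * word_order_similarity(tokens_a, tokens_b)
        + w_type * same_type
    )
```

Under these assumptions, a call such as `combined_similarity("How do I renew a library book?", "How can I renew my library book?")` scores higher than the same question compared with an unrelated one, because the pair shares words, word order, and question type. A fuller version would replace the overlap term with a lexical-database similarity measure and the wh-word lookup with a trained classifier.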
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Achananuparp, P., Hu, X., Zhou, X., Zhang, X. (2008). Utilizing Semantic, Syntactic, and Question Category Information for Automated Digital Reference Services. In: Buchanan, G., Masoodian, M., Cunningham, S.J. (eds) Digital Libraries: Universal and Ubiquitous Access to Information. ICADL 2008. Lecture Notes in Computer Science, vol 5362. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89533-6_21
DOI: https://doi.org/10.1007/978-3-540-89533-6_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89532-9
Online ISBN: 978-3-540-89533-6
eBook Packages: Computer Science, Computer Science (R0)