ABSTRACT
This paper introduces a new question expansion method for question classification in community question answering (cQA) services. Questions submitted to cQA services are usually short, and a test input consists of only a question, whereas each training instance is a question-answer pair. As a result, an input question often does not provide enough information for accurate classification. To solve this problem, we propose a question expansion method based on pseudo-relevance feedback and automatic answer generation. For pseudo-relevance feedback, we first retrieve question-answer pairs relevant to an input question using the Indri search engine, and then select the top-ranked relevant words as expansion words. Automatic answer generation creates a pseudo answer by adding question-related words chosen according to question-to-answer translation probabilities estimated with GIZA++. Performance improves significantly when the two approaches are effectively combined.
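The two expansion steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: a toy word-overlap ranking stands in for the Indri search engine, a hand-written table `TRANS_PROB` stands in for translation probabilities learned by GIZA++, and every name, parameter, and data value below is hypothetical.

```python
from collections import Counter

# Hypothetical stopword list, used only for this illustration.
STOPWORDS = {"a", "an", "the", "is", "to", "of", "in", "my", "or", "and", "not", "will"}

def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]

def retrieve(question, qa_pairs, k=2):
    """Toy stand-in for a search engine: rank QA pairs by word overlap."""
    q = set(tokenize(question))
    ranked = sorted(
        qa_pairs,
        key=lambda pair: len(q & set(tokenize(pair[0] + " " + pair[1]))),
        reverse=True,
    )
    return ranked[:k]

def feedback_terms(question, qa_pairs, k=2, n=3):
    """Pseudo-relevance feedback: most frequent new words in the top-k pairs."""
    q = set(tokenize(question))
    counts = Counter()
    for quest, answer in retrieve(question, qa_pairs, k):
        counts.update(w for w in tokenize(quest + " " + answer) if w not in q)
    return [w for w, _ in counts.most_common(n)]

def pseudo_answer(question, trans_prob, n=3):
    """Pseudo answer: answer-side words with the highest summed translation probability."""
    scores = Counter()
    for q_word in tokenize(question):
        for a_word, prob in trans_prob.get(q_word, {}).items():
            scores[a_word] += prob
    return [w for w, _ in scores.most_common(n)]

def expand(question, qa_pairs, trans_prob):
    """Combine both sources; the original question words come first."""
    extra = feedback_terms(question, qa_pairs) + pseudo_answer(question, trans_prob)
    return tokenize(question) + list(dict.fromkeys(extra))  # de-duplicate, keep order

# Made-up archive and translation table for demonstration only.
QA_PAIRS = [
    ("laptop battery drains fast", "replace battery or check charger"),
    ("best pasta sauce recipe", "use tomato garlic basil"),
    ("laptop will not charge", "check charger and battery connector"),
]
TRANS_PROB = {
    "battery": {"charger": 0.4, "power": 0.3},
    "laptop": {"computer": 0.5},
}

if __name__ == "__main__":
    print(expand("laptop battery problem", QA_PAIRS, TRANS_PROB))
```

The expanded word list would then be passed to the question classifier in place of the original short question. How the two word sources are weighted against each other is not specified in the abstract, so the sketch simply concatenates them.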
Index Terms
- An Effective Question Expanding Method for Question Classification in cQA services