A Novel Composite Kernel for Finding Similar Questions in CQA Services

  • Conference paper
Web-Age Information Management (WAIM 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6184)

Abstract

Finding similar questions in Community Question Answering (CQA) services plays an increasingly important role in current web and IR applications. The task aims to retrieve historical questions that are similar or relevant to new questions posed by users. However, traditional “bag-of-words” models often fail to measure the similarity between question sentences, because they usually ignore sequential and syntactic information. In this paper, we propose a novel composite kernel to improve the accuracy of question matching. Our study illustrates that the composite kernel can efficiently capture both lexical semantics and syntactic information in a question sentence by leveraging a word sequence kernel, a POS tag sequence kernel, and a syntactic tree kernel. Experimental results on real-world datasets show that our proposed method significantly outperforms state-of-the-art models.
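
As a rough illustration of how three such kernels might be combined, the Python sketch below builds a composite score from a word sequence kernel, a POS tag sequence kernel, and a syntactic tree kernel. The abstract does not give the paper's kernel definitions, weights, or normalization, so everything here is an assumption: the sequence kernel is a simple decayed contiguous n-gram match, the tree kernel follows the general subset-tree idea of Collins and Duffy, the combination is a weighted sum of cosine-normalized kernels, and the names `ngram_kernel`, `tree_kernel`, `composite_kernel`, the weights, and the hand-annotated example questions are all hypothetical.

```python
from collections import Counter
from math import sqrt


def ngram_kernel(a, b, max_n=3, decay=0.5):
    """Sequence kernel (assumed form): shared contiguous n-grams, weighted by decay**n."""
    total = 0.0
    for n in range(1, max_n + 1):
        ca = Counter(tuple(a[i:i + n]) for i in range(len(a) - n + 1))
        cb = Counter(tuple(b[i:i + n]) for i in range(len(b) - n + 1))
        total += decay ** n * sum(ca[g] * cb[g] for g in ca)
    return total


def nodes(tree):
    """Yield every internal node of a tree written as (label, child, ...) with string leaves."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from nodes(child)


def production(node):
    """Grammar production at a node: its label plus the labels (or words) of its children."""
    return (node[0],) + tuple(
        c[0] if isinstance(c, tuple) else ("word", c) for c in node[1:]
    )


def common_subtrees(n1, n2, lam):
    """Decayed count of common subtrees rooted at n1 and n2 (subset-tree recursion)."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple):  # word leaves contribute a factor of 1
            score *= 1.0 + common_subtrees(c1, c2, lam)
    return score


def tree_kernel(t1, t2, lam=0.5):
    """Sum the rooted common-subtree scores over all pairs of internal nodes."""
    return sum(common_subtrees(a, b, lam) for a in nodes(t1) for b in nodes(t2))


def normalized(kernel, x, y):
    """Cosine-normalize a kernel so that k(x, x) == 1 for any non-degenerate x."""
    denom = sqrt(kernel(x, x) * kernel(y, y))
    return kernel(x, y) / denom if denom else 0.0


def composite_kernel(q1, q2, weights=(0.4, 0.2, 0.4)):
    """Weighted sum of the three normalized kernels; the weights are illustrative only."""
    w_word, w_pos, w_tree = weights
    return (w_word * normalized(ngram_kernel, q1["words"], q2["words"])
            + w_pos * normalized(ngram_kernel, q1["pos"], q2["pos"])
            + w_tree * normalized(tree_kernel, q1["tree"], q2["tree"]))


def question(words, pos, tree):
    return {"words": words.split(), "pos": pos.split(), "tree": tree}


def reset_tree(aux_tag, aux_word):
    """Hand-written parse of 'how <aux> I reset my password' (hypothetical annotation)."""
    return ("SBARQ", ("WHADVP", ("WRB", "how")),
            ("SQ", (aux_tag, aux_word), ("NP", ("PRP", "I")),
             ("VP", ("VB", "reset"),
              ("NP", ("PRP$", "my"), ("NN", "password")))))


q1 = question("how do I reset my password", "WRB VBP PRP VB PRP$ NN",
              reset_tree("VBP", "do"))
q2 = question("how can I reset my password", "WRB MD PRP VB PRP$ NN",
              reset_tree("MD", "can"))
q3 = question("what is the capital of France", "WP VBZ DT NN IN NNP",
              ("SBARQ", ("WHNP", ("WP", "what")),
               ("SQ", ("VBZ", "is"),
                ("NP", ("NP", ("DT", "the"), ("NN", "capital")),
                 ("PP", ("IN", "of"), ("NP", ("NNP", "France")))))))
```

Calling `composite_kernel(q1, q2)` scores the two password questions as far more similar than `composite_kernel(q1, q3)`, even though a pure bag-of-words comparison would ignore word order and structure. In practice the tags and parse trees would come from a tagger and parser rather than hand annotation, and the weights would be tuned on held-out data.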

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, J., Li, Z., Hu, X., Hu, B. (2010). A Novel Composite Kernel for Finding Similar Questions in CQA Services. In: Chen, L., Tang, C., Yang, J., Gao, Y. (eds) Web-Age Information Management. WAIM 2010. Lecture Notes in Computer Science, vol 6184. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14246-8_59

  • DOI: https://doi.org/10.1007/978-3-642-14246-8_59

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14245-1

  • Online ISBN: 978-3-642-14246-8

  • eBook Packages: Computer Science, Computer Science (R0)
