Supervised Learning Approach to Optimize Ranking Function for Chinese FAQ-Finder

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4426)

Abstract

In this paper, we address the optimization of a Chinese FAQ-Finder system built on a very large collection of Question-Answer (QA) pairs. Unlike most published research, which tends to address the word mismatch problem between questions, we focus on a more fundamental problem: the ranking function, which has usually been borrowed directly, and rather arbitrarily, from traditional document retrieval. We propose a unified ranking function with four embedded parameters, and investigate the characteristics of the three fields of a QA pair and the effects of two Chinese word segmentation settings. Experiments on 1,000 question queries and 3.8 million QA pairs show that the unified ranking function achieves a 6.67% improvement over the TFIDF baseline. We also propose a supervised learning approach that optimizes the ranking function using 264 features, including part-of-speech and bigram co-occurrence features. Experiments show that it achieves a further 7.06% improvement.
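
The abstract does not give the form of the unified ranking function or name the three QA-pair fields, so the following is only a minimal sketch of the general idea of a field-weighted TF-IDF ranker, not the authors' method. The field names, the weights, the smoothed IDF formula, and the function names `tfidf_field_score` and `rank` are all assumptions made for illustration. Input text is assumed to be already segmented into words.

```python
import math
from collections import Counter


def tfidf_field_score(query_terms, field_tokens, doc_freq, n_docs):
    """Score one field of one QA pair against the query (illustrative TF-IDF variant)."""
    counts = Counter(field_tokens)
    score = 0.0
    for term in query_terms:
        tf = counts[term]
        if tf == 0:
            continue
        # Smoothed IDF; this exact formula is an assumption, not taken from the paper.
        idf = math.log((n_docs + 1) / (doc_freq.get(term, 0) + 1)) + 1.0
        score += math.sqrt(tf) * idf
    return score


def rank(query_terms, qa_pairs, field_weights):
    """Return QA-pair indices sorted by descending weighted score.

    qa_pairs: list of dicts mapping a (hypothetical) field name to a token list.
    field_weights: dict mapping field name to a weight; these stand in for the
    tunable parameters of a ranking function and are purely illustrative.
    """
    n_docs = len(qa_pairs)
    # Document frequency: number of QA pairs containing the term in any field.
    doc_freq = Counter()
    for pair in qa_pairs:
        seen = set()
        for tokens in pair.values():
            seen.update(tokens)
        doc_freq.update(seen)

    scored = []
    for idx, pair in enumerate(qa_pairs):
        total = sum(
            weight * tfidf_field_score(query_terms, pair.get(field, []), doc_freq, n_docs)
            for field, weight in field_weights.items()
        )
        scored.append((total, idx))
    # Higher score first; ties broken by original index for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [idx for _, idx in scored]
```

Calling `rank(query_terms, qa_pairs, {"question": 2.0, "answer": 1.0})` with pre-segmented token lists would order the QA pairs by the weighted sum of their per-field scores. In a setting like the one the abstract describes, such weights would be tuned on labeled queries rather than fixed by hand.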

Editor information

Zhi-Hua Zhou, Hang Li, Qiang Yang

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Hu, G., Liu, D., Liu, Q., Wang, Rh. (2007). Supervised Learning Approach to Optimize Ranking Function for Chinese FAQ-Finder. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_55

  • DOI: https://doi.org/10.1007/978-3-540-71701-0_55

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71700-3

  • Online ISBN: 978-3-540-71701-0

  • eBook Packages: Computer Science, Computer Science (R0)
