Towards Domain Independent Why Text Segment Classification Based on Bag of Function Words

  • Conference paper
AI 2012: Advances in Artificial Intelligence (AI 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7691)

Abstract

Question answering (QA) technology has attracted increasing attention as a next-generation form of search because it makes acquiring information from the web easier. However, little research has addressed “non-factoid QA”, and Why Question Answering (Why-QA) in particular. In this paper, we introduce a machine learning approach that uses an SVM to automatically construct a classifier with function words as features for Why Text Segment classification (WTS classification). WTS classification is the process of detecting text segments that describe reasons or causes; it is a subtask of Why-QA, mainly related to answer extraction. We argue that function words are strong discriminators for WTS classification. Furthermore, because function words appear in almost all text segments regardless of topic domain, they also enable the construction of a domain-independent classifier. Experimental results showed significant improvement over state-of-the-art results, both in WTS classification accuracy and in domain independence.
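The idea of representing a text segment only by its function words can be illustrated with a minimal sketch. It is not the authors' implementation: the English word list `FUNCTION_WORDS`, the helper `bag_of_function_words`, and the two example sentences are all hypothetical, and the SVM training step is omitted. It shows only the feature extraction, in which content words are discarded so that segments from different domains can map to the same feature vector.

```python
from collections import Counter
import re

# Hypothetical, deliberately tiny function-word vocabulary (assumption:
# the abstract does not give the actual word list or language used).
FUNCTION_WORDS = ["because", "since", "so", "therefore", "due",
                  "to", "of", "the", "is", "a"]


def bag_of_function_words(segment, vocabulary=FUNCTION_WORDS):
    """Return a count vector over `vocabulary`, ignoring all content words."""
    tokens = re.findall(r"[a-z']+", segment.lower())
    counts = Counter(tokens)
    # Counter returns 0 for words that never occur in the segment.
    return [counts[word] for word in vocabulary]


# Two "reason" segments from unrelated domains (illustrative only).
cooking = "The bread is dense because the yeast is old."
finance = "The stock is cheap because the market is nervous."

vec_cooking = bag_of_function_words(cooking)
vec_finance = bag_of_function_words(finance)

# Content words differ, yet the function-word features coincide.
print(vec_cooking)
print(vec_cooking == vec_finance)
```

Vectors of this kind could then be passed to any SVM implementation for training. Because the features carry no topic vocabulary, a classifier trained on one domain has at least a chance of transferring to another, which is the intuition behind the domain-independence claim.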

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tanaka, K., Takiguchi, T., Ariki, Y. (2012). Towards Domain Independent Why Text Segment Classification Based on Bag of Function Words. In: Thielscher, M., Zhang, D. (eds) AI 2012: Advances in Artificial Intelligence. AI 2012. Lecture Notes in Computer Science, vol 7691. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35101-3_40

  • DOI: https://doi.org/10.1007/978-3-642-35101-3_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35100-6

  • Online ISBN: 978-3-642-35101-3

  • eBook Packages: Computer Science, Computer Science (R0)
