Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4285)

Abstract

This paper uses unlabelled questions in combination with labelled questions for semi-supervised learning, with the aim of improving performance on the question classification task. We also propose two modifications to Tri-training, a simple but efficient co-training-style algorithm, to make it more suitable for question data. Both avoid bootstrap-sampling the training set to obtain different training sets for the three classifiers: the first uses multiple learning algorithms for the classifiers in Tri-training, and the second uses multiple learning algorithms in combination with multiple views. These modifications prevent the error rate at the initial step from increasing, and our experiments show promising results.
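The first proposal can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three toy learners, the function names `tri_training_round` and `majority_vote`, and the example questions and labels are all assumptions made for illustration. It shows a single, simplified round in which each classifier is trained on the full labelled set with a different algorithm (so no bootstrap sampling is needed), and an unlabelled question is pseudo-labelled for one classifier when the other two agree on it. The error-rate checks that Tri-training applies before accepting pseudo-labels are omitted here.

```python
from collections import Counter

def train_first_word(data):
    """Toy learner 1: majority label seen for the question's first word."""
    by_word = {}
    for text, label in data:
        by_word.setdefault(text.lower().split()[0], Counter())[label] += 1
    default = Counter(label for _, label in data).most_common(1)[0][0]
    def predict(text):
        counts = by_word.get(text.lower().split()[0])
        return counts.most_common(1)[0][0] if counts else default
    return predict

def train_nearest_neighbour(data):
    """Toy learner 2: 1-nearest-neighbour on Jaccard overlap of word sets."""
    examples = [(set(text.lower().split()), label) for text, label in data]
    def predict(text):
        words = set(text.lower().split())
        def similarity(example):
            return len(words & example[0]) / len(words | example[0])
        return max(examples, key=similarity)[1]
    return predict

def train_word_votes(data):
    """Toy learner 3: every word votes for the labels it co-occurred with."""
    votes = {}
    for text, label in data:
        for word in set(text.lower().split()):
            votes.setdefault(word, Counter())[label] += 1
    default = Counter(label for _, label in data).most_common(1)[0][0]
    def predict(text):
        total = Counter()
        for word in set(text.lower().split()):
            total.update(votes.get(word, {}))
        return total.most_common(1)[0][0] if total else default
    return predict

def tri_training_round(trainers, labelled, unlabelled):
    """One simplified round; `trainers` must hold exactly three learners."""
    # Diversity comes from the different algorithms, so every classifier
    # starts from the same full labelled set (no bootstrap sampling).
    classifiers = [train(labelled) for train in trainers]
    retrained = []
    for i, train in enumerate(trainers):
        j, k = [n for n in range(3) if n != i]
        # Pseudo-label a question for classifier i when the other two agree.
        pseudo = [(q, classifiers[j](q)) for q in unlabelled
                  if classifiers[j](q) == classifiers[k](q)]
        retrained.append(train(labelled + pseudo))
    return retrained

def majority_vote(classifiers, text):
    """Final prediction: the label chosen by most of the classifiers."""
    return Counter(c(text) for c in classifiers).most_common(1)[0][0]

# Hypothetical toy data; labels are illustrative coarse question classes.
labelled = [
    ("Who wrote Hamlet", "HUMAN"),
    ("Who invented the telephone", "HUMAN"),
    ("Where is Hanoi", "LOCATION"),
    ("Where is the Nile river", "LOCATION"),
]
unlabelled = ["Who painted the Mona Lisa", "Where is Mount Fuji"]
trainers = [train_first_word, train_nearest_neighbour, train_word_votes]
classifiers = tri_training_round(trainers, labelled, unlabelled)
print(majority_vote(classifiers, "Who founded Springer"))
print(majority_vote(classifiers, "Where is Kyoto"))
```

In practice the toy learners would be replaced by real classifiers, and the round would be repeated until no classifier changes. The second proposal would additionally give each learner its own feature view of the question.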

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tri, N.T., Le, N.M., Shimazu, A. (2006). Using Semi-supervised Learning for Question Classification. In: Matsumoto, Y., Sproat, R.W., Wong, KF., Zhang, M. (eds) Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead. ICCPOL 2006. Lecture Notes in Computer Science, vol 4285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940098_4

  • DOI: https://doi.org/10.1007/11940098_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49667-0

  • Online ISBN: 978-3-540-49668-7

  • eBook Packages: Computer Science, Computer Science (R0)
