Sequential Classifiers Combination for Text Categorization: An Experimental Study

  • Conference paper
Advances in Web-Age Information Management (WAIM 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3129)

Abstract

In this paper, we introduce Sequential Classifiers Combination (SCC) into text categorization to improve both the effectiveness and the efficiency of the combined individual classifiers. In our experimental study we apply two classifiers sequentially: the first (the filtering classifier) generates candidate categories for the test document, and the second (the deciding classifier) selects the final category for the test document from those candidates. Experimental results indicate that when boosting and kNN are combined, the combined classifier outperforms the better of the two individual classifiers. When Rocchio and kNN are combined, the combined classifier performs as well as kNN, while its efficiency is much better than that of kNN and close to that of Rocchio.
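
The two-stage scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how candidates are generated or how many are kept, so the centroid-based filter (a simplified Rocchio using positive examples only), the similarity-weighted kNN decider, the raw term-frequency vectors, the parameters m and k, and the toy training set are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

def vectorize(text):
    """Raw term-frequency vector (assumed stand-in for real term weighting)."""
    return dict(Counter(text.lower().split()))

def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_centroids(train):
    """Filtering model: one summed term vector per category."""
    centroids = {}
    for vec, label in train:
        centroids.setdefault(label, Counter()).update(vec)
    return centroids

def filter_candidates(doc, centroids, m):
    """Filtering classifier: keep the m categories closest to the document."""
    ranked = sorted(centroids, key=lambda c: cosine(doc, centroids[c]),
                    reverse=True)
    return ranked[:m]

def decide(doc, train, candidates, k):
    """Deciding classifier: kNN over training documents of candidate categories only."""
    pool = [(cosine(doc, vec), label)
            for vec, label in train if label in candidates]
    pool.sort(key=lambda p: p[0], reverse=True)
    votes = Counter()
    for sim, label in pool[:k]:
        votes[label] += sim  # similarity-weighted vote
    return votes.most_common(1)[0][0] if votes else None

def scc_classify(text, train, centroids, m=2, k=3):
    """Run the filtering stage, then the deciding stage."""
    doc = vectorize(text)
    candidates = filter_candidates(doc, centroids, m)
    return decide(doc, train, candidates, k)

# Toy training set (hypothetical data, for illustration only).
TRAIN_TEXTS = [
    ("stocks fell as markets closed lower", "finance"),
    ("bank raises interest rates again", "finance"),
    ("team wins the final match", "sports"),
    ("striker scores twice in the match", "sports"),
    ("new phone chip boosts battery life", "tech"),
    ("software update fixes chip bug", "tech"),
]
train = [(vectorize(t), c) for t, c in TRAIN_TEXTS]
centroids = build_centroids(train)

print(scc_classify("the match ended with a late goal", train, centroids))
```

In this sketch the efficiency gain comes from the deciding stage comparing the test document only against training documents of the m candidate categories, rather than against the whole training set as plain kNN would.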

This work was supported by National Natural Science Foundation of China under grant No. 60373019.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, Z., Zhou, S., Zhou, A. (2004). Sequential Classifiers Combination for Text Categorization: An Experimental Study. In: Li, Q., Wang, G., Feng, L. (eds) Advances in Web-Age Information Management. WAIM 2004. Lecture Notes in Computer Science, vol 3129. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27772-9_51

  • DOI: https://doi.org/10.1007/978-3-540-27772-9_51

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22418-1

  • Online ISBN: 978-3-540-27772-9

  • eBook Packages: Springer Book Archive
