Combining Multiple Classifiers Using Dempster’s Rule of Combination for Text Categorization

  • Conference paper
Modeling Decisions for Artificial Intelligence (MDAI 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3131)

Abstract

In this paper, we investigate the combination of four classification methods for text categorization using Dempster’s rule of combination: the Support Vector Machine (SVM), k-nearest neighbours (kNN), the kNN model-based approach (kNNM), and Rocchio. We first present an approach for effectively combining the different classification methods. We then apply the methods, individually and in combination, to the 20-newsgroup benchmark collection. Our experimental results show that the best combination of classifiers achieves 91.07% classification accuracy on the 10 groups of the benchmark data, which is on average 2.68% better than the best individual method, SVM.
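For readers unfamiliar with Dempster’s rule of combination, the following is a minimal sketch of the generic rule applied to two classifier outputs. It is not the authors' method: the abstract does not say how classifier scores are turned into mass functions, so the category names, the mass values, and the function name `dempster_combine` are all illustrative assumptions.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of class labels (a focal element)
    to its mass; the masses in each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the intersection of the two sets.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the two sources cannot be combined")
    # Renormalise by the non-conflicting mass.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


# Hypothetical frame of discernment (three categories) and hypothetical
# mass functions standing in for the outputs of two classifiers.
FRAME = frozenset({"sci.space", "rec.autos", "comp.graphics"})
m_svm = {frozenset({"sci.space"}): 0.6, frozenset({"rec.autos"}): 0.1, FRAME: 0.3}
m_knn = {frozenset({"sci.space"}): 0.5, frozenset({"comp.graphics"}): 0.2, FRAME: 0.3}

fused = dempster_combine(m_svm, m_knn)

# Pick the single category with the largest combined mass.
best = max((f for f in fused if len(f) == 1), key=fused.get)
print(sorted(best), round(fused[best], 4))
```

Mass assigned to the whole frame expresses ignorance rather than support for any one category, which is what lets one classifier's uncertainty defer to another's stronger evidence. Because the rule is associative and commutative, more than two classifiers can be fused by applying the function repeatedly.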

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bi, Y., Bell, D., Wang, H., Guo, G., Greer, K. (2004). Combining Multiple Classifiers Using Dempster’s Rule of Combination for Text Categorization. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2004. Lecture Notes in Computer Science, vol 3131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27774-3_13

  • DOI: https://doi.org/10.1007/978-3-540-27774-3_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22555-3

  • Online ISBN: 978-3-540-27774-3

  • eBook Packages: Springer Book Archive
