Techniques for Improving the Performance of Naive Bayes for Text Classification

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3406)


Naive Bayes is often used in text classification applications and experiments because of its simplicity and effectiveness. However, its performance is often degraded because it does not model text well, because features are selected inappropriately, and because it lacks reliable confidence scores. We address these problems and show that they can be solved by some simple corrections. We demonstrate that these modifications significantly improve the performance of Naive Bayes for text classification.
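For orientation, the following is a minimal sketch of a plain multinomial Naive Bayes text classifier with add-one (Laplace) smoothing, the kind of baseline the abstract refers to. It does not implement the corrections proposed in the paper, which are not described on this page. The function names `train_multinomial_nb` and `posterior`, the smoothing parameter `alpha`, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log word likelihoods per class.

    Hypothetical helper: docs is a list of token lists, labels a parallel list.
    """
    vocab = sorted({w for doc in docs for w in doc})
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)
    log_prior = {c: math.log(n / len(docs)) for c, n in class_docs.items()}
    log_like = {}
    for c in class_docs:
        # Add-alpha smoothing keeps unseen (class, word) pairs from getting zero probability.
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {w: math.log((word_counts[c][w] + alpha) / total) for w in vocab}
    return log_prior, log_like


def posterior(doc, log_prior, log_like):
    """Return normalized class posteriors; out-of-vocabulary words are ignored."""
    scores = {
        c: log_prior[c] + sum(log_like[c][w] for w in doc if w in log_like[c])
        for c in log_prior
    }
    # Subtract the maximum log score before exponentiating for numerical stability.
    top = max(scores.values())
    unnormalized = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(unnormalized.values())
    return {c: v / z for c, v in unnormalized.items()}


# Toy data, invented purely for illustration.
docs = [["cheap", "pills", "offer"], ["meeting", "agenda", "notes"],
        ["cheap", "offer", "now"], ["project", "meeting", "today"]]
labels = ["spam", "ham", "spam", "ham"]
log_prior, log_like = train_multinomial_nb(docs, labels)
probs = posterior(["cheap", "meeting", "offer", "offer"], log_prior, log_like)
predicted = max(probs, key=probs.get)
```

Because the posterior is a product of many per-word terms, it tends to move toward 0 or 1 as documents get longer, which is one common reason Naive Bayes scores are treated as unreliable confidence estimates.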

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schneider, K.-M. (2005). Techniques for Improving the Performance of Naive Bayes for Text Classification. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2005. Lecture Notes in Computer Science, vol 3406. Springer, Berlin, Heidelberg.

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24523-0

  • Online ISBN: 978-3-540-30586-6

  • eBook Packages: Computer Science, Computer Science (R0)
