Skip to main content

One Dependence Augmented Naive Bayes

  • Conference paper
Advanced Data Mining and Applications (ADMA 2005)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 3584))

Included in the following conference series:

  • 2398 Accesses

Abstract

In real-world data mining applications, an accurate ranking is as important as an accurate classification. Naive Bayes has been widely used in data mining as a simple and effective classification and ranking algorithm. Since its conditional independence assumption is rarely true, numerous algorithms have been proposed to improve naive Bayes, for example, SBC[1] and TAN[2]. Indeed, the experimental results show that SBC and TAN achieve a significant improvement in term of classification accuracy. However, unfortunately, our experiments also show that SBC and TAN perform even worse than naive Bayes in ranking measured by AUC[3,4](the area under the Receiver Operating Characteristics curve). This fact raises the question of whether we can improve Naive Bayes with both accurate classification and ranking? In this paper, responding to this question, we present a new learning algorithm called One Dependence Augmented Naive Bayes(ODANB). Our motivation is to develop a new algorithm to improve Naive Bayes’ performance not only on classification measured by accuracy but also on ranking measured by AUC. We experimentally tested our algorithm, using the whole 36 UCI datasets recommended by Weka[5], and compared it to Naive Bayes, SBC and TAN. The experimental results show that our algorithm outperforms all the other algorithms significantly in yielding accurate ranking, yet at the same time outperforms all the other algorithms slightly in terms of classification accuracy.

This work was supported by Excellent Youth Foundation of China University of Geosciences(No.CUGQNL0505) and Natural Science Foundation of Hubei of China(No.2001ABB006 and No.2003ABA043).

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Langley, P., Sage, S.: Induction of selective Bayesian classifiers. In: Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence, pp. 339–406 (1994)

    Google Scholar 

  2. Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian Network Classifiers. Machine Learning 29, 131–163 (1997)

    Article  MATH  Google Scholar 

  3. Bradley, A.P.: The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition 30, 1145–1159 (1997)

    Article  Google Scholar 

  4. Provost, F., Fawcett, T.: Analysis and visualization of classifier performance: comparison under imprecise class and cost distribution. In: Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, pp. 43–48. AAAI Press, Menlo Park (1997)

    Google Scholar 

  5. http://prdownloads.sourceforge.net/weka/datasets-UCI.jar

  6. Bennett, P.N.: Assessing the Calibration of Naive Bayes’ Posterior Estimates. Technical Report No. CMU-CS100-155 (2000)

    Google Scholar 

  7. Chickering, D.M.: Learning Bayesian networks is NP-Complete. In: Fisher, D., Lenz, H. (eds.) Learning from Data: Artificial Intelligence and Statistics V, pp. 121–130. Springer, Heidelberg (1996)

    Google Scholar 

  8. Hand, D.J., Till, R.J.: A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning 45, 171–186 (2001)

    Article  MATH  Google Scholar 

  9. Domingos, P., Pazzani, M.: Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier. Machine Learning 29, 103–130 (1997)

    Article  MATH  Google Scholar 

  10. Pearl, J.: Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, San Francisco (1988)

    Google Scholar 

  11. Witten, I.H., Frank, E.: Data Mining-Practical Machine Learning Tools and Techniques with Java Implementation. Morgan Kaufmann, San Francisco (2000)

    Google Scholar 

  12. Nadeau, C., Bengio, Y.: Inference for the generalization error. In: Advances in Neural In- formation Processing Systems 12, pp. 307–313. MIT Press, Cambridge (1999)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jiang, L., Zhang, H., Cai, Z., Su, J. (2005). One Dependence Augmented Naive Bayes. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science(), vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_22

Download citation

  • DOI: https://doi.org/10.1007/11527503_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27894-8

  • Online ISBN: 978-3-540-31877-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics