Wrapping the Naive Bayes Classifier to Relax the Effect of Dependences

  • Conference paper
Intelligent Data Engineering and Automated Learning - IDEAL 2007 (IDEAL 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4881)

Abstract

The Naive Bayes classifier is based on the (unrealistic) assumption that the attribute values are independent of one another given the class value. Consequently, its effectiveness may decrease in the presence of interdependent attributes. In spite of this, the Naive Bayes classifier has earned a privileged position in recent years for several reasons [1]. We present DGW (Dependency Guided Wrapper), a wrapper that uses information about dependences to transform the data representation and so improve Naive Bayes classification. This paper presents experiments comparing the performance and execution time of 12 DGW variations against 12 previous approaches, such as constructive induction of cartesian product attributes and wrappers that search for optimal subsets of attributes.

Experimental results show that DGW generates a new data representation that allows Naive Bayes to obtain better accuracy more often than any other wrapper tested. DGW variations also obtain the best possible accuracy more often than the state-of-the-art wrappers, while often spending less time in the attribute subset search process.
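
The wrapper idea mentioned in the abstract can be illustrated with a minimal sketch. This is not DGW, whose procedure is not described on this page; it is a generic greedy forward-selection wrapper around a small categorical Naive Bayes, scored by leave-one-out accuracy. All names (`train_nb`, `predict_nb`, `loo_accuracy`, `forward_select`) and the toy data are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
import math


def train_nb(rows, labels, attrs):
    """Count class and attribute-value frequencies for the attribute indices in attrs."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attr, class) -> Counter of values
    domains = defaultdict(set)           # attr -> values seen in training
    for row, cls in zip(rows, labels):
        for a in attrs:
            value_counts[(a, cls)][row[a]] += 1
            domains[a].add(row[a])
    return class_counts, value_counts, domains


def predict_nb(model, row, attrs):
    """Return the class maximising log P(c) + sum over attrs of log P(x_a | c)."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_cls, best_score = None, -math.inf
    for cls in sorted(class_counts):
        score = math.log(class_counts[cls] / total)
        for a in attrs:
            # Laplace smoothing; an unseen value enlarges the domain by one.
            num = value_counts[(a, cls)][row[a]] + 1
            den = class_counts[cls] + len(domains[a] | {row[a]})
            score += math.log(num / den)
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls


def loo_accuracy(rows, labels, attrs):
    """Leave-one-out accuracy of Naive Bayes restricted to attrs."""
    hits = 0
    for i in range(len(rows)):
        model = train_nb(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:], attrs)
        hits += predict_nb(model, rows[i], attrs) == labels[i]
    return hits / len(rows)


def forward_select(rows, labels):
    """Greedy wrapper: repeatedly add the attribute that most improves accuracy."""
    selected, best = [], loo_accuracy(rows, labels, [])
    improved = True
    while improved:
        improved = False
        for a in range(len(rows[0])):
            if a in selected:
                continue
            acc = loo_accuracy(rows, labels, selected + [a])
            if acc > best:
                best, candidate, improved = acc, a, True
        if improved:
            selected.append(candidate)
    return selected, best


# Hypothetical toy data: attribute 1 duplicates attribute 0 (a strong dependence).
rows = [
    ("sunny", "sunny", "high"), ("sunny", "sunny", "low"),
    ("rainy", "rainy", "high"), ("rainy", "rainy", "low"),
    ("sunny", "sunny", "high"), ("rainy", "rainy", "low"),
    ("sunny", "sunny", "low"), ("rainy", "rainy", "high"),
]
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]

selected, best = forward_select(rows, labels)
print("selected attribute indices:", selected, "leave-one-out accuracy:", best)
```

Because the wrapper scores each candidate subset by retraining the classifier, its cost grows with the number of subsets visited, which is why search time is one of the quantities the abstract compares.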

References

  1. Rish, I.: An empirical study of the naive bayes classifier. In: International Joint Conference on Artificial Intelligence, American Association for Artificial Intelligence, pp. 41–46 (2001)

  2. Fisher, R.: The use of multiple measurements in taxonomic problems. Annals of Eugenics 7, 179–188 (1936)

  3. Bayes, T.: An essay towards solving a problem in the doctrine of chances. Philosophical Transactions 53, 370–418 (1763)

  4. Kononenko, I.: Semi-naive bayesian classifier. In: EWSL 1991. Proceedings of the European working session on learning on Machine learning, pp. 206–219. Springer, New York (1991)

  5. Zhang, H., Ling, C.X., Zhao, Z.: The learnability of naive bayes. In: Hamilton, H.J. (ed.) AI 2000. LNCS (LNAI), vol. 1822, pp. 432–441. Springer, Heidelberg (2000)

  6. Kononenko, I.: Inductive and bayesian learning in medical diagnosis. Applied Artificial Intelligence 7(4), 317–337 (1993)

  7. Lewis, D.D.: Representation and learning in information retrieval. PhD thesis, Amherst, MA, USA (1992)

  8. Lewis, D.D.: Naive (Bayes) at forty: The independence assumption in information retrieval. In: Nédellec, C., Rouveirol, C. (eds.) ECML 1998. LNCS, vol. 1398, pp. 4–15. Springer, Heidelberg (1998)

  9. Mitchell, T.: Machine Learning, 1st edn. McGraw Hill, New York (1997)

  10. Michie, D., Spiegelhalter, D.J., Taylor, C.C.: Machine Learning, Neural and Statistical Classification. Ellis Horwood (1994)

  11. Kononenko, I.: Comparison of inductive and naive bayesian learning approaches to automatic knowledge acquisition

  12. Langley, P., Iba, W., Thompson, K.: An analysis of bayesian classifiers. In: National Conference on Artificial Intelligence, pp. 223–228 (1992)

  13. Zhang, H., Su, J.: Naive bayesian classifiers for ranking. In: Boulicaut, J.-F., Esposito, F., Giannotti, F., Pedreschi, D. (eds.) ECML 2004. LNCS (LNAI), vol. 3201, pp. 501–512. Springer, Heidelberg (2004)

  14. Cortizo, J.C., Giráldez, J.I.: Discovering data dependencies in web content mining. In: Gutierrez, J.M., Martinez, J.J., Isaias, P. (eds.) IADIS International Conference WWW/Internet (2004)

  15. Cortizo, J.C., Giráldez, J.I.: Multi criteria wrapper improvements to naive bayes learning. In: Corchado, E., Yin, H., Botti, V., Fyfe, C. (eds.) IDEAL 2006. LNCS, vol. 4224, pp. 419–427. Springer, Heidelberg (2006)

  16. Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian network classifiers. Machine Learning 29(2-3), 131–163 (1997)

  17. Kohavi, R.: Scaling up the accuracy of Naive-Bayes classifiers: a decision-tree hybrid. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pp. 202–207 (1996)

  18. Pazzani, M.: Constructive induction of cartesian product attributes. ISIS: Information Statistics and Induction in Science (1996)

  19. Domingos, P., Pazzani, M.J.: On the optimality of the simple bayesian classifier under zero-one loss. Machine Learning 29(2-3), 103–130 (1997)

  20. Domingos, P., Pazzani, M.J.: Beyond independence: Conditions for the optimality of the simple bayesian classifier. In: International Conference on Machine Learning, pp. 105–112 (1996)

  21. Hand, D.J., Yu, K.: Idiot’s bayes - not so stupid after all? International Statistical Review 69(3), 299–385 (2001)

  22. Bellman, R.: Adaptive Control Processes: a Guided Tour. Princeton University Press, Princeton (1961)

  23. Duch, W.: Filter Methods. In: Feature Extraction, Foundations and Applications, Springer, Heidelberg (2004)

  24. Langley, P.: Selection of relevant features in machine learning. In: Proceedings of the AAAI Fall Symposium on Relevance (1994)

  25. Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artificial Intelligence 97(1-2), 273–324 (1997)

  26. Langley, P., Sage, S.: Induction of selective bayesian classifiers, pp. 399–406 (1994)

  27. Pazzani, M.J.: Searching for Dependencies in Bayesian Classifiers. In: 5th Workshop on Artificial Intelligence and Statistics (1996)

  28. Kittler, J.: Feature Selection and Extraction. In: Handbook of Pattern Recognition and Image Processing, Academic Press, London (1986)

  29. Newman, D., Hettich, S., Blake, C., Merz, C.: UCI repository of machine learning databases (1998)

  30. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)

  31. Hall, M.A.: Correlation-based Feature Selection for Machine Learning. PhD thesis, Department of Computer Science, University of Waikato (1998)

Editor information

Hujun Yin, Peter Tino, Emilio Corchado, Will Byrne, Xin Yao

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cortizo, J.C., Giraldez, I., Gaya, M.C. (2007). Wrapping the Naive Bayes Classifier to Relax the Effect of Dependences. In: Yin, H., Tino, P., Corchado, E., Byrne, W., Yao, X. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2007. IDEAL 2007. Lecture Notes in Computer Science, vol 4881. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77226-2_24

  • DOI: https://doi.org/10.1007/978-3-540-77226-2_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77225-5

  • Online ISBN: 978-3-540-77226-2

  • eBook Packages: Computer Science, Computer Science (R0)
