
Improving performance of Naive Bayes classifier by including hidden variables

  • Conference paper
Methodology and Tools in Knowledge-Based Systems (IEA/AIE 1998)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 1415))

Abstract

A basic task in many applications is classification of observed instances into a predetermined number of categories or classes. Popular classifiers for probabilistic classification are Bayesian network classifiers, and particularly the restricted form known as the Naive Bayes classifier. Naive Bayes performs well in many domains but suffers from the limitation that its classification performance cannot improve significantly with an increasing sample size: its expressive power is inadequate to capture higher-order relationships in the data. This paper presents a method for improving the predictive performance of the Naive Bayes classifier by augmenting its structure with additional variables learned from the training data. The resulting classifier retains the advantages of simplicity and efficiency and achieves better predictive performance than Naive Bayes. The approach proposed here can be extended to more general Bayesian network classifiers.
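As context for the augmentation the paper proposes, the Naive Bayes baseline it improves on can be sketched as follows. This is an illustrative Python implementation under the usual assumptions (discrete features, conditional independence given the class, add-one smoothing), not the authors' code; all names are my own.

```python
from collections import defaultdict
import math

class NaiveBayes:
    """Minimal categorical Naive Bayes classifier.

    Models P(c) * prod_j P(x_j | c) with Laplace (add-one) smoothing.
    This is the baseline whose structure the paper augments with
    hidden variables learned from the training data.
    """

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.class_counts = {c: 0 for c in self.classes}
        # counts[c][j][v] = number of class-c examples with feature j == v
        self.counts = {c: defaultdict(lambda: defaultdict(int))
                       for c in self.classes}
        self.values = defaultdict(set)  # observed value set per feature
        for xi, c in zip(X, y):
            self.class_counts[c] += 1
            for j, v in enumerate(xi):
                self.counts[c][j][v] += 1
                self.values[j].add(v)
        self.n = len(y)
        return self

    def log_posterior(self, x, c):
        # log P(c) + sum_j log P(x_j | c), with add-one smoothing
        lp = math.log(self.class_counts[c] / self.n)
        for j, v in enumerate(x):
            num = self.counts[c][j][v] + 1
            den = self.class_counts[c] + len(self.values[j])
            lp += math.log(num / den)
        return lp

    def predict(self, x):
        return max(self.classes, key=lambda c: self.log_posterior(x, c))
```

Because each feature contributes an independent factor per class, the model cannot represent interactions between features; the paper's hidden variables are introduced precisely to capture such higher-order structure while keeping inference this cheap.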




Editor information

José Mira, Angel Pasqual del Pobil, Moonis Ali


Copyright information

© 1998 Springer-Verlag

About this paper

Cite this paper

Stewart, B. (1998). Improving performance of naive bayes classifier by including hidden variables. In: Mira, J., del Pobil, A.P., Ali, M. (eds) Methodology and Tools in Knowledge-Based Systems. IEA/AIE 1998. Lecture Notes in Computer Science, vol 1415. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64582-9_757

  • DOI: https://doi.org/10.1007/3-540-64582-9_757

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64582-5

  • Online ISBN: 978-3-540-69348-2

  • eBook Packages: Springer Book Archive
