
Alternative prior assumptions for improving the performance of naïve Bayesian classifiers

Published in: Data Mining and Knowledge Discovery

Abstract

The prior distribution of an attribute in a naïve Bayesian classifier is typically assumed to be a Dirichlet distribution; this is called the Dirichlet assumption. The variables in a Dirichlet random vector can never be positively correlated and must have the same confidence level as measured by normalized variance. The generalized Dirichlet and the Liouville distributions, both also defined on the unit simplex, include the Dirichlet distribution as a special case. These two multivariate distributions are employed to investigate the impact of the Dirichlet assumption in naïve Bayesian classifiers, and we propose methods to construct appropriate generalized Dirichlet and Liouville priors. Our experimental results on 18 data sets reveal that the generalized Dirichlet distribution achieves the best performance among the three distribution families. These results suggest not only that the Dirichlet assumption is inappropriate, but also that forcing all variables in a prior to be positively correlated can degrade the performance of the naïve Bayesian classifier.
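The Dirichlet property noted above (components can never be positively correlated) is easy to check empirically. The sketch below, with illustrative parameters not taken from the paper, samples a Dirichlet random vector and confirms that every pairwise correlation is negative:

```python
# Minimal sketch: components of a Dirichlet random vector are
# never positively correlated (arbitrary illustrative parameters).
import numpy as np

rng = np.random.default_rng(0)
alpha = [2.0, 3.0, 5.0]                      # concentration parameters
samples = rng.dirichlet(alpha, size=100_000)  # shape (100000, 3)

corr = np.corrcoef(samples, rowvar=False)     # 3x3 correlation matrix
off_diag = corr[~np.eye(3, dtype=bool)]       # the pairwise correlations
print(off_diag.max() < 0)  # True
```

Because a Dirichlet vector must sum to one, an increase in one component forces the others down, which is what induces the uniformly negative correlations; the generalized Dirichlet and Liouville families studied in the paper relax this constraint on the prior's correlation structure.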



Author information

Correspondence to Tzu-Tsung Wong.

Additional information

Responsible editor: Charles Elkan.

Cite this article

Wong, TT. Alternative prior assumptions for improving the performance of naïve Bayesian classifiers. Data Min Knowl Disc 18, 183–213 (2009). https://doi.org/10.1007/s10618-008-0101-6
