Abstract
The tree-augmented naive Bayes (TAN) classifier offers a good tradeoff between model complexity and learnability in practice. Because complete datasets are rare in the real world, this paper studies how to learn TAN efficiently from incomplete data. We first present an efficient method that estimates conditional mutual information directly from incomplete data. We then extend the basic TAN learning algorithm to incomplete data using this estimation method. Finally, we carry out experiments to evaluate the extended TAN and compare it with basic TAN. The experimental results show that the accuracy of the extended TAN is much higher than that of basic TAN on most of the incomplete datasets. Although the extended TAN takes more time than basic TAN, its running time remains acceptable. Our conditional mutual information estimation method can also be easily combined with other techniques to improve TAN further.
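To make the quantity in question concrete, the following is a minimal sketch of estimating the conditional mutual information I(X_i; X_j | C) from data with missing values. The abstract does not describe the paper's estimation method, so this sketch is not that method: it assumes a simple available-case estimate that counts only the rows in which all three variables are observed. The function name, the row layout, and the use of `None` as the missing-value marker are illustrative assumptions.

```python
from collections import Counter
from math import log

MISSING = None  # assumed marker for a missing value


def conditional_mutual_information(rows, i, j, c):
    """Available-case estimate of I(X_i; X_j | C), in nats.

    Assumption: rows with a missing value in column i, j, or c are
    skipped for this triple only; this is not the paper's estimator.
    """
    observed = [
        (r[i], r[j], r[c])
        for r in rows
        if r[i] is not MISSING and r[j] is not MISSING and r[c] is not MISSING
    ]
    n = len(observed)
    if n == 0:
        return 0.0

    n_xyc = Counter(observed)
    n_xc = Counter((x, cls) for x, _, cls in observed)
    n_yc = Counter((y, cls) for _, y, cls in observed)
    n_c = Counter(cls for _, _, cls in observed)

    cmi = 0.0
    for (x, y, cls), count in n_xyc.items():
        # P(x,y|c) / (P(x|c) P(y|c)) expressed with raw counts
        ratio = count * n_c[cls] / (n_xc[(x, cls)] * n_yc[(y, cls)])
        cmi += (count / n) * log(ratio)
    return cmi


# Columns: attribute 0, attribute 1, class label. The last row has a
# missing attribute and is ignored when scoring the pair (0, 1).
rows = [
    (0, 0, "a"), (1, 1, "a"),
    (0, 0, "b"), (1, 1, "b"),
    (None, 1, "a"),
]
score = conditional_mutual_information(rows, 0, 1, 2)
```

In basic TAN learning, scores of this kind are computed for every attribute pair and used as edge weights for a maximum-weight spanning tree over the attributes. Under the assumption above, each pair simply uses whichever rows are observed for that pair.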
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tian, F., Wang, Z., Yu, J., Huang, H. (2005). Learning TAN from Incomplete Data. In: Huang, DS., Zhang, XP., Huang, GB. (eds) Advances in Intelligent Computing. ICIC 2005. Lecture Notes in Computer Science, vol 3644. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11538059_52
DOI: https://doi.org/10.1007/11538059_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28226-6
Online ISBN: 978-3-540-31902-3
eBook Packages: Computer Science (R0)