Abstract
The popular Naïve Bayes (NB) algorithm is simple and fast. We present a new learning algorithm, Extended Bayes (EB), which is based on Naïve Bayes. EB is still relatively simple, and achieves accuracy equal to or higher than that of NB on a wide variety of the UC-Irvine datasets. EB is based on two interacting ideas. The first is to find sets of seemingly dependent attributes and to add them as new attributes. The second is to exploit “zeroes”, that is, the negative evidence provided by attribute values that do not occur at all in particular classes in the training data. Naïve Bayes handles zeroes by smoothing. In contrast, EB uses them as evidence that a potential class labeling may be wrong.
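The contrast between smoothing and counting zeroes can be illustrated with a minimal Python sketch. It is not the EB algorithm itself, since the abstract does not specify how EB selects attribute sets or weighs zeroes. The function names (`train`, `nb_log_scores`, `zero_counts`, `add_joint_attribute`), the Laplace parameter `alpha`, and the toy data are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict

def train(rows, labels):
    """Count class frequencies and per-class (attribute index, value) frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # class -> Counter over (attr index, value)
    domains = defaultdict(set)           # attr index -> values seen in training
    for row, label in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[label][(i, v)] += 1
            domains[i].add(v)
    return class_counts, value_counts, domains

def nb_log_scores(model, row, alpha=1.0):
    """Standard NB log scores; Laplace smoothing (alpha) hides zero counts."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        s = math.log(n_c / total)
        for i, v in enumerate(row):
            s += math.log((value_counts[c][(i, v)] + alpha)
                          / (n_c + alpha * len(domains[i])))
        scores[c] = s
    return scores

def zero_counts(model, row):
    """For each class, count attribute values of `row` never seen with that class."""
    class_counts, value_counts, _ = model
    return {c: sum(1 for i, v in enumerate(row) if value_counts[c][(i, v)] == 0)
            for c in class_counts}

def add_joint_attribute(rows, i, j):
    """Append the pair (row[i], row[j]) as one new attribute (hypothetical helper)."""
    return [tuple(row) + ((row[i], row[j]),) for row in rows]

# Toy data, invented for this sketch.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
model = train(rows, labels)
query = ("sunny", "hot")
scores = nb_log_scores(model, query)
zeros = zero_counts(model, query)
```

Here smoothing still assigns the class `yes` a positive probability for the query, whereas `zero_counts` records that both of the query's attribute values never occurred with `yes` in training. A classifier in the spirit of the abstract could treat such a count as evidence against that labeling; how EB actually does so is described in the paper, not in this sketch.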
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rosell, B., Hellerstein, L. (2004). Naïve Bayes with Higher Order Attributes. In: Tawfik, A.Y., Goodwin, S.D. (eds) Advances in Artificial Intelligence. Canadian AI 2004. Lecture Notes in Computer Science, vol 3060. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24840-8_8
DOI: https://doi.org/10.1007/978-3-540-24840-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22004-6
Online ISBN: 978-3-540-24840-8
eBook Packages: Springer Book Archive