A Combined Classification Algorithm Based on C4.5 and NB

Jiang, Liangxiao; Li, Chaoqun; Wu, Jia; Zhu, Jian

doi:10.1007/978-3-540-92137-0_39

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5370)

Abstract

When the learning task is to build an accurate classifier, C4.5 and naive Bayes (NB) are two very important algorithms because of their simplicity and high performance. In this paper, we present a combined classification algorithm based on C4.5 and NB, called C4.5-NB. In C4.5-NB, the class probability estimates of C4.5 and NB are weighted according to their classification accuracy on the training data. We tested C4.5-NB experimentally in the Weka system on all 36 UCI data sets selected by Weka, and compared it with C4.5 and NB. The experimental results show that C4.5-NB significantly outperforms C4.5 and NB in terms of classification accuracy. We also evaluate the ranking performance of C4.5-NB in terms of AUC (the area under the Receiver Operating Characteristic curve), where C4.5-NB again significantly outperforms C4.5 and NB.
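The abstract states only that the two models' class probability estimates are weighted by their training accuracy; it does not give the exact formula. The following minimal Python sketch assumes one plausible reading: a weighted average normalized by the sum of the two accuracies. The function names (`combine_estimates`, `predict`) and the example numbers are hypothetical and chosen for illustration, not taken from the paper.

```python
from typing import Dict


def combine_estimates(
    p_tree: Dict[str, float],
    p_nb: Dict[str, float],
    acc_tree: float,
    acc_nb: float,
) -> Dict[str, float]:
    """Accuracy-weighted average of two class probability estimates.

    Assumed form (not confirmed by the abstract):
        P(c|x) = (acc_tree * P_tree(c|x) + acc_nb * P_nb(c|x)) / (acc_tree + acc_nb)
    """
    total = acc_tree + acc_nb
    if total <= 0:
        raise ValueError("at least one model must have positive accuracy")
    classes = set(p_tree) | set(p_nb)
    return {
        c: (acc_tree * p_tree.get(c, 0.0) + acc_nb * p_nb.get(c, 0.0)) / total
        for c in classes
    }


def predict(combined: Dict[str, float]) -> str:
    """Return the class with the highest combined probability."""
    return max(combined, key=combined.get)


# Hypothetical estimates for one test instance, and hypothetical
# training accuracies for the decision tree and naive Bayes models.
p_tree = {"yes": 0.7, "no": 0.3}
p_nb = {"yes": 0.4, "no": 0.6}
combined = combine_estimates(p_tree, p_nb, acc_tree=0.9, acc_nb=0.8)
label = predict(combined)
```

Because both inputs are probability distributions and the weights are normalized, the combined estimate also sums to one, so it can be used directly for ranking (for example, when computing AUC) as well as for classification.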

Author information

Authors and Affiliations

Faculty of Computer Science, China University of Geosciences, Wuhan, Hubei, P.R. China, 430074
Liangxiao Jiang, Jia Wu & Jian Zhu
Faculty of Mathematics, China University of Geosciences, Wuhan, Hubei, P.R. China, 430074
Chaoqun Li

Editor information

Editors and Affiliations

Computation Center, Wuhan University, 430072, Wuhan, China
Lishan Kang
Faculty of Computer Science, China University of Geosciences, 430074, Wuhan, Hubei, P.R. China
Zhihua Cai
School of Computer Science, China University of Geosciences, Wu-Han, 430074, China; Research Center for Space Science and Technology, China University of Geosciences, 430074, Wu-Han, China
Xuesong Yan
The University of Aizu, Tsuruga, Ikki-machi, 965-8580, Aizu-Wakamatsu City Fukushima, Japan
Yong Liu

Cite this paper

Jiang, L., Li, C., Wu, J., Zhu, J. (2008). A Combined Classification Algorithm Based on C4.5 and NB. In: Kang, L., Cai, Z., Yan, X., Liu, Y. (eds) Advances in Computation and Intelligence. ISICA 2008. Lecture Notes in Computer Science, vol 5370. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92137-0_39

DOI: https://doi.org/10.1007/978-3-540-92137-0_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92136-3
Online ISBN: 978-3-540-92137-0
eBook Packages: Computer Science, Computer Science (R0)
