An Empirical Study on Several Classification Algorithms and Their Improvements

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 5821)

Abstract

Classification algorithms, an important technique in data mining and machine learning, have been widely studied and applied. Classifiers can be built with many methods, such as decision trees, Bayesian methods, instance-based learning, artificial neural networks, and support vector machines. This paper focuses on classification methods based on decision tree learning, Bayesian learning, and instance-based learning. Within each family, many improvements have been proposed to raise the classification accuracy of the basic algorithm. The paper also empirically studies and compares the classification accuracy of these methods on the 36 UCI data sets selected by Weka, which were gathered from various sources. The experimental results confirm the effectiveness of these improved algorithms.
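The comparison the abstract describes can be sketched in a few lines. The following is a minimal illustration, not the paper's actual Weka setup: scikit-learn and its bundled iris data stand in for Weka and the 36 UCI data sets, and each classifier is evaluated by 10-fold cross-validated accuracy, the usual protocol for this kind of comparison.

```python
# Compare the three basic classifier families the paper studies:
# decision tree, naive Bayes, and instance-based (k-nearest-neighbor)
# learning, using 10-fold cross-validated accuracy on a small data set.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

classifiers = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "naive Bayes": GaussianNB(),
    "k-nearest neighbors (k=5)": KNeighborsClassifier(n_neighbors=5),
}

results = {}
for name, clf in classifiers.items():
    # Mean accuracy over 10 cross-validation folds.
    scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
    results[name] = scores.mean()
    print(f"{name}: {results[name]:.3f}")
```

The improved variants surveyed in the paper (e.g., random forests over a single tree, or distance-weighted over plain k-NN) would slot into the same loop, which is what makes this style of empirical comparison straightforward to extend.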




Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, J., Gao, Z., Hu, C. (2009). An Empirical Study on Several Classification Algorithms and Their Improvements. In: Cai, Z., Li, Z., Kang, Z., Liu, Y. (eds) Advances in Computation and Intelligence. ISICA 2009. Lecture Notes in Computer Science, vol 5821. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04843-2_30


  • DOI: https://doi.org/10.1007/978-3-642-04843-2_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04842-5

  • Online ISBN: 978-3-642-04843-2

  • eBook Packages: Computer Science (R0)
