Diversity in Combinations of Heterogeneous Classifiers

Hsu, Kuo-Wei; Srivastava, Jaideep

doi:10.1007/978-3-642-01307-2_97

Diversity in Combinations of Heterogeneous Classifiers

Kuo-Wei Hsu²³ &
Jaideep Srivastava²³

Conference paper

2420 Accesses
12 Citations

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 5476))

Abstract

In this paper, we introduce the use of combinations of heterogeneous classifiers to achieve better diversity. Conducting theoretical and empirical analyses of the diversity of combinations of heterogeneous classifiers, we study the relationship between heterogeneity and diversity. On the one hand, the theoretical analysis serves as a foundation for employing heterogeneous classifiers in Multi-Classifier Systems or ensembles. On the other hand, experimental results provide empirical evidence. We consider synthetic as well as real data sets, utilize classification algorithms that are essentially different, and employ various popular diversity measures for evaluation. Two interesting observations will contribute to the future design of Multi-Classifier Systems and ensemble techniques. First, the diversity among heterogeneous classifiers is higher than that among homogeneous ones, and hence using heterogeneous classifiers to construct classifier combinations would increase the diversity. Second, the heterogeneity primarily results from different classification algorithms rather than the same algorithm with different parameters.

This is a preview of subscription content, log in via an institution.

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 129.00; Price excludes VAT (USA)

Softcover Book: USD 169.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Aha, D., Kibler, D.: Instance-based learning algorithms. Machine Learning 6(1), 37–66 (1991)
MATH Google Scholar
Alkoot, F.M., Kittler, J.: Multiple expert system design by combined feature selection and probability level fusion. In: Proc. of the 3rd International Conference on Information Fusion, vol. 2, pp. THC5/9–THC516 (2000)
Google Scholar
Asuncion, A., Newman, D.J.: UCI Machine Learning Repository. University of California, School of Information and Computer Science, Irvine, CA (2007), http://www.ics.uci.edu/~mlearn/MLRepository.html
Bahler, D., Navarro, L.: Methods for Combining Heterogeneous Sets of Classifiers. In: The 17th National Conference on Artificial Intelligence, Workshop on New Research Problems for Machine Learning (2000)
Google Scholar
Banfield, R.E., Hall, L.O., Bowyer, K.W., Kegelmeyer, W.P.: A New Ensemble Diversity Measure Applied to Thinning Ensembles. In: International Workshop on Multiple Classifier Systems, pp. 306–316 (2003)
Google Scholar
Breiman, L.: Bagging predictors. Machine Learning 24(2), 123–140 (1996)
MATH Google Scholar
Breiman, L.: Random Forests. Machine Learning 45(1), 5–32 (2001)
Article MathSciNet MATH Google Scholar
Brown, G., Wyatt, J., Harris, R., Yao, X.: Diversity creation methods: a survey and categorization. Information Fusion 6(1), 5–20 (2005)
Article Google Scholar
Dietterich, T.G.: Ensemble Methods in Machine Learning. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 1–15. Springer, Heidelberg (2000)
Chapter Google Scholar
Freund, Y., Schapire, R.E.: Experiments with a New Boosting Algorithm. In: Proc. of the 13th International Conference on Machine Learning, pp. 148–156 (1996)
Google Scholar
Ghosh, J.: Multiclassifier Systems: Back to the Future. In: Roli, F., Kittler, J. (eds.) MCS 2002. LNCS, vol. 2364, pp. 1–15. Springer, Heidelberg (2002)
Chapter Google Scholar
Hettich, S., Bay, S.D.: The UCI KDD Archive. University of California, Department of Information and Computer Science, Irvine, CA (1999), http://kdd.ics.uci.edu
John, G.H., Langley, P.: Estimating Continuous Distributions in Bayesian Classifiers. In: The 11th Conference on Uncertainty in Artificial Intelligence, pp. 338–345 (1995)
Google Scholar
Kuncheva, L.I., Whitaker, C.J.: Ten measures of diversity in classifier ensembles: limits for two classifiers. In: A DERA/IEE Workshop on Intelligent Sensor Processing, pp. 10/1–10/10 (2001)
Google Scholar
Kuncheva, L.I., Skurichina, M., Duin, R.P.W.: An experimental study on diversity for bagging and boosting with linear classifiers. Information Fusion 3(4), 245–258 (2002)
Article Google Scholar
Kuncheva, L.I., Whitaker, C.J.: Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy. Machine Learning 51(2), 181–207 (2003)
Article MATH Google Scholar
Kuncheva, L.I.: That elusive diversity in classifier ensembles. In: Proc. of Iberian Conference on Pattern Recognition and Image Analysis, pp. 1126–1138 (2003)
Google Scholar
Opitz, D., Maclin, R.: Popular ensemble methods: An empirical study. Journal of AI Research 11, 169–198 (1999)
MATH Google Scholar
Quinlan, R.: C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo (1993)
Google Scholar
Ranawana, R.: Multi-Classifier Systems - Review and a Roadmap for Developers. International Journal of Hybrid Intelligent Systems 3(1), 35–61 (2006)
Article MATH Google Scholar
Schapire, R.E.: The boosting approach to machine learning: An overview. In: MSRI Workshop on Nonlinear Estimation and Classification (2002)
Google Scholar
Skurichina, M., Kuncheva, L., Duin, R.P.: Bagging and Boosting for the Nearest Mean Classifier: Effects of Sample Size on Diversity and Accuracy. In: Roli, F., Kittler, J. (eds.) MCS 2002. LNCS, vol. 2364, pp. 62–71. Springer, Heidelberg (2002)
Chapter Google Scholar
Valentini, G., Masulli, F.: Ensembles of Learning Machines. In: Marinaro, M., Tagliaferri, R. (eds.) WIRN 2002. LNCS, vol. 2486, pp. 3–22. Springer, Heidelberg (2002)
Chapter Google Scholar
Witten, I.H., Frank, E.: Data Mining: Practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)
MATH Google Scholar
Wu, X., Kumar, V., Quinlan, J.R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G.J., Ng, A., Liu, B., Yu, P.S., Zhou, Z.-H., Steinbach, M., Hand, D.J., Steinberg, D.: Top 10 Algorithms in Data Mining. Knowledge and Information Systems 14(1), 1–37 (2008)
Article Google Scholar

Download references

Author information

Authors and Affiliations

University of Minnesota, Minneapolis, MN, USA
Kuo-Wei Hsu & Jaideep Srivastava

Authors

Kuo-Wei Hsu
View author publications
You can also search for this author in PubMed Google Scholar
Jaideep Srivastava
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Sirindhorn International Institute of Technology, Thammasat University, 131 Moo 5 Tiwanont Road, 12000, Bangkadi, Muang, Pathumthani, Thailand
Thanaruk Theeramunkong
Dept. of Computer Engineering, Faculty of Engineering, Chulalongkorn University, 10330, Bangkok, Thailand
Boonserm Kijsirikul
Faculty of Science & Engineering, York University, 355 Lumbers Building, 4700 Keele Street, M3J 1P3, Toronto, Ontario, Canada
Nick Cercone
School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, 923-1292, Ishikawa, Japan
Tu-Bao Ho

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Hsu, KW., Srivastava, J. (2009). Diversity in Combinations of Heterogeneous Classifiers. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, TB. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2009. Lecture Notes in Computer Science(), vol 5476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01307-2_97

Download citation

DOI: https://doi.org/10.1007/978-3-642-01307-2_97
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01306-5
Online ISBN: 978-3-642-01307-2
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics