Abstract
A system of nested dichotomies (NDs) is a method of decomposing a multiclass problem into a collection of binary problems. Such a system recursively applies binary splits to divide the set of classes into two subsets, and trains a binary classifier for each split. Many methods have been proposed to perform these splits, each with its own advantages and disadvantages. In this paper, we present a simple, general method for improving the predictive performance of NDs produced by any subset selection technique that employs randomness to construct the subsets. We provide a theoretical expectation for the performance improvement, as well as empirical results showing that our method improves the root mean squared error of NDs, regardless of whether they are used as individual models or in an ensemble setting.
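To make the construction concrete, the sketch below shows one way such a tree could be grown with multiple subset evaluation: at each internal node, several random binary class splits are sampled, a binary model is fitted for each, and the split whose model attains the lowest Brier score on the training data is kept. This is an illustrative sketch, not the authors' implementation; the logistic-regression base learner, the Brier-score selection criterion, and the names `build_nd`, `random_split`, and `n_candidates` are assumptions introduced here for illustration.

```python
# A minimal sketch, assuming numpy arrays X (features) and y (labels), a
# logistic-regression base learner, and the Brier score as the split-selection
# criterion; `n_candidates`, `random_split`, and `build_nd` are illustrative
# names, not part of the paper.
import random

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss


def random_split(classes, rng):
    """Randomly partition a set of class labels into two non-empty subsets."""
    classes = list(classes)
    rng.shuffle(classes)
    cut = rng.randint(1, len(classes) - 1)  # both endpoints inclusive
    return set(classes[:cut]), set(classes[cut:])


def build_nd(X, y, classes, n_candidates=3, rng=None):
    """Recursively build a nested dichotomy; at each internal node, sample
    several random class splits and keep the one whose binary model scores
    best (lowest Brier score) on the training data."""
    rng = rng or random.Random(0)
    y = np.asarray(y)
    if len(classes) == 1:
        return {"leaf": next(iter(classes))}

    best = None
    for _ in range(n_candidates):
        left, right = random_split(classes, rng)
        yb = np.isin(y, list(right)).astype(int)  # 1 if the class is in the right subset
        model = LogisticRegression(max_iter=1000).fit(X, yb)
        score = brier_score_loss(yb, model.predict_proba(X)[:, 1])
        if best is None or score < best["score"]:
            best = {"score": score, "left": left, "right": right, "model": model}

    # Recurse on the two chosen subsets with the corresponding slices of the data.
    in_left = np.isin(y, list(best["left"]))
    return {
        "model": best["model"],
        "left": build_nd(X[in_left], y[in_left], best["left"], n_candidates, rng),
        "right": build_nd(X[~in_left], y[~in_left], best["right"], n_candidates, rng),
    }
```

The root call would pass `classes=set(np.unique(y))`. At prediction time, an instance is routed down the tree by multiplying the binary models' probability estimates along each path, as in a standard nested dichotomy; an ensemble version would build several such trees (for example with bagging) and average their class-probability estimates.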
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Leathart, T., Frank, E., Pfahringer, B., Holmes, G. (2019). Ensembles of Nested Dichotomies with Multiple Subset Evaluation. In: Yang, Q., Zhou, ZH., Gong, Z., Zhang, ML., Huang, SJ. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2019. Lecture Notes in Computer Science, vol. 11439. Springer, Cham. https://doi.org/10.1007/978-3-030-16148-4_7
DOI: https://doi.org/10.1007/978-3-030-16148-4_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16147-7
Online ISBN: 978-3-030-16148-4
eBook Packages: Computer Science, Computer Science (R0)