Abstract
Binary decomposition methods transform multiclass learning problems into a series of two-class learning problems that can be solved with simpler learning algorithms. As the number of such binary learning problems often grows super-linearly with the number of classes, we need efficient methods for computing the predictions. In this article, we discuss an efficient algorithm that queries only a dynamically determined subset of the trained classifiers, but still predicts the same classes that would have been predicted if all classifiers had been queried. The algorithm is first derived for the simple case of pairwise classification, and then generalized to arbitrary pairwise decompositions of the learning problem in the form of ternary error-correcting output codes under a variety of different code designs and decoding strategies.
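The abstract does not spell out the algorithm, so the following is only a minimal sketch of the idea for the pairwise case, assuming simple voting as the decoding rule. The names `full_voting`, `quick_voting`, and `classify` are hypothetical, and `classify(i, j)` stands in for a trained binary classifier that returns whichever of the two classes it prefers. The sketch tracks how many comparisons each class has lost so far and always queries a classifier involving the currently least-losing class. It stops as soon as that class has been compared against every other class, since at that point no other class can finish with fewer losses.

```python
import itertools


def full_voting(n_classes, classify):
    """Query every pairwise classifier and return the vote count per class."""
    votes = [0] * n_classes
    for i, j in itertools.combinations(range(n_classes), 2):
        votes[classify(i, j)] += 1
    return votes


def quick_voting(n_classes, classify):
    """Return (predicted class, number of classifiers actually queried).

    Assumption: classify(i, j) returns i or j and gives the same answer
    regardless of argument order.
    """
    loss = [0] * n_classes                       # comparisons lost so far
    played = [set() for _ in range(n_classes)]   # opponents already compared
    queried = 0
    while True:
        # Candidate: the class with the fewest losses so far.
        a = min(range(n_classes), key=lambda c: loss[c])
        # Opponents the candidate has not yet been compared with.
        rest = [c for c in range(n_classes) if c != a and c not in played[a]]
        if not rest:
            # loss[a] is final, and every other class already has at least
            # as many losses, so a has the maximum number of votes.
            return a, queried
        # Compare against the most promising remaining opponent.
        b = min(rest, key=lambda c: loss[c])
        winner = classify(a, b)
        loser = b if winner == a else a
        loss[loser] += 1
        played[a].add(b)
        played[b].add(a)
        queried += 1
```

Under these assumptions, the class returned by `quick_voting` always attains the maximum vote count that `full_voting` would report, although ties between top classes may be broken differently. How many queries are saved depends on the classifier outputs.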
Responsible editor: Charles Elkan.
Park, SH., Fürnkranz, J. Efficient prediction algorithms for binary decomposition techniques. Data Min Knowl Disc 24, 40–77 (2012). https://doi.org/10.1007/s10618-011-0219-9