Efficient prediction algorithms for binary decomposition techniques

Data Mining and Knowledge Discovery

Abstract

Binary decomposition methods transform multiclass learning problems into a series of two-class learning problems that can be solved with simpler learning algorithms. As the number of such binary learning problems often grows super-linearly with the number of classes, we need efficient methods for computing the predictions. In this article, we discuss an efficient algorithm that queries only a dynamically determined subset of the trained classifiers, but still predicts the same classes that would have been predicted if all classifiers had been queried. The algorithm is first derived for the simple case of pairwise classification, and then generalized to arbitrary pairwise decompositions of the learning problem in the form of ternary error-correcting output codes under a variety of different code designs and decoding strategies.
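To make the idea concrete for the pairwise case: with k classes there are k(k-1)/2 pairwise classifiers, so full voting queries all of them. The sketch below is not the article's algorithm as published; it is one way a lazy scheme of this kind can work, under assumptions introduced here for illustration. The `classify(i, j)` oracle, the function names, and the tie-breaking rule (smallest class index) are all hypothetical. The sketch keeps a running loss (votes lost) per class, always tests the class with the fewest losses against its least-beaten untested rival, and stops once that class has met every rival. At that point no other class can end with fewer losses, so the prediction matches full voting.

```python
from itertools import combinations
from typing import Callable, List, Set, Tuple

# Hypothetical interface (an assumption of this sketch): classify(i, j)
# returns the winning class, i or j, of the binary classifier trained on
# classes i and j, evaluated on one fixed test instance.
PairwiseOracle = Callable[[int, int], int]


def full_voting(k: int, classify: PairwiseOracle) -> Tuple[int, int]:
    """Query all k*(k-1)/2 classifiers; return (prediction, queries used)."""
    losses = [0] * k
    queries = 0
    for i, j in combinations(range(k), 2):
        winner = classify(i, j)
        losses[j if winner == i else i] += 1
        queries += 1
    # Fewest losses == most votes; ties go to the smallest class index.
    return min(range(k), key=lambda c: (losses[c], c)), queries


def lazy_voting(k: int, classify: PairwiseOracle) -> Tuple[int, int]:
    """Query classifiers on demand; return (prediction, queries used)."""
    losses = [0] * k
    # pending[c] holds the rivals that class c has not yet been tested against.
    pending: List[Set[int]] = [set(range(k)) - {c} for c in range(k)]
    queries = 0
    while True:
        # Current favourite: fewest losses so far, ties to the smallest index.
        best = min(range(k), key=lambda c: (losses[c], c))
        if not pending[best]:
            # The favourite's loss is final and every other loss can only
            # grow, so full voting would pick the same class.
            return best, queries
        # Test the favourite against its most promising untested rival.
        rival = min(pending[best], key=lambda c: (losses[c], c))
        winner = classify(best, rival)
        losses[rival if winner == best else best] += 1
        pending[best].discard(rival)
        pending[rival].discard(best)
        queries += 1
```

As a usage example, an oracle in which one class wins all of its comparisons lets `lazy_voting` stop well before all pairs are queried, while `full_voting` always uses every classifier. The savings depend on how consistent the classifiers are, and in the worst case the lazy variant still queries every pair.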

Author information

Corresponding author

Correspondence to Sang-Hyeun Park.

Additional information

Responsible editor: Charles Elkan.

About this article

Cite this article

Park, SH., Fürnkranz, J. Efficient prediction algorithms for binary decomposition techniques. Data Min Knowl Disc 24, 40–77 (2012). https://doi.org/10.1007/s10618-011-0219-9

  • DOI: https://doi.org/10.1007/s10618-011-0219-9
