Abstract
Scalability has become a critical issue for successful data mining applications in the "big data" era, in which extremely large data sets render traditional learning algorithms infeasible. Among the various approaches to scalable learning, sampling techniques offer one way to address the issue. This paper presents our study of applying a newly developed sampling-based boosting method to multi-class (non-binary) classification. Preliminary experimental results on benchmark data sets from the UC Irvine Machine Learning Repository confirm the efficiency and competitive prediction accuracy of the proposed adaptive boosting method on the multi-class classification task. We also present a formulation that uses a single ensemble of non-binary base classifiers with adaptive sampling for multi-class problems.
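To make the idea concrete, the following is a minimal sketch of a sampling-based multi-class boosting loop in the spirit the abstract describes: a SAMME-style multi-class AdaBoost in which each round trains a decision stump on a sample drawn according to the current boosting weights, rather than on the full weighted data set. This is an illustrative stand-in only; the paper's actual adaptive sampling rule (which sizes the sample adaptively via concentration inequalities) is not reproduced here, and all function names, the fixed `sample_size` parameter, and the toy data are assumptions of this sketch.

```python
import math
import random

def fit_stump(X, y, sample_idx):
    """Fit the best single-feature threshold stump on the sampled points.

    Each side of the threshold predicts the majority class of the sampled
    points falling on that side (unweighted, since weighting is handled by
    the sampling step itself).
    """
    best = None  # (misclassified_count, feature, threshold, left_cls, right_cls)
    for f in range(len(X[0])):
        for t in sorted({X[i][f] for i in sample_idx}):
            left = [y[i] for i in sample_idx if X[i][f] <= t]
            right = [y[i] for i in sample_idx if X[i][f] > t]
            lc = max(set(left), key=left.count) if left else 0
            rc = max(set(right), key=right.count) if right else 0
            err = sum(c != lc for c in left) + sum(c != rc for c in right)
            if best is None or err < best[0]:
                best = (err, f, t, lc, rc)
    _, f, t, lc, rc = best
    return lambda x: lc if x[f] <= t else rc

def samme_with_sampling(X, y, n_classes, rounds=15, sample_size=None, seed=0):
    """SAMME-style multi-class AdaBoost trained on per-round weighted samples."""
    rng = random.Random(seed)
    n = len(X)
    m = sample_size or n
    w = [1.0 / n] * n          # boosting weights over training points
    ensemble = []              # list of (alpha, stump)
    for _ in range(rounds):
        # Draw a weighted sample with replacement; the paper sizes this
        # sample adaptively, here it is simply fixed at m points.
        idx = rng.choices(range(n), weights=w, k=m)
        h = fit_stump(X, y, idx)
        # Weighted error of the stump on the FULL training set.
        err = sum(wi for wi, xi, yi in zip(w, X, y) if h(xi) != yi)
        err = min(max(err, 1e-10), 1 - 1e-10)
        # SAMME weight: the extra log(K - 1) term makes "better than random
        # guessing among K classes" sufficient for a positive alpha.
        alpha = math.log((1 - err) / err) + math.log(n_classes - 1)
        if alpha <= 0:
            continue  # stump no better than random guessing; skip it
        ensemble.append((alpha, h))
        w = [wi * math.exp(alpha * (h(xi) != yi))
             for wi, xi, yi in zip(w, X, y)]
        s = sum(w)
        w = [wi / s for wi in w]

    def predict(x):
        votes = [0.0] * n_classes
        for alpha, h in ensemble:
            votes[h(x)] += alpha
        return votes.index(max(votes))
    return predict
```

On a small 3-class data set, e.g. `samme_with_sampling(X, y, n_classes=3)`, the returned `predict` combines the stumps by weighted vote; note that a single stump can only separate two of the three classes, which is exactly why a non-binary problem needs the ensemble.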
© 2014 Springer International Publishing Switzerland
Chen, J. (2014). A Scalable Boosting Learner for Multi-class Classification Using Adaptive Sampling. In: Ślȩzak, D., Schaefer, G., Vuong, S.T., Kim, YS. (eds) Active Media Technology. AMT 2014. Lecture Notes in Computer Science, vol 8610. Springer, Cham. https://doi.org/10.1007/978-3-319-09912-5_6
Print ISBN: 978-3-319-09911-8
Online ISBN: 978-3-319-09912-5