Abstract
The ImageNet dataset, with more than 14 million images and 21,000 classes, makes visual classification considerably harder. One of the most difficult tasks is training a fast and accurate visual classifier on several multi-core computers, each with limited memory. In this paper we address this challenge by extending both a state-of-the-art large scale linear classifier (LIBLINEAR-CDBLOCK) and a non-linear classifier (Power Mean SVM) for large scale visual classification tasks in the following ways: (1) an incremental learning method for Power Mean SVM, and (2) a balanced bagging algorithm for training binary classifiers. Our approach has been evaluated on the 100 largest classes of ImageNet and on ILSVRC 2010. The evaluation shows that our approach can reduce memory usage by up to 82.01% and that the learning process is much faster than the original implementation and LIBLINEAR SVM.
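
The abstract does not spell out the balanced bagging algorithm, so the following is only a rough illustration of the general idea under stated assumptions, not the authors' implementation. It assumes a one-vs-rest binary problem with labels +1 and -1, draws in each bag an equal number of examples from both classes (the size of the smaller class), and averages the resulting linear models. The base learner `train_ridge` is a hypothetical stand-in (regularized least squares rather than an SVM), chosen only to keep the sketch self-contained; all function names and parameters here are illustrative.

```python
import numpy as np

def train_ridge(X, y, lam=1e-2):
    """Stand-in base learner (NOT an SVM): regularized least squares on +/-1 labels."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def balanced_bagging(X, y, n_bags=10, seed=0):
    """Train n_bags linear models on class-balanced subsamples and average them.

    Each bag holds k positives and k negatives, where k is the size of the
    smaller class, so the larger class is undersampled in every bag.
    """
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == -1)
    k = min(len(pos), len(neg))
    weights = []
    for _ in range(n_bags):
        p = rng.choice(pos, size=k, replace=False)
        n = rng.choice(neg, size=k, replace=False)
        idx = np.concatenate([p, n])
        weights.append(train_ridge(X[idx], y[idx]))
    # Averaging linear weight vectors is equivalent to averaging decision scores.
    return np.mean(weights, axis=0)

def predict(w, X):
    """Return +1/-1 predictions from an averaged weight vector (last entry is the bias)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.where(Xb @ w >= 0, 1, -1)
```

Because every bag sees only 2k examples, each base model trains on a small, balanced subset; this is one plausible reason such a scheme could lower memory use and training time on imbalanced one-vs-rest problems, though the sketch makes no claim about the figures reported above.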







Acknowledgments
This work was funded by Region Bretagne (France) and VIED (Vietnam International Education Development).
Cite this article
Doan, TN., Do, TN. & Poulet, F. Large scale classifiers for visual classification tasks. Multimed Tools Appl 74, 1199–1224 (2015). https://doi.org/10.1007/s11042-014-2049-4
DOI: https://doi.org/10.1007/s11042-014-2049-4