Parallelizing multiclass support vector machines for scalable image annotation

Alham, Nasullah Khalid; Li, Maozhen; Liu, Yang

doi:10.1007/s00521-012-1237-2

Original Article
Published: 31 October 2012

Volume 24, pages 367–381 (2014)

Neural Computing and Applications

Nasullah Khalid Alham¹, Maozhen Li¹,² & Yang Liu³

Abstract

Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, support vector machines (SVMs) are used extensively because of their generalization properties. The SVM was originally designed for binary classification, but most classification problems arising in domains such as image annotation involve more than two classes. Moreover, SVM training is computationally intensive, especially when the training dataset is large. This paper presents a resource-aware parallel multiclass SVM algorithm, named RAMSMO, for large-scale image annotation. RAMSMO partitions the training dataset into smaller binary chunks and optimizes SVM training in parallel on a cluster of computers. A genetic algorithm-based load-balancing scheme is designed to optimize the performance of RAMSMO by balancing the computation of multiclass data chunks in heterogeneous computing environments. RAMSMO is evaluated in both experimental and simulation environments, and the results show that it significantly reduces training time while maintaining a high level of classification accuracy.
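
The two ideas named in the abstract, splitting a multiclass training set into binary chunks and using a genetic algorithm to spread those chunks over machines of unequal speed, can be illustrated with a minimal Python sketch. It is not the RAMSMO implementation. The one-vs-one pairing of classes, the cost model (chunk size divided by node speed), and every name in it (binary_chunks, makespan, ga_balance, node_speeds) are assumptions made for illustration only, since the abstract does not give these details.

```python
import random
from itertools import combinations


def binary_chunks(samples):
    """Split (features, label) pairs into one binary chunk per pair of classes.

    One-vs-one pairing is an assumption; the abstract only says "binary chunks".
    """
    by_class = {}
    for features, label in samples:
        by_class.setdefault(label, []).append(features)
    chunks = {}
    for a, b in combinations(sorted(by_class), 2):
        # Class a becomes the positive class, class b the negative class.
        chunks[(a, b)] = [(x, +1) for x in by_class[a]] + [(x, -1) for x in by_class[b]]
    return chunks


def makespan(assignment, chunk_sizes, node_speeds):
    """Finish time of the slowest node, assuming cost = chunk size / node speed."""
    load = [0.0] * len(node_speeds)
    for size, node in zip(chunk_sizes, assignment):
        load[node] += size / node_speeds[node]
    return max(load)


def ga_balance(chunk_sizes, node_speeds, generations=200, pop_size=30,
               mutation_rate=0.1, seed=0):
    """Toy genetic algorithm: an individual maps each chunk index to a node index."""
    rng = random.Random(seed)
    n_chunks, n_nodes = len(chunk_sizes), len(node_speeds)

    def fitness(assignment):
        return makespan(assignment, chunk_sizes, node_speeds)

    population = [[rng.randrange(n_nodes) for _ in range(n_chunks)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_chunks) if n_chunks > 1 else 0
            child = p1[:cut] + p2[cut:]  # single-point crossover
            for i in range(n_chunks):
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n_nodes)  # random reassignment
            children.append(child)
        population = survivors + children
    return min(population, key=fitness)


# Hypothetical toy data: three keyword classes, two nodes, the second twice as fast.
samples = [((0.1, 0.2), "sky"), ((0.9, 0.8), "grass"),
           ((0.5, 0.5), "sea"), ((0.2, 0.1), "sky")]
chunks = binary_chunks(samples)
sizes = [len(chunk) for chunk in chunks.values()]
plan = ga_balance(sizes, node_speeds=[1.0, 2.0])
```

Here plan[i] is the node chosen for the i-th binary chunk. In a real system each node would then train a binary SVM on its assigned chunks, and the cost model would have to reflect measured node capacity rather than a single speed factor.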

Author information

Authors and Affiliations

School of Engineering and Design, Brunel University, Uxbridge, UB8 3PH, UK
Nasullah Khalid Alham & Maozhen Li
The Key Laboratory of Embedded Systems and Service Computing, Ministry of Education, Tongji University, Shanghai, China
Maozhen Li
School of Electrical Engineering and Information System, Sichuan University, Chengdu, China
Yang Liu

Corresponding author

Correspondence to Maozhen Li.

About this article

Cite this article

Alham, N.K., Li, M. & Liu, Y. Parallelizing multiclass support vector machines for scalable image annotation. Neural Comput & Applic 24, 367–381 (2014). https://doi.org/10.1007/s00521-012-1237-2

Received: 23 November 2011
Accepted: 17 October 2012
Published: 31 October 2012
Issue Date: February 2014
DOI: https://doi.org/10.1007/s00521-012-1237-2
