Parallelizing multiclass support vector machines for scalable image annotation

  • Original Article
  • Neural Computing and Applications

Abstract

Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, support vector machines (SVMs) are used extensively because of their generalization properties. SVMs were originally designed for binary classification, yet most classification problems in domains such as image annotation involve more than two classes. Moreover, SVM training is computationally intensive, especially when the training dataset is large. This paper presents a resource-aware parallel multiclass SVM algorithm, RAMSMO, for large-scale image annotation. RAMSMO partitions the training dataset into smaller binary chunks and optimizes SVM training in parallel on a cluster of computers. A genetic algorithm-based load balancing scheme is designed to optimize the performance of RAMSMO by balancing the computation of multiclass data chunks in heterogeneous computing environments. RAMSMO is evaluated in both experimental and simulation environments, and the results show that it significantly reduces training time while maintaining a high level of classification accuracy.
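The partitioning idea in the abstract can be illustrated with a minimal sketch. It is not the RAMSMO implementation. It assumes a pairwise (one-vs-one) split into binary chunks, which the abstract does not specify, and it uses a local thread pool in place of a cluster of computers. The binary trainer is a nearest-centroid placeholder rather than SMO, and all function names (`make_binary_chunks`, `train_binary`, `train_parallel`, `predict`) are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def make_binary_chunks(samples, labels):
    """Split a multiclass dataset into one binary chunk per pair of classes.

    Assumption: a one-vs-one decomposition; the abstract only says
    "smaller binary chunks".
    """
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    chunks = {}
    for a, b in combinations(sorted(by_class), 2):
        xs = by_class[a] + by_class[b]
        ys = [+1] * len(by_class[a]) + [-1] * len(by_class[b])
        chunks[(a, b)] = (xs, ys)
    return chunks


def train_binary(chunk):
    """Placeholder binary trainer (NOT SMO): keeps the two class centroids."""
    xs, ys = chunk

    def centroid(sign):
        pts = [x for x, y in zip(xs, ys) if y == sign]
        return [sum(col) / len(pts) for col in zip(*pts)]

    return centroid(+1), centroid(-1)


def predict_binary(model, x):
    """Return +1 if x is at least as close to the positive centroid, else -1."""
    pos, neg = model
    d_pos = sum((xi - pi) ** 2 for xi, pi in zip(x, pos))
    d_neg = sum((xi - ni) ** 2 for xi, ni in zip(x, neg))
    return +1 if d_pos <= d_neg else -1


def train_parallel(samples, labels, workers=4):
    """Train one binary model per chunk, with the chunks processed concurrently."""
    chunks = make_binary_chunks(samples, labels)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        models = pool.map(train_binary, chunks.values())
        return dict(zip(chunks.keys(), models))


def predict(models, x):
    """Combine the binary models by majority vote over class pairs."""
    votes = defaultdict(int)
    for (a, b), model in models.items():
        votes[a if predict_binary(model, x) == +1 else b] += 1
    return max(sorted(votes), key=votes.get)


# Toy data: three keyword classes in a two-dimensional feature space.
samples = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
labels = ["sky", "sky", "grass", "grass", "sea", "sea"]
models = train_parallel(samples, labels)
prediction = predict(models, [5, 5.5])
```

Because each binary chunk is trained independently of the others, the chunks can be assigned to different workers. In a heterogeneous cluster the chunk sizes and node speeds differ, which is the situation the load balancing scheme described in the abstract is meant to address.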

Author information

Corresponding author

Correspondence to Maozhen Li.

About this article

Cite this article

Alham, N.K., Li, M. & Liu, Y. Parallelizing multiclass support vector machines for scalable image annotation. Neural Comput & Applic 24, 367–381 (2014). https://doi.org/10.1007/s00521-012-1237-2

  • DOI: https://doi.org/10.1007/s00521-012-1237-2
