Skip to main content
Log in

Parallel randomized sampling for support vector machine (SVM) and support vector regression (SVR)

  • Regular Paper
  • Published:
Knowledge and Information Systems Aims and scope Submit manuscript

Abstract

A parallel randomized support vector machine (PRSVM) and a parallel randomized support vector regression (PRSVR) algorithm based on a randomized sampling technique are proposed in this paper. The proposed PRSVM and PRSVR have four major advantages over previous methods. (1) We prove that the proposed algorithms achieve an average convergence rate that is so far the fastest bounded convergence rate, among all SVM decomposition training algorithms to the best of our knowledge. The fast average convergence bound is achieved by a unique priority based sampling mechanism. (2) Unlike previous work (Provably fast training algorithm for support vector machines, 2001) the proposed algorithms work for general linear-nonseparable SVM and general non-linear SVR problems. This improvement is achieved by modeling new LP-type problems based on Karush–Kuhn–Tucker optimality conditions. (3) The proposed algorithms are the first parallel version of randomized sampling algorithms for SVM and SVR. Both the analytical convergence bound and the numerical results in a real application show that the proposed algorithm has good scalability. (4) We present demonstrations of the algorithms based on both synthetic data and data obtained from a real word application. Performance comparisons with SVMlight show that the proposed algorithms may be efficiently implemented.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Similar content being viewed by others

References

  1. Adler I, Shamir R (1993) Arandomized scheme for speeding up algorithms for linear and convex programming with high constraints-to-variable ratio. Math Program 61(1–3): 39–52

    Article  MathSciNet  Google Scholar 

  2. Balcazar J, Dai Y, Tanaka J, Watanabe O (2001) Provably fast training algorithm for support vector machines. In: The 1st IEEE International Conference on Data Mining (ICDM01)

  3. Balcazar J, Dai Y, Tanaka J, Watanabe O (2002) Provably fast training algorithm for support vector machines. Technical Reports on Mathematical and Computing Sciences TR-C160, Tokyo Institute of Technology, Tokyo

  4. Bastiere A (1998) Methods for multisensor classification of aiborne targets integrating evidence theory. Aerospace Sci Technol 2: 401–411

    Article  MATH  Google Scholar 

  5. Blackard JA, Dean DJ (1999) Comparative accuracies of artificial neural networks and discriminant analysis in predicting forest cover type from cartographic variables. Comput Electronics Agric 24: 131–151

    Article  Google Scholar 

  6. Blake CL, Merz C (1998) UCI repository of machine learning databases http://www.ics.uci.edu/~mlearn/MLRepository.html

  7. Cesa-Bianchi N (2004) Applications of regularized least squares to classification problems. In: Lecture Notes in Artificial Intelligence. Springer, Berlin, pp 14–18

  8. Chellappa R, Zheng Q, Burlina P, Shekhar C, Eom KB (1997) On the positioning of multisensor imagery for exploitation and target recognition. In: Proceedings of the IEEE, vol 85, pp 120–138

  9. Clarkson KL (1988) Las vegas algorithms for linear and integer programming when the dimension is small. In: The 29th IEEE symposium on foundations of computer science (FOCS’88)

  10. Collobert R, Bengio S (2001) Svmtorch: Support vector machines for large-scale regress problems. J Mach Lear Res 1: 276–285

    MathSciNet  Google Scholar 

  11. Collobert R, Bengio S, Bengio Y (2001) A parallel mixture of svms for very large scale problems. In:Neural Information Processing Systems, pp 633–640

  12. Cortes C, Vapnik V (1995) Support-vector networks. Machine Learning 20: 273–297

    MATH  Google Scholar 

  13. Duda RO, Hart PE, Stork DG (2000) Pattern Classification. Wiley

  14. Flake GW, Lawrence S (2002) Efficient svm regression training with smo. Mach Learn 46(1–3): 271–290

    Article  MATH  Google Scholar 

  15. Freund Y, Schapire RE (1995) A decision-theoretic generalization of on-line learning and application to boosting. J Comput Syst Sci 55(1): 119–139

    Article  MathSciNet  Google Scholar 

  16. Friedman JH (2002) Stochastic gradient boosting. Comput Stat Data Anal 38(4): 367–378

    Article  MATH  Google Scholar 

  17. Gartner B, Welzl E (2000) A simple sampling lemma: Analysis and applications in geometric optimization. In: Proceeding of the 16th annual ACM symposium on computational geometry (SCG)

  18. Graf HP, Cosatto E, Bottou L, Dourdanovic I, Vapnik V (2005) Parallel support vector machine: The cascade svm. In: Advances in neural information processing systems

  19. Jeon B, Landgrebe DA (1999) Decision fusion approach for multitemporal classification. IEEE Trans Geosci Remote Sens 37: 1227–1233

    Article  Google Scholar 

  20. Joachims T (1998) Making large-scale svm learning practical. In: Advances in Kernel Methods - Support Vector Learning. MIT Press, Cambridge, pp 169–184

  21. Li S, Kwok JT-Y, Tsang IW-H, Wang Y (1998) Fusion images with different focuses using support vector machines. IEEE Trans Neural Netw 15: 1555–1560

    Article  Google Scholar 

  22. Mangasarian O, Musicant D (1999) Successive overrelaxation for support vectormachines. IEEE Trans Neural Netw 10(5): 1032–1037

    Article  Google Scholar 

  23. Osuna E, Freund R, Girosi F (1997) An improved training algorithm for support vectormachines. In: IEEE workshop on neural networks for signal processing. pp 276–285

  24. Osuna E, Freund R, Girosi F (1997) Support vector machine: training and applications, Artificial Intelligence Laboratory Memo No. 1602, Massachusetts Institute of Technology Massachusetts

  25. Platt JC (1998) Sequenctial minimal optimization: a fast algorithm for training support vector machines, Technical Report MSR-TR-98-14, Microsof Research

  26. Pohl C, Genderen JLV (1998) Mutisensor image fusion in remote sensing: Concepts, methods and applications. Int J Remote Sens 19: 823–854

    Article  Google Scholar 

  27. Schmidt MS (1996) Identity speaks with support vector networks. In: The 28th symposium on the interface (INTERFACE-96), Sydney

  28. Vapnik V (1995) The Nature of Statistical Learning Theory. Springer, New York

    MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yumao Lu.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Lu, Y., Roychowdhury, V. Parallel randomized sampling for support vector machine (SVM) and support vector regression (SVR). Knowl Inf Syst 14, 233–247 (2008). https://doi.org/10.1007/s10115-007-0082-6

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-007-0082-6

Keywords

Navigation