
Random projections for linear SVM ensembles

  • Original Paper
  • Published in: Applied Intelligence

Abstract

This paper presents an experimental study of different projection strategies and techniques for improving the performance of Support Vector Machine (SVM) ensembles. The study covers 62 UCI datasets and uses Principal Component Analysis (PCA) and three types of Random Projections (RP), taking into account the size of the projected space and using linear SVMs as base classifiers. Random Projections are also combined with the sparse-matrix strategy used by Rotation Forests, which is likewise a projection-based method. Experiments show that, for SVM ensembles, (i) the sparse-matrix strategy leads to the best results, (ii) results improve when the dimension of the projected space is larger than that of the original space, and (iii) Random Projections also enhance results when used in place of PCA. Finally, randomly projected SVMs are tested as base classifiers of some state-of-the-art ensembles, improving their performance.
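As a concrete illustration of the setup studied here, the sketch below builds an ensemble of linear SVMs in which each member sees the data through its own random projection, with the projected space made larger than the original one in line with finding (ii). This is a minimal sketch, not the authors' implementation: it assumes scikit-learn, and the dataset, ensemble size, projection density, and enlargement factor of 2 are illustrative choices. It uses an Achlioptas-style sparse random projection, one of the RP variants of the kind the study compares, rather than the Rotation-Forest sparse-matrix strategy.

```python
# Sketch: ensemble of linear SVMs, each trained on its own random projection.
# Assumes scikit-learn; dataset and hyperparameters are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.random_projection import SparseRandomProjection
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)  # 569 samples, 30 features, binary

# Finding (ii): project to MORE dimensions than the original space.
# scikit-learn warns that the dimensionality "will not be reduced" here;
# that is intentional, since the projection enlarges the space.
n_components = 2 * X.shape[1]

members = [
    (
        f"rp_svm_{i}",
        make_pipeline(
            StandardScaler(),
            # Achlioptas-style sparse projection (entries in {-1, 0, +1},
            # density 1/3); each member draws an independent random matrix.
            SparseRandomProjection(n_components=n_components,
                                   density=1 / 3,
                                   random_state=i),
            LinearSVC(C=1.0, dual=False, max_iter=5000),
        ),
    )
    for i in range(10)  # illustrative ensemble size
]

# Combine the members by majority vote (hard voting; LinearSVC has no
# predict_proba, so soft voting would not apply).
ensemble = VotingClassifier(estimators=members, voting="hard")

scores = cross_val_score(ensemble, X, y, cv=5)
print(f"5-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Swapping SparseRandomProjection for GaussianRandomProjection, or for PCA, reproduces the other projection configurations of the comparison; the per-member random_state is what injects diversity into the ensemble, since a linear SVM trained on the raw features is deterministic.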



Author information


Corresponding author

Correspondence to Jesús Maudes.


About this article

Cite this article

Maudes, J., Rodríguez, J.J., García-Osorio, C. et al. Random projections for linear SVM ensembles. Appl Intell 34, 347–359 (2011). https://doi.org/10.1007/s10489-011-0283-2

