DOI: 10.1145/2661829.2661994

Solving Linear SVMs with Multiple 1D Projections

Published: 03 November 2014

Abstract

We present a new methodology for solving linear Support Vector Machines (SVMs) that capitalizes on multiple 1D projections. We show that the approach approximates the optimal solution with high accuracy and comes with analytical guarantees. Our solution draws on techniques from random projections, exponential search, and coordinate descent. In our experimental evaluation, we compare our approach with the popular LIBLINEAR SVM library and demonstrate a significant speedup on various benchmarks. At the same time, the new methodology achieves a comparable or better approximation of the optimal solution and exhibits smooth convergence. Our results are accompanied by bounds on time complexity and accuracy.
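The abstract describes the method only at a high level. As a rough illustration of the general idea, here is a minimal, hypothetical Python sketch of an SVM solver driven by multiple 1D projections: draw random unit directions, line-search the regularized hinge-loss primal along each, and keep the best step. Everything here (the function name, the parameters, and the geometric step grid standing in for the paper's exponential search) is an assumption for illustration, not the authors' algorithm.

```python
import numpy as np

def svm_1d_projections(X, y, C=1.0, n_dirs=20, n_iters=30, seed=0):
    """Hypothetical sketch: minimize the L2-regularized hinge-loss
    primal by repeatedly projecting the iterate onto random 1D
    directions and taking the best 1D step found along any of them."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)  # no bias term, for simplicity

    def primal(w):
        # 0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i * <w, x_i>)
        return 0.5 * w @ w + C * np.maximum(0.0, 1.0 - y * (X @ w)).sum()

    for _ in range(n_iters):
        best_obj, best_w = primal(w), w
        # Several random unit directions, in the spirit of random projections.
        U = rng.standard_normal((n_dirs, d))
        U /= np.linalg.norm(U, axis=1, keepdims=True)
        for u in U:
            # Geometric grid over step sizes; the paper's exponential
            # search plays this role with actual guarantees.
            for step in np.geomspace(1e-3, 10.0, 12):
                for cand in (w + step * u, w - step * u):
                    obj = primal(cand)
                    if obj < best_obj:
                        best_obj, best_w = obj, cand
        if best_w is w:  # no direction improved the objective; stop
            break
        w = best_w
    return w

# Toy usage on separable 2D data with labels in {-1, +1}.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(+2.0, 1.0, (100, 2)),
               rng.normal(-2.0, 1.0, (100, 2))])
y = np.concatenate([np.ones(100), -np.ones(100)])
w = svm_1d_projections(X, y)
print("train accuracy:", (np.sign(X @ w) == y).mean())
```

Each outer iteration updates w along a single search direction, which gives the sketch its coordinate-descent flavor; the directions are simply random rather than axis-aligned.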

Cited By

  • Increasing Trust in (Big) Data Analytics. Advanced Information Systems Engineering Workshops, pages 70-84, 2018. DOI: 10.1007/978-3-319-92898-2_6
  • Scalable density-based clustering with quality guarantees using random projections. Data Mining and Knowledge Discovery, 31(4):972-1005, 2017. DOI: 10.1007/s10618-017-0498-x

Published In

CIKM '14: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management
November 2014
2152 pages
ISBN:9781450325981
DOI:10.1145/2661829

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. classification
  2. coordinate descent
  3. data mining
  4. random projections
  5. support vector machines (SVM)

Qualifiers

  • Research-article

Conference

CIKM '14

Acceptance Rates

CIKM '14 Paper Acceptance Rate: 175 of 838 submissions, 21%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
