DOI: 10.1145/2661829.2661994

Solving Linear SVMs with Multiple 1D Projections

Published: 03 November 2014

Abstract

We present a new methodology for solving linear Support Vector Machines (SVMs) that capitalizes on multiple 1D projections. We show that the approach approximates the optimal solution with high accuracy and comes with analytical guarantees. Our solution draws on techniques from random projections, exponential search, and coordinate descent. In our experimental evaluation, we compare our approach with the popular LIBLINEAR SVM library and demonstrate a significant speedup on various benchmarks. At the same time, the new methodology achieves a comparable or better approximation of the optimal solution and exhibits smooth convergence. Our results are accompanied by bounds on time complexity and accuracy.
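The abstract describes the method only at a high level. As a rough illustration of the general idea, here is a minimal, hypothetical Python sketch of an SVM solver driven by multiple 1D projections: draw random unit directions, line-search the regularized hinge-loss primal along each, and keep the best step. Everything here (the function name, the parameters, and the geometric step grid standing in for the paper's exponential search) is an assumption for illustration, not the authors' algorithm.

```python
import numpy as np

def svm_1d_projections(X, y, C=1.0, n_dirs=20, n_iters=30, seed=0):
    """Hypothetical sketch: minimize the L2-regularized hinge-loss
    primal by repeatedly projecting the iterate onto random 1D
    directions and taking the best 1D step found along any of them."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)  # no bias term, for simplicity

    def primal(w):
        # 0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i * <w, x_i>)
        return 0.5 * w @ w + C * np.maximum(0.0, 1.0 - y * (X @ w)).sum()

    for _ in range(n_iters):
        best_obj, best_w = primal(w), w
        # Several random unit directions, in the spirit of random projections.
        U = rng.standard_normal((n_dirs, d))
        U /= np.linalg.norm(U, axis=1, keepdims=True)
        for u in U:
            # Geometric grid over step sizes; the paper's exponential
            # search plays this role with actual guarantees.
            for step in np.geomspace(1e-3, 10.0, 12):
                for cand in (w + step * u, w - step * u):
                    obj = primal(cand)
                    if obj < best_obj:
                        best_obj, best_w = obj, cand
        if best_w is w:  # no direction improved the objective; stop
            break
        w = best_w
    return w

# Toy usage on separable 2D data with labels in {-1, +1}.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(+2.0, 1.0, (100, 2)),
               rng.normal(-2.0, 1.0, (100, 2))])
y = np.concatenate([np.ones(100), -np.ones(100)])
w = svm_1d_projections(X, y)
print("train accuracy:", (np.sign(X @ w) == y).mean())
```

Each outer iteration updates w along a single search direction, which gives the sketch its coordinate-descent flavor; the directions are simply random rather than axis-aligned.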

Cited By

  • Increasing Trust in (Big) Data Analytics. Advanced Information Systems Engineering Workshops, pages 70-84, 2018. DOI: 10.1007/978-3-319-92898-2_6
  • Scalable density-based clustering with quality guarantees using random projections. Data Mining and Knowledge Discovery, 31(4):972-1005, 2017. DOI: 10.1007/s10618-017-0498-x

Published In

CIKM '14: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management
November 2014
2152 pages
ISBN:9781450325981
DOI:10.1145/2661829

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. classification
  2. coordinate descent
  3. data mining
  4. random projections
  5. support vector machines (SVM)

Qualifiers

  • Research-article

Conference

CIKM '14

Acceptance Rates

CIKM '14 Paper Acceptance Rate: 175 of 838 submissions, 21%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
