Abstract
The support vector machine (SVM) handles large datasets poorly because of its low training efficiency. One important line of solutions divides the whole dataset into smaller subsets through data partition and combines the results of the classifiers trained over those subsets. However, traditional data partition approaches struggle to preserve the class boundary of the dataset or to control the size of the subsets, which greatly degrades their performance. To overcome this difficulty, we propose an accelerator for the SVM algorithm based on local geometrical information. In this algorithm, the feature space is divided by linear projection into several regions containing approximately equal numbers of training instances, and each SVM classifier, trained over an extended region, predicts only the unlabeled instances within the corresponding original region. The proposed algorithm not only preserves the decision boundary of the raw data but also saves considerable execution time when implemented in a parallel environment. Furthermore, the number of instances within each region can be effectively controlled, which makes it easier to balance the computational load across processors. Experiments show that the classification performance of the proposed algorithm compares favorably with four state-of-the-art algorithms while requiring the least training time.
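The partition-and-train scheme the abstract describes can be illustrated with a short sketch. This is a minimal, hypothetical rendering of the idea, not the authors' implementation: the random projection direction, the quantile cut points, and the `overlap` padding fraction are illustrative assumptions, and scikit-learn's `SVC` stands in for whatever solver each processor would run. In the paper's setting, the per-region training loop would execute in parallel.

```python
# Minimal sketch (assumptions noted below), NOT the authors' method:
# project the data onto one direction, split at quantiles so regions
# hold roughly equal numbers of instances, train one SVM per slightly
# extended region, and let each SVM predict only the test points that
# fall inside its ORIGINAL region.
import numpy as np
from sklearn.svm import SVC

def partition_train_predict(X_train, y_train, X_test, n_regions=4, overlap=0.1):
    rng = np.random.default_rng(0)
    # Illustrative choice: a random unit vector as the projection direction.
    w = rng.normal(size=X_train.shape[1])
    w /= np.linalg.norm(w)
    p_train, p_test = X_train @ w, X_test @ w
    # Quantile cut points yield regions with approximately equal counts.
    edges = np.quantile(p_train, np.linspace(0.0, 1.0, n_regions + 1))
    y_pred = np.empty(len(X_test), dtype=y_train.dtype)
    for i in range(n_regions):
        lo, hi = edges[i], edges[i + 1]
        pad = overlap * (hi - lo)  # extend the region to keep the local class boundary
        extended = (p_train >= lo - pad) & (p_train <= hi + pad)
        in_region = (p_test >= lo) & (p_test < hi)
        if i == 0:
            in_region |= p_test < lo           # test points below the first cut
        if i == n_regions - 1:
            in_region |= p_test >= hi          # test points above the last cut
        if in_region.any():                    # assumes each extended region holds both classes
            clf = SVC(kernel="rbf").fit(X_train[extended], y_train[extended])
            y_pred[in_region] = clf.predict(X_test[in_region])
    return y_pred
```

Because each classifier only ever sees its extended slice of the data, and kernel SVM training scales superlinearly in the number of instances, keeping the slices equal-sized and small is what yields the speed-up and the balanced per-processor workload described above.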

Acknowledgements
This work was supported by the National Natural Science Foundation of China (No. 61432011, No. U1435212, and No. 61876103), the Project of Key Research and Development Plan of Shanxi Province (201603D111014), and the 1331 Engineering Project of Shanxi Province, China.
Cite this article
Song, Y., Liang, J. & Wang, F. An accelerator for support vector machines based on the local geometrical information and data partition. Int. J. Mach. Learn. & Cyber. 10, 2389–2400 (2019). https://doi.org/10.1007/s13042-018-0877-7