DOI: 10.1145/3494885.3494891
research-article

Exploring Support Vector Machines for Big Data Analyses

Published: 20 December 2021

Abstract

Traditional support vector machines (SVMs) perform well in classification and prediction on small and medium-sized data sets, but they suffer from low training efficiency and low accuracy on large-scale data sets with many samples and high dimensionality. Meanwhile, with the rise of distributed computing platforms suited to big data analysis, such as Spark, more and more researchers at home and abroad are turning to distributed machine learning algorithms. To support research on SVMs for big data analysis, this paper surveys related work and the current state of the field, including an in-depth analysis of the algorithmic principles of SVMs, a systematic investigation of improved SVM methods for big data analysis, and distributed SVMs on the Spark platform. Then, drawing on Spark's parallelization mechanism, it examines some future research directions for SVMs: to improve the accuracy of training results, special matrix computation techniques should be added; and for SVMs on the Spark platform, better optimization methods may be found from the perspectives of dimension and partition.
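As a rough illustration of the partition-based parallelism the abstract refers to, the sketch below trains a linear SVM by subgradient descent on the hinge loss, with each data partition computing a local subgradient that is then summed. It is a minimal, single-process sketch and not the paper's method or Spark's own implementation; the function names, hyperparameters, and toy data are all assumptions made for the example.

```python
# Hypothetical sketch: data-parallel linear SVM training.
# Each partition computes a local hinge-loss subgradient ("map");
# the partial results are summed ("reduce") before one global update.

def hinge_subgradient(partition, w, b):
    """Sum of hinge-loss subgradients over one partition of (x, y) pairs."""
    gw = [0.0] * len(w)
    gb = 0.0
    for x, y in partition:
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        if margin < 1.0:  # this point violates the margin
            for j, xj in enumerate(x):
                gw[j] -= y * xj
            gb -= y
    return gw, gb


def train_partitioned_svm(partitions, dim, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on the L2-regularized hinge loss (assumed settings)."""
    n = sum(len(p) for p in partitions)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        # "map": partitions are independent, so this step could run in parallel
        local = [hinge_subgradient(p, w, b) for p in partitions]
        # "reduce": combine the per-partition results
        gw = [sum(g[0][j] for g in local) for j in range(dim)]
        gb = sum(g[1] for g in local)
        w = [wj - lr * (lam * wj + gwj / n) for wj, gwj in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict(w, b, x):
    """Return the predicted label, +1 or -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data (assumed): positives and their mirror images.
positives = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.5, 1.5), (1.5, 2.5)]
data = [(x, 1) for x in positives] + [((-x[0], -x[1]), -1) for x in positives]

# Split the samples into three partitions, as a cluster would.
partitions = [data[i::3] for i in range(3)]
w, b = train_partitioned_svm(partitions, dim=2)
```

Here the split is over samples (the "partition" perspective). Splitting over features (the "dimension" perspective) would instead have each worker hold a slice of `w` and contribute partial dot products.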

Cited By

  • (2023) Simulation of Hierarchical Parallel Computing Model for Fluid Machinery Based on Support Vector Machines. 2023 IEEE International Conference on Electrical, Automation and Computer Engineering (ICEACE), 868-873. DOI: 10.1109/ICEACE60673.2023.10442024. Online publication date: 29-Dec-2023
  • (2022) Defect Pattern Analysis, Yield Learning Modeling, and Yield Prediction. Production Planning and Control in Semiconductor Manufacturing, 63-76. DOI: 10.1007/978-3-031-14065-5_4. Online publication date: 20-Sep-2022

Published In

CSSE '21: Proceedings of the 4th International Conference on Computer Science and Software Engineering
October 2021
366 pages
ISBN: 9781450390675
DOI: 10.1145/3494885
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Distribution
  2. Machine Learning
  3. SVM
  4. Spark

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CSSE 2021

Acceptance Rates

Overall Acceptance Rate 33 of 74 submissions, 45%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 26
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 16 Feb 2025
