Large-scale machine learning with synchronous parallel adaptive stochastic variance reduction gradient descent for high-dimensional blindness detection on Spark

The Journal of Supercomputing

Abstract

The parallelization of optimization algorithms is of paramount importance in large-scale machine learning. In this paper, we implement Adaptive learning rate Stochastic Gradient Descent (A-SGD) in a synchronized and parallelized manner, and we incorporate a Variance Reduction (VR) strategy to accelerate convergence. Our approach addresses the complexity of high-dimensional datasets, particularly in the context of Logistic Regression (LR) and Support Vector Machines (SVM). We first use the Histogram of Oriented Gradients (HOG) to extract high-dimensional sparse features from a Blindness Detection dataset, then employ LR and SVM as classifiers, and finally apply Synchronous A-SGD (SA-SGD) and Synchronous Adaptive Stochastic Variance Reduced Gradient (SA-SVRG) to train these classifiers. Our experimental results show that SA-SGD and SA-SVRG perform markedly better when executed on a cluster than on a single node.
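The authors' algorithms (SA-SGD and SA-SVRG) are not reproduced on this page, but the core idea can be illustrated with a minimal, self-contained sketch. The code below is an assumption-based illustration rather than the authors' implementation: it simulates Spark workers as in-process data partitions, uses an L2-regularized logistic regression loss, and combines an SVRG-style variance-reduced gradient with an AdaGrad-style adaptive step. The function names (sa_svrg, logistic_grad), hyperparameters, and synthetic data are all hypothetical.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_grad(w, X, y, lam):
    # Gradient of the L2-regularized logistic loss over the samples (X, y).
    p = sigmoid(X @ w)
    return X.T @ (p - y) / len(y) + lam * w

def sa_svrg(X, y, num_workers=4, outer_epochs=10, inner_steps=50,
            batch_size=32, eta=0.5, lam=1e-4, eps=1e-8, seed=0):
    # Synchronous parallel adaptive SVRG sketch (single process).
    # Each "worker" owns a data partition; in every inner step each worker
    # computes a variance-reduced mini-batch gradient against a shared snapshot,
    # and the driver averages them synchronously and takes an AdaGrad-style step.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    partitions = np.array_split(rng.permutation(n), num_workers)
    w = np.zeros(d)
    g_accum = np.full(d, eps)                 # accumulated squared gradients (AdaGrad)

    for _ in range(outer_epochs):
        w_snap = w.copy()
        full_grad = logistic_grad(w_snap, X, y, lam)      # full gradient at the snapshot

        for _ in range(inner_steps):
            worker_grads = []
            for idx in partitions:            # on Spark: a map over RDD partitions
                batch = rng.choice(idx, size=min(batch_size, len(idx)), replace=False)
                g_cur = logistic_grad(w, X[batch], y[batch], lam)
                g_snap = logistic_grad(w_snap, X[batch], y[batch], lam)
                worker_grads.append(g_cur - g_snap + full_grad)   # SVRG correction

            g = np.mean(worker_grads, axis=0)             # synchronous aggregation (reduce)
            g_accum += g * g
            w -= eta / np.sqrt(g_accum) * g               # adaptive learning-rate step
    return w

# Tiny synthetic usage example (hypothetical data, not the APTOS images).
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 50))
y = (X @ rng.normal(size=50) > 0).astype(float)
w_hat = sa_svrg(X, y)
print("training accuracy:", np.mean((sigmoid(X @ w_hat) > 0.5) == y))

In an actual Spark deployment, the inner worker loop would be expressed as a map over RDD partitions followed by a synchronous reduce on the driver, which is what makes the scheme synchronous parallel.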



Data availability

The APTOS 2019 Blindness Detection dataset used in the experiment is sourced from the official website of Kaggle: https://www.kaggle.com/c/aptos2019-blindness-detection.

Code availability

Enquiries about code availability should be directed to the authors.


Funding

This work was financially supported by a project of Ningxia Higher Education Institutions (NYG2024093) and by the Innovation Project for Postgraduate Students of North Minzu University (YCX24094).

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Chuandong Qin and Yiqing Zhang. The first draft of the manuscript was written by Yiqing Zhang, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript, agreed with its content, and gave explicit consent to submit. Chuandong Qin made substantial contributions to the conception and design of the work and approved the version to be published. Yiqing Zhang performed the acquisition, analysis, and interpretation of data, created the new software used in the work, and drafted and critically revised the work for important intellectual content.

Corresponding author

Correspondence to Yiqing Zhang.

Ethics declarations

Conflict of interest

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled “Large-scale Machine Learning with Synchronous Parallel Adaptive Stochastic Variance Reduction Gradient Descent for High-dimensional Blindness Detection on Spark”.

Ethical approval

This is an observational study. No ethical approval is required.

Consent to participate

Informed consent was obtained from all individual participants included in the study.

Consent to publish

Additional informed consent was obtained from all individual participants for whom identifying information is included in this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Qin, C., Zhang, Y. & Cao, Y. Large-scale machine learning with synchronous parallel adaptive stochastic variance reduction gradient descent for high-dimensional blindness detection on Spark. J Supercomput 81, 590 (2025). https://doi.org/10.1007/s11227-025-07046-8

