Abstract
The parallelization of optimization algorithms is of paramount importance in large-scale machine learning. In this paper, we implement Adaptive learning rate Stochastic Gradient Descent (A-SGD) in a synchronous, parallel manner and incorporate a Variance Reduction (VR) strategy to accelerate convergence. Our approach addresses the complexity associated with high-dimensional datasets, particularly in the context of Logistic Regression (LR) and Support Vector Machine (SVM) classification. First, we use the Histogram of Oriented Gradients (HOG) to extract high-dimensional sparse features from a dataset for blindness detection. We then employ LR and SVM as our classifiers and apply Synchronous A-SGD (SA-SGD) and Synchronous Adaptive Stochastic Variance Reduction Gradient (SA-SVRG) to solve the resulting optimization problems. Our experimental results indicate that SA-SGD and SA-SVRG perform notably better when executed on a cluster than on a single node.
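To illustrate the kind of update underlying this approach, the following is a minimal single-node sketch that combines an SVRG-style variance-reduced gradient with an AdaGrad-like coordinate-wise adaptive learning rate for L2-regularized logistic regression. It is an assumed illustration of the general technique, not the authors' SA-SVRG implementation; in the synchronous parallel setting described above, the full-gradient pass and inner-loop updates would instead be distributed across Spark partitions and aggregated at each epoch. All function and parameter names (logistic_grad, adaptive_svrg, eta, lam) are hypothetical.

```python
# Minimal sketch: SVRG-style variance reduction with an AdaGrad-like adaptive step,
# applied to L2-regularized logistic regression on a single node (illustrative only).
import numpy as np

def logistic_grad(w, X, y, lam=1e-4):
    # Gradient of mean log(1 + exp(-y * Xw)) + (lam/2)||w||^2, with labels y in {-1, +1}.
    z = np.clip(y * (X @ w), -500, 500)  # clip to avoid overflow in exp
    return -(X.T @ (y / (1.0 + np.exp(z)))) / len(y) + lam * w

def adaptive_svrg(X, y, epochs=10, inner=None, eta=0.5, eps=1e-8, lam=1e-4):
    n, d = X.shape
    inner = inner or n
    w = np.zeros(d)
    acc = np.zeros(d)  # accumulated squared gradients for the AdaGrad-like step size
    for _ in range(epochs):
        w_snap = w.copy()
        mu = logistic_grad(w_snap, X, y, lam)  # full gradient at the snapshot
        for _ in range(inner):
            i = np.random.randint(n)
            xi, yi = X[i:i + 1], y[i:i + 1]
            # Variance-reduced stochastic gradient.
            g = logistic_grad(w, xi, yi, lam) - logistic_grad(w_snap, xi, yi, lam) + mu
            acc += g * g
            w -= eta / np.sqrt(acc + eps) * g  # coordinate-wise adaptive step
    return w

# Toy usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = np.sign(X @ rng.normal(size=20) + 0.1 * rng.normal(size=500))
w = adaptive_svrg(X, y)
print("training accuracy:", np.mean(np.sign(X @ w) == y))
```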

















Data availability
The APTOS 2019 Blindness Detection dataset used in the experiment is sourced from the official website of Kaggle: https://www.kaggle.com/c/aptos2019-blindness-detection.
Code availability
Enquiries about code availability should be directed to the authors.
Funding
This work received financial support from the project of Ningxia Higher Education Institutions (NYG2024093) and the Innovation Project for Postgraduate Students of North Minzu University (YCX24094).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Chuandong Qin and Yiqing Zhang. The first draft of the manuscript was written by Yiqing Zhang, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript, agreed with its content, and gave explicit consent to submit. Chuandong Qin: made substantial contributions to the conception and design of the work and approved the version to be published. Yiqing Zhang: performed the acquisition, analysis, and interpretation of data, created the new software used in the work, and drafted the work and revised it critically for important intellectual content.
Ethics declarations
Conflict of interest
We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled “Large-scale Machine Learning with Synchronous Parallel Adaptive Stochastic Variance Reduction Gradient Descent for High-dimensional Blindness Detection on Spark”.
Ethical approval
This is an observational study. No ethical approval is required.
Consent to participate
Informed consent was obtained from all individual participants included in the study.
Consent to publish
Additional informed consent was obtained from all individual participants for whom identifying information is included in this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Qin, C., Zhang, Y. & Cao, Y. Large-scale machine learning with synchronous parallel adaptive stochastic variance reduction gradient descent for high-dimensional blindness detection on spark. J Supercomput 81, 590 (2025). https://doi.org/10.1007/s11227-025-07046-8