Abstract
This work aims to improve the efficiency of medical data processing and to establish a sound medical data management system. To apply distributed parallel classification algorithms to intelligent hospital guidance, a Parallel Random Forest (PRF) classification algorithm is proposed based on the Apache Spark cloud computing platform. To address the loss of sparse clusters in data sets with variable density distributions, an Adaptive Domain Density Peak Clustering (ADDPC) method is proposed. In addition, a Bilayer Parallel Training-Convolutional Neural Network (BPT-CNN) model based on distributed computing is proposed to detect and classify colon cancer nuclei more accurately through a large-scale parallel deep learning (DL) algorithm. The performance of the proposed models is then evaluated through case analysis. The results show that the PRF algorithm, running on the distributed cloud computing platform, can independently design data-parallel tasks, thereby optimizing data communication cost and efficiency. The ADDPC algorithm adaptively measures domain density and merges sparse clusters to prevent data loss and fragmentation. The BPT-CNN model improves the performance of the algorithm and balances the workload of each task in the algorithm. These results offer a significant reference for solving problems in medical data processing.
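For readers unfamiliar with density peak clustering, the following is a minimal sketch of the baseline technique that ADDPC builds on, not the adaptive method proposed in the article. It assumes a Gaussian kernel for the local density and a fixed cutoff distance; the names `density_peaks`, `pick_centers`, `assign_labels`, and `d_c`, as well as the toy points, are hypothetical and chosen only for illustration.

```python
import math

def density_peaks(points, d_c):
    """Baseline density peak quantities (assumed Gaussian kernel, fixed cutoff d_c)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho_i: local density, a kernel-weighted count of the other points
    rho = [sum(math.exp(-(dist[i][j] / d_c) ** 2) for j in range(n) if j != i)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # delta_i: distance to the nearest denser point; the densest point
        # has no denser neighbour, so it takes its largest distance instead
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

def pick_centers(rho, delta, k):
    """Indices of the k points with the largest rho * delta score."""
    order = sorted(range(len(rho)), key=lambda i: rho[i] * delta[i], reverse=True)
    return sorted(order[:k])

def assign_labels(points, rho, centers):
    """Each remaining point inherits the label of its nearest denser point."""
    n = len(points)
    labels = [-1] * n
    for label, c in enumerate(centers):
        labels[c] = label
    # Visit points from densest to sparsest so the denser neighbour is labelled first
    for i in sorted(range(n), key=lambda i: rho[i], reverse=True):
        if labels[i] != -1:
            continue
        higher = [j for j in range(n) if rho[j] > rho[i]]
        nearest = min(higher, key=lambda j: math.dist(points[i], points[j]))
        labels[i] = labels[nearest]
    return labels

# Toy data: a dense group near the origin and a sparser group near (5, 5)
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (5.0, 5.0), (5.2, 5.0), (5.0, 5.2), (5.1, 5.1)]
rho, delta = density_peaks(points, d_c=0.2)
centers = pick_centers(rho, delta, k=2)
labels = assign_labels(points, rho, centers)
print(centers, labels)
```

With a single global `d_c`, the sparser group receives much lower density values than the dense one. This is the kind of situation in which sparse clusters can be lost or fragmented on variable-density data, and which an adaptive density measure is intended to address.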
Cite this article
Liu, J., Liang, X., Ruan, W. et al. High-performance medical data processing technology based on distributed parallel machine learning algorithm. J Supercomput 78, 5933–5956 (2022). https://doi.org/10.1007/s11227-021-04060-4
DOI: https://doi.org/10.1007/s11227-021-04060-4