High-performance medical data processing technology based on distributed parallel machine learning algorithm

The Journal of Supercomputing

Abstract

This work aims to improve the efficiency of medical data processing and to establish a sound medical data management system. To apply distributed parallel classification algorithms to intelligent hospital guidance, a Parallel Random Forest (PRF) classification algorithm is proposed on the Apache Spark cloud computing platform. To address the loss of sparse clusters in data sets with variable density distributions, an Adaptive Domain Density Peak Clustering (ADDPC) method is proposed. In addition, a Bilayer Parallel Training-Convolutional Neural Network (BPT-CNN) model based on distributed computing is proposed to detect and classify colon cancer nuclei more accurately using a large-scale parallel deep learning (DL) algorithm. The performance of the proposed model is then evaluated through case analysis. The results show that the PRF algorithm, running on a distributed cloud computing platform, can independently design data-parallel tasks, thereby reducing data communication cost and improving efficiency. The ADDPC algorithm adaptively measures domain density and merges sparse clusters to prevent data loss and fragmentation. The BPT-CNN model improves the performance of the algorithm and balances the workload across its tasks. These results provide a valuable reference for solving problems in medical data processing.
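
For readers unfamiliar with density peak clustering, the following minimal sketch shows the two per-point quantities that the standard (non-adaptive) method is built on: a local density and the distance to the nearest point of higher density. It is not the ADDPC algorithm described above, whose adaptive domain density and cluster-merging steps are not specified in this abstract. The function name `density_peak_scores`, the fixed `cutoff` radius, and the sample points are assumptions made for illustration only.

```python
import numpy as np


def density_peak_scores(points, cutoff):
    """Return (rho, delta) as used in standard density peak clustering.

    rho[i]   : number of other points closer than `cutoff` to point i.
    delta[i] : distance from point i to the nearest point with higher rho;
               for points of maximal density, the largest distance to any point.
    NOTE: hypothetical helper for illustration, not the ADDPC method.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distance matrix via broadcasting.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Local density with a hard cutoff; subtract 1 to exclude the point itself.
    rho = (dist < cutoff).sum(axis=1) - 1

    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        if higher.any():
            delta[i] = dist[i, higher].min()
        else:
            delta[i] = dist[i].max()
    return rho, delta


# Illustrative data: a dense group of four points and a sparse group of two.
sample = [[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [5, 5], [5.1, 5]]
rho, delta = density_peak_scores(sample, cutoff=0.5)
print(rho, delta)
```

With a single global `cutoff`, the two-point group receives a much lower density than the four-point group, which hints at why sparse clusters can be overlooked when densities vary across a data set.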

Author information

Corresponding author

Correspondence to Ji Liu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, J., Liang, X., Ruan, W. et al. High-performance medical data processing technology based on distributed parallel machine learning algorithm. J Supercomput 78, 5933–5956 (2022). https://doi.org/10.1007/s11227-021-04060-4

  • DOI: https://doi.org/10.1007/s11227-021-04060-4
