Skip to main content
Log in

A novel parallel Markov clustering method in biological interaction network analysis under multi-GPU computing environment

  • Published:
The Journal of Supercomputing Aims and scope Submit manuscript

Abstract

The various clustering methods are widely applied in analyzing biological interaction networks, such as the protein–protein interaction and the genetic interaction networks. With the rapid growth of these biological datasets in scale, much longer runtime is required to make cluster analyses on them. In this paper, we propose a novel parallel Markov clustering (MCL) method based on the ELLPACK-R sparse matrix format that can run on multiple graphic processing units (GPUs) equipped standalone computers. The method is implemented using the Compute Unified Device Architecture (CUDA) programming framework, and fine-grained warp-level optimization is introduced for improving the performance. The BioGRID, a large-scale and freely accessible database of protein and genetic interactions, is adopted as the dataset in the experiment. The method has been assessed on a desktop computer equipped with two NVIDIA GTX 1070 GPUs. The results show that the proposed multi-GPU method can conduct MCL clustering on the full-size BioGRID database with about 6.5 min, that is much faster than the CPU serial MCL implementation which needs almost an hour and a half execution time.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7

Similar content being viewed by others

References

  1. Van Dongen SM (2000) Graph clustering by flow simulation. University of Utrecht, Utrecht

    Google Scholar 

  2. Brohee S, Van Helden J (2006) Evaluation of clustering algorithms for protein–protein interaction networks. BMC Bioinform 7(1):488

    Article  Google Scholar 

  3. Sharan R, Ulitsky I, Shamir R (2007) Network-based prediction of protein function. Mol Syst Biol 3(1):88

    Article  Google Scholar 

  4. Vlasblom J, Wodak SJ (2009) Markov clustering versus affinity propagation for the partitioning of protein interaction graphs. BMC Bioinform 10(1):99

    Article  Google Scholar 

  5. Keckler SW, Dally WJ, Khailany B et al (2011) GPUs and the future of parallel computing. IEEE Micro 31(5):7–17

    Article  Google Scholar 

  6. NVIDIA (2019) CUDA C programming guide v10.1.243. https://docs.nvidia.com/cuda/cuda-c-programming-guide/

  7. AMD (2019) OpenCL programming guide. http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf

  8. Zhou W, Cai Z, Lian B et al (2017) Protein database search of hybrid alignment algorithm based on GPU parallel acceleration. J Supercomput 73(10):4517–4534

    Article  Google Scholar 

  9. Zhou W, Cai Z, Lian B et al (2018) A multi-GPU protein database search model with hybrid alignment manner on distributed GPU clusters. Concurr Comput Pract Exp 30(18):e4522

    Article  Google Scholar 

  10. Rani S, Gupta OP (2017) CLUS\_GPU-BLASTP: accelerated protein sequence alignment using GPU-enabled cluster. J Supercomput 73(10):4580–4595

    Article  Google Scholar 

  11. Anderson JA, Lorenz CD, Travesset A (2008) General purpose molecular dynamics simulations fully implemented on graphics processing units. J Comput Phys 227(10):5342–5359

    Article  Google Scholar 

  12. Bustamam A, Burrage K, Hamilton NA (2012) Fast parallel Markov clustering in bioinformatics using massively parallel computing on GPU with CUDA and ELLPACK-R sparse format. IEEE/ACM Trans Comput Biol Bioinform (TCBB) 9(3):679–692

    Article  Google Scholar 

  13. Saadi H, Taboudjemat NN, Rahmoun A et al (2019) Efficient GPU-based parallelization of solvation calculation for the blind docking problem. J Supercomput. https://doi.org/10.1007/s11227-019-02834-5

    Article  Google Scholar 

  14. Escobar JJ, Ortega J, Díaz AF et al (2019) Time-energy analysis of multilevel parallelism in heterogeneous clusters: the case of EEG classification in BCI tasks. J Supercomput 75(7):3397–3425

    Article  Google Scholar 

  15. Cheng J, Grossman M, McKercher T (2014) Professional Cuda C programming. Wiley, Indianapolis

    Google Scholar 

  16. Saad Y (2003) Iterative methods for sparse linear systems. SIAM, Philadelphia

    Book  Google Scholar 

  17. Vazquez F, Ortega G, Fernández JJ et al (2010) Improving the performance of the sparse matrix vector product with GPUs. In: 2010 10th IEEE International Conference on Computer and Information Technology. IEEE, pp 1146–1151

  18. Wikipedia (2019) Markov chain. https://en.wikipedia.org/wiki/Markov_chain

  19. Stark C, Breitkreutz BJ, Reguly T et al (2006) BioGRID: a general repository for interaction datasets. Nucleic Acids Res 34(suppl-1):D535–D539

    Article  Google Scholar 

  20. He L, Lu L, Wang Q (2017) An optimal parallel implementation of Markov Clustering based on the coordination of CPU and GPU. J Intell Fuzzy Syst 32(5):3609–3617

    Article  Google Scholar 

  21. Azad A, Pavlopoulos GA, Ouzounis CA et al (2018) HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks. Nucleic Acids Res 46(6):e33–e33

    Article  Google Scholar 

  22. Liu Y, Schmidt B (2018) Lightspmv: faster cuda-compatible sparse matrix-vector multiplication using compressed sparse rows. J Signal Process Syst 90(1):69–86

    Article  Google Scholar 

Download references

Acknowledgements

The authors would like to thank all the reviewers for their precious comments.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wei Zhou.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Fu, Y., Zhou, W. A novel parallel Markov clustering method in biological interaction network analysis under multi-GPU computing environment. J Supercomput 76, 7689–7706 (2020). https://doi.org/10.1007/s11227-020-03193-2

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11227-020-03193-2

Keywords

Navigation