Abstract
Identifying and analyzing functional modules in protein–protein interaction (PPI) networks provides insight into the organization and function of biological systems. Functional modules in PPI networks share many overlapping structures, which indicates that some proteins play indispensable roles in several different biological processes. Markov clustering (MCL) is a popular algorithm for clustering networks in bioinformatics. In this paper, to identify the overlapping structures among functional modules and to find more biologically significant modules in PPI networks, we propose a Markov clustering algorithm based on link similarity (MLS). First, the weighted link similarity is calculated to obtain the link similarity matrix, which measures the association strength between protein interactions. Then, the link similarity matrix is partitioned by applying Markov clustering, and the clustering results are mapped back to the original network to analyze the protein modules. The method was evaluated on three datasets: DIP, Gavin and Krogan. Our results show that MLS not only identifies functional modules accurately but also outperforms the original MCL algorithm, improving the F-measure by 5–10%.
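The pipeline summarized in the abstract (link similarity matrix, Markov clustering of links, mapping link clusters back to proteins) can be illustrated with a minimal sketch. The abstract does not give the weighted link similarity formula, so the sketch below substitutes an unweighted Jaccard index over inclusive neighbourhoods as a stand-in; the function names, the inflation value of 2.0, and the toy network are all assumptions made for illustration and are not the authors' implementation.

```python
from itertools import combinations


def link_similarity_matrix(edges):
    """Similarity between links that share exactly one node.

    Assumption: unweighted Jaccard index of the inclusive neighbourhoods
    of the two non-shared endpoints, used here in place of the paper's
    weighted link similarity.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    n = len(edges)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sim[i][i] = 1.0  # self-loops, as is customary before running MCL
    for i, j in combinations(range(n), 2):
        shared = set(edges[i]) & set(edges[j])
        if len(shared) != 1:
            continue  # links without a common node get similarity 0
        (k,) = shared
        a = next(x for x in edges[i] if x != k)
        b = next(x for x in edges[j] if x != k)
        na, nb = adj[a] | {a}, adj[b] | {b}
        sim[i][j] = sim[j][i] = len(na & nb) / len(na | nb)
    return sim


def normalize_columns(m):
    """Scale every column of a square matrix so that it sums to 1 (in place)."""
    n = len(m)
    for c in range(n):
        total = sum(m[r][c] for r in range(n))
        if total > 0:
            for r in range(n):
                m[r][c] /= total
    return m


def markov_cluster(sim, inflation=2.0, iterations=50):
    """Plain MCL: alternate expansion (matrix squaring) and inflation."""
    n = len(sim)
    m = normalize_columns([row[:] for row in sim])
    for _ in range(iterations):
        # Expansion: square the column-stochastic matrix.
        m = [[sum(m[r][k] * m[k][c] for k in range(n)) for c in range(n)]
             for r in range(n)]
        # Inflation: raise entries to a power, then renormalize columns.
        m = normalize_columns([[x ** inflation for x in row] for row in m])
    clusters = []
    for r in range(n):
        # Each row that keeps non-negligible mass defines one cluster.
        members = frozenset(c for c in range(n) if m[r][c] > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return clusters


def link_clusters_to_modules(edges, clusters):
    """Map clusters of link indices back to sets of proteins."""
    return [frozenset(node for i in cluster for node in edges[i])
            for cluster in clusters]


# Toy network (hypothetical): two triangles that share protein "C".
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("C", "D"), ("D", "E"), ("C", "E")]
modules = link_clusters_to_modules(
    edges, markov_cluster(link_similarity_matrix(edges)))
print([sorted(m) for m in modules])
```

Because clustering operates on links rather than proteins, a protein whose links fall into different link clusters appears in more than one module, which is how overlap arises without changing MCL itself.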
Acknowledgements
We thank the reviewers for their thoughtful comments and suggestions. This work was supported by the National Natural Science Foundation of China (Grant Nos. 31771679, 31371533), the Special Fund for Key Program of Science and Technology of Anhui Province of China (Grant Nos. 15czz03131, 16030701092), and the Natural Science Foundation of the Anhui Higher Education Institutions of China (Grant No. kJ2016A836).
Cite this article
Gu, L., Han, Y., Wang, C. et al. Module overlapping structure detection in PPI using an improved link similarity-based Markov clustering algorithm. Neural Comput & Applic 31, 1481–1490 (2019). https://doi.org/10.1007/s00521-018-3508-z
DOI: https://doi.org/10.1007/s00521-018-3508-z