
Mining representative maximal dense cohesive subnetworks

  • Original Article
  • Network Modeling Analysis in Health Informatics and Bioinformatics

Abstract

Massive amounts of graph data have been generated in many areas, including computational biology and social networks. Often these graphs have attributes associated with their nodes. One of the most intriguing problems in graphs representing complex data is finding communities or clusters. The use of attribute data in finding clusters has been shown to be effective in many application areas, e.g., finding subnetwork biomarkers for cancer prediction and targeting advertising at a group of friends in a social network. In this paper, we propose an algorithm for mining maximal dense cohesive clusters from node-attributed graphs. The number of reported maximal dense cohesive clusters can be very large when the constraints are relaxed; therefore, we also propose a post-processing algorithm for extracting a representative subset of these clusters. Experiments on real-world datasets show that the proposed approach is effective in mining meaningful biological clusters from a protein–protein interaction network with attributes extracted from gene expression datasets. Furthermore, the proposed approach outperforms competing algorithms in terms of running time.
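As an informal illustration of the two ideas named in the abstract, the sketch below checks whether a node set is "dense" and "cohesive" and then picks a small representative subset of clusters. The abstract does not give the paper's definitions, so everything here is an assumption: density is taken to be the fraction of node pairs that are connected, cohesion is taken to mean that all nodes share at least a minimum number of attributes, and the representative subset is chosen by a simple greedy coverage heuristic. All function names, thresholds, and the toy data are hypothetical and are not the authors' algorithm.

```python
from itertools import combinations


def edge_density(nodes, edges):
    """Fraction of node pairs in `nodes` that are joined by an edge (assumed definition)."""
    nodes = set(nodes)
    if len(nodes) < 2:
        return 0.0
    present = sum(1 for u, v in combinations(sorted(nodes), 2)
                  if frozenset((u, v)) in edges)
    possible = len(nodes) * (len(nodes) - 1) // 2
    return present / possible


def shared_attributes(nodes, attrs):
    """Attributes carried by every node in `nodes` (assumed notion of cohesion)."""
    sets = [attrs[n] for n in nodes]
    return set.intersection(*sets) if sets else set()


def is_dense_cohesive(nodes, edges, attrs, min_density, min_attrs):
    """True if the node set meets both hypothetical thresholds."""
    return (edge_density(nodes, edges) >= min_density
            and len(shared_attributes(nodes, attrs)) >= min_attrs)


def greedy_representatives(clusters, k):
    """Pick up to k clusters, each time taking the one that covers the most uncovered nodes."""
    covered, chosen = set(), []
    remaining = [frozenset(c) for c in clusters]
    for _ in range(k):
        best = max(remaining, key=lambda c: len(c - covered), default=None)
        if best is None or not (best - covered):
            break  # nothing left that adds new nodes
        chosen.append(best)
        covered |= best
        remaining.remove(best)
    return chosen


# Toy node-attributed graph (made-up data, for illustration only).
edges = {frozenset(p) for p in [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]}
attrs = {"a": {"x", "y"}, "b": {"x", "y"}, "c": {"x"}, "d": {"z"}}

triangle_ok = is_dense_cohesive({"a", "b", "c"}, edges, attrs, 0.8, 1)
whole_ok = is_dense_cohesive({"a", "b", "c", "d"}, edges, attrs, 0.8, 1)
reps = greedy_representatives([{"a", "b", "c"}, {"a", "b"}, {"c", "d"}], k=2)
```

In this toy graph the triangle {a, b, c} is fully connected and shares attribute x, so it passes both checks, while adding d lowers the density to 4/6 and removes the shared attribute. The greedy step then prefers clusters that add nodes not yet covered, which is one plausible way to shrink a large result set.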



Acknowledgment

This study was supported in part by National Science Foundation (NSF) award IIS-1423321.

Author information

Corresponding author

Correspondence to Saeed Salem.

Additional information

A. Goparaju and T. Brazier contributed equally to this research.

About this article

Cite this article

Goparaju, A., Brazier, T. & Salem, S. Mining representative maximal dense cohesive subnetworks. Netw Model Anal Health Inform Bioinforma 4, 29 (2015). https://doi.org/10.1007/s13721-015-0101-6

  • DOI: https://doi.org/10.1007/s13721-015-0101-6
