Abstract
Disease comorbidity is an important aspect of phenotype associations and reflects overlapping pathogenesis between diseases. Existing comorbidity studies have usually focused on specific diseases and patient populations. In this study, we systematically mined and analyzed disease comorbidity patterns without restricting disease types or patient populations. We present a data mining approach that extracts comorbidity patterns from a patient-disease database in the drug adverse event reporting system. The database contains records of 3,354,043 patients. We first demonstrated that the data are not severely biased towards specific patient populations and are valuable for comorbidity mining. We then developed an automatic pipeline to process the data and applied an association rule mining algorithm to mine comorbidity relationships among multiple diseases. Our approach extracted 8,576 comorbidity patterns for 613 diseases. We constructed a disease comorbidity network from these patterns and demonstrated that the comorbidity clusters reflect genetic associations between diseases. Unlike previous studies based on relative risk, which tend to identify comorbidities for rare diseases, our approach extracted many patterns for common diseases. We applied the approach to colorectal cancer and found interesting relationships between colorectal cancer and metabolic disorders, which may lead to promising pathogenesis discoveries.
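To make the association rule mining step concrete, the following is a minimal sketch of how support and confidence could be computed for disease pairs from patient-disease records. It is an illustration only, not the authors' pipeline: the function name `mine_pair_rules`, the thresholds, and the toy records are all hypothetical, and the brute-force pair counting shown here stands in for whatever mining algorithm and parameters the paper actually used, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(patients, min_support=0.2, min_confidence=0.6):
    """Return sorted rules (lhs, rhs, support, confidence) over disease pairs.

    support(A, B)  = fraction of patients who have both A and B
    confidence(A -> B) = patients with both A and B / patients with A
    Thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(patients)
    single = Counter()  # number of patients per disease
    pair = Counter()    # number of patients per unordered disease pair
    for diseases in patients:
        unique = sorted(set(diseases))
        single.update(unique)
        pair.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / single[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Toy records (hypothetical, not taken from the paper's database).
toy_patients = [
    {"diabetes", "hypertension"},
    {"diabetes", "hypertension", "obesity"},
    {"diabetes", "obesity"},
    {"hypertension"},
    {"asthma"},
]
rules = mine_pair_rules(toy_patients, min_support=0.4, min_confidence=0.6)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Because the support filter depends on how often a pair occurs in the whole population, rules of this kind favour frequently reported diseases, which is consistent with the abstract's remark that the approach yields many patterns for common diseases, whereas relative-risk measures tend to highlight rare ones.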
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, Y., Xu, R. (2014). Network Analysis of Human Disease Comorbidity Patterns Based on Large-Scale Data Mining. In: Basu, M., Pan, Y., Wang, J. (eds) Bioinformatics Research and Applications. ISBRA 2014. Lecture Notes in Computer Science, vol 8492. Springer, Cham. https://doi.org/10.1007/978-3-319-08171-7_22
DOI: https://doi.org/10.1007/978-3-319-08171-7_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08170-0
Online ISBN: 978-3-319-08171-7
eBook Packages: Computer Science (R0)