Abstract
Cancer research is emerging as a complex orchestration of genomics, data science, and network science. Improving cancer diagnosis and treatment strategies requires integrating data across multiple scales, from molecules such as DNA, RNA, and metabolites up to the population. This involves handling large volumes of highly complex “omics” data, which calls for powerful computational algorithms and mathematical tools. Here we present an integrative analytics approach for cancer genomics that treats multi-scale biological interactions as key considerations in model development. We demonstrate the approach on a publicly available lung cancer dataset collected from 109 individuals in an 18-year clinical study. From these data, we discovered novel disease markers and drug targets, which we validated against peer-reviewed literature. These results demonstrate the power of big data analytics for deriving actionable insight into disease.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Agarwal, M., Adhil, M., Talukder, A.K. (2015). Multi-omics Multi-scale Big Data Analytics for Cancer Genomics. In: Kumar, N., Bhatnagar, V. (eds) Big Data Analytics. BDA 2015. Lecture Notes in Computer Science, vol 9498. Springer, Cham. https://doi.org/10.1007/978-3-319-27057-9_16
DOI: https://doi.org/10.1007/978-3-319-27057-9_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27056-2
Online ISBN: 978-3-319-27057-9
eBook Packages: Computer Science (R0)