DOI: 10.1145/2996890.3007896
short-paper

Clinical and genomics data integration using meta-dimensional approach

Published: 06 December 2016

ABSTRACT

Clinical and genomics datasets contain vast amounts of information, but each is used independently within its own domain to produce new science or to better explain existing approaches. Data rarely flows between the two domains, so the information remains scattered. These disparate datasets need to be integrated so that the scattered pieces of information are consolidated into a unified knowledge base that can support new research challenges. However, there is no platform available that integrates clinical and genomics datasets into a consistent, coherent data source and produces analytics from it. We propose a data integration model capable of integrating clinical and genomics datasets using meta-dimensional approaches and machine learning methods. Bayesian Networks, which are based on the meta-dimensional approach, will be used to design a probabilistic data model, and Neural Networks, which are based on machine learning, will be used for classification and pattern recognition over the integrated data. This integration will help to bring together the genetic background of clinical traits, which will be highly beneficial for deriving new research insights for drug design or precision medicine.


  • Published in

    UCC '16: Proceedings of the 9th International Conference on Utility and Cloud Computing
    December 2016
    549 pages
    ISBN:9781450346160
    DOI:10.1145/2996890

    Copyright © 2016 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Acceptance Rates

    Overall Acceptance Rate: 38 of 125 submissions, 30%
