DOI: 10.1145/3107411.3108225
poster

Integrative Sufficient Dimension Reduction Methods for Multi-Omics Data Analysis

Published: 20 August 2017

Abstract

With the advent of high-throughput genome-wide assays, it has become possible to measure multiple types of genomic data simultaneously. Several projects, such as TCGA, ICGC, and NCI-60, have generated comprehensive, multi-dimensional maps of key genomic changes, such as miRNA, mRNA, and proteomic profiles, from cancer samples [2,4]. These genomic data can be used to classify tumor types [5]. Integrative analysis of data from multiple sources can potentially provide additional biological insights, but methods for such analysis are lacking. One widely used approach to handling high-dimensional data is to remove redundant information from the integrated sample. Because much of the gene expression information overlaps, the data can be projected onto a lower-dimensional space and then used to classify tumor types with little or no loss of information. Sufficient dimension reduction (SDR) [1], a supervised dimension reduction approach, is well suited to this goal. In this paper, we propose a novel integrative SDR method that reduces the dimensions of multiple data types simultaneously while sharing common latent structures to improve prediction and interpretation. In particular, we extend sliced inverse regression (SIR), a major SDR method, to integrate multiple omics data types for simultaneous dimension reduction. SIR is a supervised dimension reduction method that assumes the outcome variable Y depends on the predictor X only through d unknown linear combinations of the predictor [3]. The predictor is replaced by its projection onto a lower-dimensional subspace of the predictor space without loss of information. The aim is to find the central subspace (CS), the intersection of all subspaces δ of the predictor space that satisfy Y ⫫ X | P_δ X, where P_δ denotes projection onto δ. To integrate multiple types of data, we propose and implement a new integrative SDR method that extends SIR [3], called integrative SIR.
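For orientation, conventional (single-data-type) SIR as described in [3] can be sketched in a few lines. This is only an illustrative sketch of the classical method, not the integrative SIR proposed here. The function name `sir_directions` and its parameters are hypothetical, and the sketch assumes the sample covariance of X is invertible (more samples than predictors), which would not hold for raw omics data without further steps.

```python
import numpy as np


def sir_directions(X, y, d, n_slices=None):
    """Conventional SIR sketch: return a p x d basis estimate for the central subspace.

    If n_slices is None, y is treated as categorical (one slice per class);
    otherwise y is treated as continuous and cut into n_slices ordered slices.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape

    # Standardize the predictors: Z = (X - mean) Sigma^{-1/2}.
    # Assumption: the sample covariance is positive definite (n > p).
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T
    Z = (X - mean) @ inv_sqrt

    # Assign each observation to a slice of the outcome.
    if n_slices is None:
        labels = np.unique(y, return_inverse=True)[1]
    else:
        order = np.argsort(y)
        labels = np.empty(n, dtype=int)
        labels[order] = np.arange(n) * n_slices // n

    # Weighted covariance of the slice means of Z.
    M = np.zeros((p, p))
    for h in np.unique(labels):
        in_slice = labels == h
        slice_mean = Z[in_slice].mean(axis=0)
        M += in_slice.mean() * np.outer(slice_mean, slice_mean)

    # Leading d eigenvectors of M, mapped back to the original X scale.
    w, v = np.linalg.eigh(M)
    top = v[:, np.argsort(w)[::-1][:d]]
    return inv_sqrt @ top


# Toy usage: three classes determined by a single linear combination of X.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(500, 5))
index = (X_toy[:, 0] + X_toy[:, 1]) / np.sqrt(2)
y_toy = np.digitize(index, [-0.5, 0.5])
B_toy = sir_directions(X_toy, y_toy, d=1)
X_reduced = X_toy @ B_toy  # 500 x 1 reduced predictor
```

In the toy example the estimated direction should lie close to the true one, proportional to (1, 1, 0, 0, 0), because the class label depends on X only through that combination.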
The main idea is to use the information in all of the multi-omics data simultaneously while finding a basis matrix for each data type, with some latent structure shared across data types. The result is d-dimensional data, where d is much smaller than the original dimension. The reduced dimension d was chosen by cross-validation. To demonstrate the integrated analysis of multi-omics data, we applied and compared conventional SIR and integrative SIR on the mRNA, miRNA, and proteomics expression profiles of a subset of cell lines from the NCI-60 panel. The data are taken from [6]. The outcomes to classify are the CNS, leukemia, and melanoma tumor types. We pre-screened 400 variables from each data type, selecting those with the highest variance. To estimate classification error, we applied random forest classification to the output of each method, using leave-one-out cross-validation. We found that integrative SIR yields lower classification error than conventional SIR. To summarize, we proposed integrative SIR, a new supervised dimension reduction technique for the integrative analysis of multi-omics data. Unlike conventional SDR methods, the new approach reduces the dimensions of multiple omics data types simultaneously while sharing common latent structures across data types, without losing information for prediction. By efficiently capturing the common information, integrative SIR classifies tumor types more accurately than conventional SDR methods in our numerical study.
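The evaluation steps described above (variance pre-screening, then leave-one-out cross-validation of a classifier on the reduced data) might be sketched as follows. All function names here are hypothetical. To keep the sketch NumPy-only, a nearest-centroid classifier stands in for the random forest used in the study, and the sketch refits the reduction step inside each fold, which is a design choice of this example and not something the abstract specifies.

```python
import numpy as np


def top_variance_columns(X, k):
    """Indices (sorted) of the k columns of X with the largest sample variance."""
    variances = np.var(X, axis=0, ddof=1)
    return np.sort(np.argsort(variances)[::-1][:k])


def nearest_centroid_predict(X_train, y_train, x_new):
    """Stand-in classifier (NOT a random forest): predict the closest class centroid."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    distances = np.linalg.norm(centroids - x_new, axis=1)
    return classes[np.argmin(distances)]


def loocv_error(X, y, fit_reducer, predict=nearest_centroid_predict):
    """Leave-one-out misclassification rate.

    fit_reducer(X_train, y_train) must return a function that maps a data
    matrix to its reduced representation; it is refit on every training fold.
    """
    n = len(y)
    n_wrong = 0
    for i in range(n):
        train = np.arange(n) != i
        reduce = fit_reducer(X[train], y[train])
        pred = predict(reduce(X[train]), y[train], reduce(X[i:i + 1])[0])
        n_wrong += int(pred != y[i])
    return n_wrong / n


def screening_reducer(k):
    """Reducer that only keeps the k highest-variance columns of the training fold."""
    def fit(X_train, y_train):
        cols = top_variance_columns(X_train, k)
        return lambda X: X[:, cols]
    return fit


# Toy usage on synthetic data with three well-separated classes.
rng = np.random.default_rng(1)
y_toy = np.repeat([0, 1, 2], 10)
X_toy = rng.normal(size=(30, 5)) + 10.0 * y_toy[:, None]
error_rate = loocv_error(X_toy, y_toy, screening_reducer(k=3))
```

A dimension reduction method such as SIR would be plugged in through `fit_reducer`, and candidate values of d could be compared by their leave-one-out error in the same loop.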

References

[1] R. Dennis Cook. 1996. Graphics for regressions with a binary response. J. Amer. Statist. Assoc. 91, 435 (1996), 983--992.
[2] Amin Moghaddas Gholami et al. 2013. Global Proteome Analysis of the NCI-60 Cell Line Panel. Cell Reports 4, 3 (2013), 609--620.
[3] Ker-Chau Li. 1991. Sliced inverse regression for dimension reduction. J. Amer. Statist. Assoc. 86, 414 (1991), 316--327.
[4] Hongfang Liu, D'Andrade, et al. 2010. mRNA and microRNA expression profiles of the NCI-60 integrated with drug activities. Molecular Cancer Therapeutics 9, 5 (2010), 1080--1091.
[5] Jun Lu, Getz, et al. 2005. MicroRNA expression profiles classify human cancers. Nature 435, 7043 (2005), 834--838.
[6] Chen Meng et al. 2016. Dimension reduction techniques for the integrative analysis of multi-omics data. Briefings in Bioinformatics 17, 4 (2016), 628--641.

Cited By

  • (2019) Sliced inverse regression for integrative multi-omics data analysis. Statistical Applications in Genetics and Molecular Biology 18:1. DOI: 10.1515/sagmb-2018-0028. Online publication date: 26-Jan-2019

Published In

ACM-BCB '17: Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
August 2017
800 pages
ISBN: 9781450347228
DOI: 10.1145/3107411
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bioinformatics
  2. integrative genomic analysis
  3. sliced inverse regression
  4. sufficient dimension reduction
  5. supervised learning

Qualifiers

  • Poster

Conference

BCB '17

Acceptance Rates

ACM-BCB '17 Paper Acceptance Rate 42 of 132 submissions, 32%;
Overall Acceptance Rate 254 of 885 submissions, 29%

Article Metrics

  • Downloads (Last 12 months): 5
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 17 Feb 2025
