research-article

Integrative Feature Ranking by Applying Deep Learning on Multi Source Genomic Data

Authors:
Fariba Khoshghalbvash, University of Texas at Arlington, Arlington, TX, USA
Jean X. Gao, University of Texas at Arlington, Arlington, TX, USA

BCB '19: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, September 2019, Pages 207–216, https://doi.org/10.1145/3307339.3342139

Published: 04 September 2019

ABSTRACT

Extracting cancer-related information from genomic data, especially from multi-source datasets, has been a growing challenge in recent years. Identifying subtype-specific genomic markers can lead to sounder diagnosis and treatment. Although several algorithms have been proposed for feature extraction, to the best of our knowledge none of them considers between-modality relations to discover modular disease-associated biomarkers. In this paper, we present an integrative deep learning approach that identifies modular subtype-associated critical genes from three input modalities for better diagnosis of cancer subtypes. First, we train deep classifiers with different integration stages and different numbers of input modalities to predict cancer subtypes. Next, we use the optimized weight matrices of the best-performing classifier to extract interactive top-ranked features across all input modalities. Lastly, we compare these rankings with those of other feature scoring methods according to classification performance after feature extraction. Our results and analysis show that the modular candidate biomarkers can be useful for cancer subtype detection.
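
The weight-based ranking step can be illustrated with a minimal sketch. The abstract does not state how the weight matrices are combined, so the scheme below (chaining the absolute values of successive weight matrices and summing over outputs) is an assumption chosen for illustration, not the authors' method. The function name `rank_features`, the modality names, and the toy weights are all hypothetical.

```python
import numpy as np

def rank_features(weights, modality_names, modality_sizes):
    """Rank input features of a feed-forward classifier by weight magnitude.

    weights: list [W1, W2, ...]; W1 has shape (n_inputs, n_hidden1), and each
             later matrix maps one layer to the next.
    Returns (modality, index_within_modality, score) tuples, best first.
    Assumed scoring rule: product of absolute weight matrices, summed over outputs.
    """
    # Propagate absolute connection strengths from the inputs to the outputs.
    influence = np.abs(weights[0])
    for w in weights[1:]:
        influence = influence @ np.abs(w)
    scores = influence.sum(axis=1)  # one score per input feature

    # Map each row of the concatenated input back to its source modality.
    labels = [(name, i)
              for name, size in zip(modality_names, modality_sizes)
              for i in range(size)]
    order = np.argsort(-scores, kind="stable")  # descending by score
    return [(labels[j][0], labels[j][1], float(scores[j])) for j in order]

# Hypothetical weights standing in for a trained classifier whose input is the
# concatenation of three modalities with 2, 1, and 1 features respectively.
W1 = np.array([[0.1, -0.2],
               [2.0,  0.5],
               [0.0,  0.1],
               [-1.5, 1.0]])
W2 = np.array([[1.0, -1.0],
               [0.5,  0.5]])

ranking = rank_features([W1, W2],
                        ["modality_A", "modality_B", "modality_C"],
                        [2, 1, 1])
for modality, index, score in ranking:
    print(modality, index, round(score, 3))
```

Because the scores are computed over the concatenated input, features from different modalities compete in a single ranking, which is one simple way to read "top-ranked features across all input modalities".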

Published in
BCB '19: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
September 2019
716 pages
ISBN:9781450366663
DOI:10.1145/3307339
General Chairs: Xinghua (Mindy) Shi (Temple University, USA), Michael Buck (University of Buffalo, USA)
Program Chairs: Jian Ma (Carnegie Mellon University, USA), Pierangelo Veltri (University Magna Graecia of Catanzaro, Italy)
Copyright © 2019 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags
biomarker discovery
data integration
deep learning
feature ranking
genomic data
neural networks
Acceptance Rates
BCB '19 Paper Acceptance Rate: 42 of 157 submissions, 27%. Overall Acceptance Rate: 254 of 885 submissions, 29%.