Discovery of Cancer Subtypes Based on Stacked Autoencoder

Zhang, Bo; Cao, Rui-Fen; Wang, Jing; Zheng, Chun-Hou

doi:10.1007/978-3-030-60796-8_38

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12465)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

The discovery of cancer subtypes has become a research hotspot in bioinformatics. Clustering can divide a single cancer type into different subtypes, providing a basis and guidance for precision and personalized medicine and thereby improving treatment outcomes. Clustering on multi-omics data has been found to perform better than clustering on a single type of omics data. However, omics data are usually high-dimensional and noisy, which makes multi-omics clustering challenging. In this paper, we first use a stacked autoencoder neural network to reduce the dimensionality of multi-omics data and obtain a low-dimensional feature representation. We then construct a similarity matrix with a scaled exponential similarity kernel. Finally, we apply spectral clustering to obtain the clustering results. Experimental results on three datasets show that our method is more effective than traditional dimensionality reduction methods.
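The similarity-matrix step can be illustrated with a minimal Python sketch. It assumes the kernel takes the form commonly used in similarity network fusion, W(i, j) = exp(-d(i, j)² / (μ · ε_ij)) with ε_ij the average of the two samples' mean nearest-neighbour distances and d(i, j); the abstract does not state the exact formula, so this form is an assumption. The function names, the Euclidean distance, the neighbourhood size `k`, the scale `mu`, and the toy feature vectors are all hypothetical. The stacked autoencoder and the spectral clustering step are not shown; the input is assumed to be the low-dimensional features already produced by the encoder.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def scaled_exp_similarity(features, k=2, mu=0.5):
    """Return the n x n similarity matrix W for a list of feature vectors.

    W[i][j] = exp(-d(i, j)^2 / (mu * eps_ij)), where
    eps_ij = (mean_knn_dist(i) + mean_knn_dist(j) + d(i, j)) / 3.
    k and mu are illustrative defaults, not values taken from the paper.
    """
    n = len(features)
    dist = [[euclidean(features[i], features[j]) for j in range(n)]
            for i in range(n)]

    # Mean distance from each sample to its k nearest neighbours (self excluded).
    knn_mean = []
    for i in range(n):
        others = sorted(dist[i][j] for j in range(n) if j != i)
        knn_mean.append(sum(others[:k]) / k)

    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            eps = (knn_mean[i] + knn_mean[j] + dist[i][j]) / 3.0
            # eps can be zero only if d(i, j) is zero, so use full similarity there.
            W[i][j] = math.exp(-dist[i][j] ** 2 / (mu * eps)) if eps > 0 else 1.0
    return W


# Toy low-dimensional features: two tight pairs that lie far apart.
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
W = scaled_exp_similarity(features, k=1, mu=0.5)
```

In this toy example the two samples within each pair receive a similarity far higher than samples from different pairs, which is the block structure a spectral clustering step would then separate. A matrix like `W` could be passed as a precomputed affinity to a spectral clustering routine.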

Acknowledgments

This work was supported by grants from the National Natural Science Foundation of China (Nos. U19A2064, 61873001), the Key Project of Anhui Provincial Education Department (No. KJ2017ZD01), and the Natural Science Foundation of Anhui Province (No. 1808085QF209).

Author information

Authors and Affiliations

School of Computer Science and Technology, Anhui University, Hefei, China
Bo Zhang, Rui-Fen Cao, Jing Wang & Chun-Hou Zheng
School of Computer and Information Engineering, Fuyang Normal University, Fuyang, China
Jing Wang

Corresponding author

Correspondence to Chun-Hou Zheng.

Editor information

Editors and Affiliations

Machine Learning and Systems Biology, Tongji University, Shanghai, China
De-Shuang Huang
School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, North Wollongong, NSW, Australia
Prashan Premaratne

Cite this paper

Zhang, B., Cao, RF., Wang, J., Zheng, CH. (2020). Discovery of Cancer Subtypes Based on Stacked Autoencoder. In: Huang, DS., Premaratne, P. (eds) Intelligent Computing Methodologies. ICIC 2020. Lecture Notes in Computer Science, vol 12465. Springer, Cham. https://doi.org/10.1007/978-3-030-60796-8_38

DOI: https://doi.org/10.1007/978-3-030-60796-8_38
Published: 05 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60795-1
Online ISBN: 978-3-030-60796-8
eBook Packages: Computer Science, Computer Science (R0)
