research-article

Variational Autoencoder Based Network Embedding Algorithm For Protein Function Prediction

Authors:

MING YI

ICMLC '22: Proceedings of the 2022 14th International Conference on Machine Learning and Computing

Pages 191 - 196

https://doi.org/10.1145/3529836.3529922

Published: 21 June 2022

Abstract

The development of high-throughput technology has produced a large number of protein-protein interaction datasets, which provide an effective basis for inferring the functional annotation of proteins. However, making proper use of these datasets to extract effective low-dimensional feature representations of proteins for function prediction remains a challenge. Because of their architecture, most existing network integration methods for protein function prediction are limited in their ability to capture complex and highly non-linear network structure. We therefore propose deepVAE, a novel multi-network embedding method based on a deep variational autoencoder (VAE). It uses the VAE to extract low-dimensional features of proteins from multiple interaction network datasets and then trains an SVM classifier to predict protein function. In addition, we denoise the original networks before network embedding; we call the resulting method deepVAE-NE. Experiments on yeast and human protein-protein interaction datasets show that our methods outperform the four other advanced approaches we compare against and greatly improve the accuracy of function prediction.
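The abstract does not spell out the training objective, so the following is only a minimal sketch of the generic VAE ingredients that such an embedding method would typically rely on: the reparameterization trick for sampling a latent vector, and the negative evidence lower bound (reconstruction error plus KL divergence to a standard normal prior). The function names, the Bernoulli (binary cross-entropy) reconstruction term, and the use of one feature vector per protein are assumptions made for illustration; they are not taken from the paper, and the encoder, decoder, and SVM stages are omitted.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), element-wise.

    mu and log_var are the encoder outputs for one input (hypothetical
    names); log_var is the log of the latent variance.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def negative_elbo(x, x_recon, mu, log_var, eps=1e-7):
    """Reconstruction loss plus KL term for a single input vector.

    Assumes x has entries in [0, 1] and uses binary cross-entropy as
    the reconstruction term (an illustrative choice).
    """
    bce = 0.0
    for xi, ri in zip(x, x_recon):
        ri = min(max(ri, eps), 1.0 - eps)  # clamp to avoid log(0)
        bce -= xi * math.log(ri) + (1.0 - xi) * math.log(1.0 - ri)
    return bce + kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.2, -0.1], [0.0, -1.0]
    z = reparameterize(mu, log_var, rng)
    print("latent sample:", z)
    print("loss:", negative_elbo([1.0, 0.0], [0.9, 0.2], mu, log_var))
```

In a pipeline of the kind the abstract outlines, the mean vector `mu` produced by a trained encoder would usually serve as the low-dimensional protein feature passed to the classifier, though whether the paper uses the mean or a sampled latent vector is not stated here.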

Published In

ICMLC '22: Proceedings of the 2022 14th International Conference on Machine Learning and Computing

February 2022

570 pages

ISBN:9781450395700

DOI:10.1145/3529836

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

the Central Universities, China University of Geosciences (Wuhan) Grant CUGGC02; the National Natural Science Foundation of China [11675060 and 91730301 to Yi]; the Hubei Provincial Natural Science Foundation of China under Grant 2015CFA010; the Major Programs of National Statistical Science Research Foundation under Grant 2018LD02

Conference

ICMLC 2022: 2022 14th International Conference on Machine Learning and Computing

February 18 - 21, 2022

Guangzhou, China
