Abstract
Protein-protein interactions (PPIs) underlie many biological processes and cellular functions in all living organisms. Detecting PPIs helps in understanding the roles of proteins and their complex structure. Proteins are commonly represented by their amino acid sequences. PPI identification is divided into two steps. First, a feature vector is extracted from the protein representation. Then, a model is trained on these feature vectors to reveal novel interactions. With the availability of multimodal biomedical data and the successful adoption of deep-learning algorithms across bioinformatics, more relevant feature vectors can now be obtained, improving a model's ability to predict PPIs. The current work uses two modalities: tertiary structure information and sequence-based information. A deep learning-based model, ResNet50, extracts features from a 3D voxel representation of each protein. To obtain a compact feature vector from amino acid sequences, a stacked autoencoder and quasi-sequence-order (QSO) descriptors are used. QSO converts the symbolic representation of a protein (its amino acid sequence) into a numerical representation. After features are extracted from the different modalities, they are concatenated in pairs and fed into a bi-directional GRU-based classifier to predict PPIs. The proposed approach achieves an accuracy of 0.9829, the best accuracy obtained in 3-fold cross-validation on the human PPI dataset. The results indicate that the proposed approach outperforms existing computational methods, such as state-of-the-art stacked autoencoder-based classifiers.
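To make the sequence branch more concrete, the sketch below shows one common formulation of quasi-sequence-order descriptors and the pairwise concatenation step. It is an illustration under stated assumptions, not the authors' implementation. The function names, `max_lag=3`, and `weight=0.1` are hypothetical choices, and the residue distance is a placeholder built from Kyte-Doolittle hydropathy values; QSO is normally computed with a physicochemical distance matrix, which is not reproduced here. The stacked autoencoder, ResNet50 features, and bi-directional GRU classifier are outside the scope of this sketch.

```python
from collections import Counter

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Kyte-Doolittle hydropathy values. ASSUMPTION: used only to build a
# placeholder residue distance; it stands in for the physicochemical
# distance matrix that QSO descriptors normally rely on.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def placeholder_distance(a, b):
    """Hypothetical residue distance: absolute hydropathy difference."""
    return abs(HYDROPATHY[a] - HYDROPATHY[b])


def coupling_number(seq, lag, dist=placeholder_distance):
    """Sequence-order coupling number: sum of squared distances
    between residues that are `lag` positions apart."""
    return sum(dist(seq[i], seq[i + lag]) ** 2 for i in range(len(seq) - lag))


def qso_descriptor(seq, max_lag=3, weight=0.1, dist=placeholder_distance):
    """Return a (20 + max_lag)-dimensional quasi-sequence-order vector."""
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    if len(seq) <= max_lag:
        raise ValueError("sequence must be longer than max_lag")

    counts = Counter(seq)
    # Normalized occurrence of each of the 20 amino acid types.
    freqs = [counts[a] / len(seq) for a in AMINO_ACIDS]
    taus = [coupling_number(seq, d, dist) for d in range(1, max_lag + 1)]

    # Shared denominator, so the whole descriptor sums to 1.
    denom = sum(freqs) + weight * sum(taus)
    composition_part = [f / denom for f in freqs]
    order_part = [weight * t / denom for t in taus]
    return composition_part + order_part


def pair_features(seq_a, seq_b, **kwargs):
    """Concatenate the descriptors of two proteins into one pair vector."""
    return qso_descriptor(seq_a, **kwargs) + qso_descriptor(seq_b, **kwargs)


if __name__ == "__main__":
    vec = pair_features("MKTAYIAKQR", "GAVLIPFMWS")
    print(len(vec))
```

With these defaults each protein yields a 23-dimensional vector (20 composition terms plus 3 sequence-order terms), so a protein pair yields 46 values. In the pipeline described above, such vectors would be compacted and combined with structural features before classification.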
Acknowledgement
Dr. Sriparna Saha would like to acknowledge the support of the Science and Engineering Research Board (SERB) of the Department of Science and Technology, India (Grant/Award Number: ECR/2017/001915) in carrying out this research.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Jha, K., Saha, S., Khushi, M. (2020). Protein-Protein Interactions Prediction Based on Bi-directional Gated Recurrent Unit and Multimodal Representation. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Communications in Computer and Information Science, vol 1333. Springer, Cham. https://doi.org/10.1007/978-3-030-63823-8_20
DOI: https://doi.org/10.1007/978-3-030-63823-8_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63822-1
Online ISBN: 978-3-030-63823-8
eBook Packages: Computer Science, Computer Science (R0)