Abstract
Thermophilic proteins are widely used in food, medicine, tanning, and oil drilling. Analyzing a protein sequence yields information about the protein's higher-order structure and properties, which can be used to predict the protein type efficiently. In this paper, a voting algorithm was designed independently. Features were extracted from the protein sequences, and their dimensionality was then reduced. Predictions were made with WEKA, and the voting algorithm was then applied to the resulting predictions. The highest accuracy achieved in this experiment was 93.03%. This experiment has at least two advantages. First, the voting algorithm was developed independently. Second, no optimization method was used, which helps avoid over-fitting. Therefore, voting is a very effective strategy for predicting the thermal stability of proteins. The prediction data set used in this paper can be freely downloaded from http://lab.malab.cn/~lijing/thermo_data.html.
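The abstract does not spell out the voting rule, so the following is only a minimal sketch of one common choice, plain majority voting over the labels predicted by several classifiers. The function names `majority_vote` and `accuracy`, the labels "T" (thermophilic) and "M" (non-thermophilic), the tie-breaking rule, and the toy predictions are all assumptions made for illustration; they are not taken from the paper or from WEKA.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]], tie_break: str = "T") -> List[str]:
    """Combine per-classifier label lists into one label per sample.

    predictions[i][j] is the label classifier i assigned to sample j.
    tie_break is a hypothetical rule: the label returned when the top
    two labels receive the same number of votes.
    """
    if not predictions:
        raise ValueError("need predictions from at least one classifier")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every classifier must label the same samples")

    combined = []
    for votes in zip(*predictions):  # all votes cast for one sample
        ranked = Counter(votes).most_common()
        top_label, top_count = ranked[0]
        if len(ranked) > 1 and ranked[1][1] == top_count:
            combined.append(tie_break)  # tie between the leading labels
        else:
            combined.append(top_label)
    return combined


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    # Toy predictions from three hypothetical classifiers on four proteins.
    clf_a = ["T", "M", "T", "M"]
    clf_b = ["T", "T", "M", "M"]
    clf_c = ["M", "M", "T", "M"]
    truth = ["T", "M", "T", "T"]

    voted = majority_vote([clf_a, clf_b, clf_c])
    print(voted, accuracy(voted, truth))
```

Using an odd number of classifiers on a two-class problem avoids ties altogether, so the tie-breaking rule only matters when the ensemble size is even.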
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, J., Zhu, P., Zou, Q. (2019). Prediction of Thermophilic Proteins Using Voting Algorithm. In: Rojas, I., Valenzuela, O., Rojas, F., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2019. Lecture Notes in Computer Science, vol 11465. Springer, Cham. https://doi.org/10.1007/978-3-030-17938-0_18
DOI: https://doi.org/10.1007/978-3-030-17938-0_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-17937-3
Online ISBN: 978-3-030-17938-0
eBook Packages: Computer Science, Computer Science (R0)