Prediction of Thermophilic Proteins Using Voting Algorithm

  • Conference paper
Bioinformatics and Biomedical Engineering (IWBBIO 2019)

Part of the book series: Lecture Notes in Computer Science ((LNBI,volume 11465))

Abstract

Thermophilic proteins are widely used in food, medicine, tanning, and oil drilling. Analyzing a protein's sequence yields information about its higher-order structure and properties, which can be used to predict the protein class efficiently. In this paper, a voting algorithm was designed independently. Protein features were extracted and their dimensionality was reduced. Predictions were then generated with WEKA, and the voting algorithm was applied to the outputs of these steps. The highest accuracy achieved in this experiment was 93.03%. The experiment has at least two advantages: first, the voting algorithm was developed independently; second, no optimization method was used, which helps avoid over-fitting. Voting is therefore a very effective strategy for predicting the thermal stability of proteins. The prediction data set used in this paper can be freely downloaded from http://lab.malab.cn/~lijing/thermo_data.html.
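
The abstract does not spell out the voting rule, so the following is only a minimal sketch of one common choice, plain majority voting over the labels produced by several classifiers. The function names, the "T"/"M" labels (thermophilic/mesophilic), and the tie-breaking rule are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-classifier labels into one label per sample.

    predictions[i][j] is the label classifier i assigned to sample j.
    Assumption: ties go to the earliest-listed classifier's label.
    """
    if not predictions:
        return []
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every classifier must label the same number of samples")

    combined = []
    for votes in zip(*predictions):  # all votes cast for one sample
        counts = Counter(votes)
        top = max(counts.values())
        # first label, in classifier order, that reaches the top count
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    # Hypothetical outputs of three classifiers on three proteins.
    per_classifier = [
        ["T", "M", "T"],
        ["T", "T", "M"],
        ["M", "M", "M"],
    ]
    voted = majority_vote(per_classifier)
    print(voted)
    print(accuracy(voted, ["T", "M", "T"]))
```

In practice the per-classifier label lists would come from the predictions exported by WEKA for each feature set or model. Weighted or probability-based voting would be equally plausible readings of the abstract.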

Author information

Corresponding authors

Correspondence to Pengfei Zhu or Quan Zou.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, J., Zhu, P., Zou, Q. (2019). Prediction of Thermophilic Proteins Using Voting Algorithm. In: Rojas, I., Valenzuela, O., Rojas, F., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2019. Lecture Notes in Computer Science, vol 11465. Springer, Cham. https://doi.org/10.1007/978-3-030-17938-0_18

  • DOI: https://doi.org/10.1007/978-3-030-17938-0_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-17937-3

  • Online ISBN: 978-3-030-17938-0

  • eBook Packages: Computer Science, Computer Science (R0)
