RF_Bert: A Classification Model of Golgi Apparatus Based on TAPE_BERT Extraction Features

  • Conference paper
Intelligent Computing Theories and Application (ICIC 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12837)

Abstract

The Golgi apparatus is an important eukaryotic organelle. It plays a key role in protein synthesis in eukaryotic cells, and its dysfunction leads to various genetic and neurodegenerative diseases. Identifying the types of Golgi proteins is therefore a key step toward developing drugs for these diseases. Earlier work has typically derived features from the physicochemical properties of Golgi proteins, but accurate identification of sub-Golgi protein types remains difficult for existing methods. In this article, we use the TAPE_BERT model to extract features from Golgi proteins. Because the Golgi dataset is imbalanced, we apply the SMOTE oversampling method to obtain a balanced dataset. We then select the 300 most important feature dimensions to identify the types of Golgi proteins. The model reaches an accuracy of 90.6% in 10-fold cross-validation and 95.31% on an independent test set.
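The balancing step named in the abstract can be illustrated with a minimal, dependency-free sketch of SMOTE-style oversampling. This is not the authors' implementation: the function names `smote` and `balance`, the neighbour count `k`, and the toy data are assumptions made for illustration, and the sketch assumes exactly two classes with at least two samples in the smaller one. In practice a maintained library such as imbalanced-learn would normally be used instead.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration; needs at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [s for s in minority if s is not base]
        others.sort(key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balance(features, labels, k=5, seed=0):
    """Oversample the smaller of two classes until both classes have equal size."""
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    # Sort the two classes by size: smaller first, larger second.
    (small_label, small), (_, large) = sorted(
        by_class.items(), key=lambda kv: len(kv[1])
    )
    new = smote(small, len(large) - len(small), k=k, seed=seed)
    return features + new, labels + [small_label] * len(new)


# Toy usage: two minority vectors (label 1) and four majority vectors (label 0).
toy_x = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [6.0, 5.0], [5.0, 6.0], [6.0, 6.0]]
toy_y = [1, 1, 0, 0, 0, 0]
bal_x, bal_y = balance(toy_x, toy_y, k=1)
```

Each synthetic vector lies on the line segment between a real minority sample and one of its nearest minority neighbours, so the oversampled class stays inside the region already covered by real samples rather than being filled with exact duplicates.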

Acknowledgments

This work was supported in part by the University Innovation Team Project of Jinan (2019GXRC015), the Key Science & Technology Innovation Project of Shandong Province (2019JZZY010324), the Natural Science Foundation of China (No. 61902337), the “Qingtan Scholar” talent project of Zaozhuang University, the Natural Science Fund for Colleges and Universities in Jiangsu Province (No. 19KJB520016), the Jiangsu Provincial Natural Science Foundation (No. SBK2019040953), and Young Talents of Science and Technology in Jiangsu.

Author information

Corresponding author

Correspondence to Wenzheng Bao.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Cui, Q., Bao, W., Cao, Y., Yang, B., Chen, Y. (2021). RF_Bert: A Classification Model of Golgi Apparatus Based on TAPE_BERT Extraction Features. In: Huang, DS., Jo, KH., Li, J., Gribova, V., Hussain, A. (eds) Intelligent Computing Theories and Application. ICIC 2021. Lecture Notes in Computer Science, vol 12837. Springer, Cham. https://doi.org/10.1007/978-3-030-84529-2_59

  • DOI: https://doi.org/10.1007/978-3-030-84529-2_59

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-84528-5

  • Online ISBN: 978-3-030-84529-2

  • eBook Packages: Computer Science, Computer Science (R0)
