Abstract
Messenger ribonucleic acid (mRNA) vaccines are structurally unstable, which makes their production difficult. The sequence of an mRNA vaccine can indicate possible degradation sites. Recently, deep learning methods from areas such as natural language processing have shown great promise in understanding these sequences. Applying deep learning methods effectively requires an appropriate sequence-to-vector representation. In this paper, pre-trained dna2vec, rna2vec, and lshvec gene embeddings are compared to identify the best vector representation for predicting the amount of degradation from mRNA vaccine sequences. The comparison shows that the dna2vec embedding performs best.
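As a rough illustration of what a sequence-to-vector representation can look like, the sketch below splits an RNA string into overlapping k-mers and looks each one up in an embedding table. It is not the authors' pipeline: the function names, the choice of k = 3, the 4-dimensional vectors, and the randomly filled table (a stand-in for a pre-trained embedding such as dna2vec) are all assumptions made for the example.

```python
import random
from itertools import product


def kmers(seq, k=3):
    """Split a sequence into overlapping k-mers with stride 1."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def toy_embedding_table(k=3, dim=4, alphabet="ACGU", seed=0):
    """Hypothetical stand-in for a pre-trained table.

    Every possible k-mer gets a random vector; a real table would be
    loaded from a trained embedding model instead.
    """
    rng = random.Random(seed)
    return {
        "".join(p): [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for p in product(alphabet, repeat=k)
    }


def embed(seq, table, k=3):
    """Map each k-mer to its vector: one row per k-mer, one column per dimension."""
    return [table[token] for token in kmers(seq, k)]


seq = "GGAAAGCUCUAAUA"  # made-up example sequence, 14 nucleotides
tokens = kmers(seq)
matrix = embed(seq, toy_embedding_table())

print(tokens[:3])                   # ['GGA', 'GAA', 'AAA']
print(len(matrix), len(matrix[0]))  # 12 4
```

The resulting matrix of per-position vectors is the kind of input a sequence model could consume to predict a degradation value; swapping the table for a different pre-trained embedding changes only the lookup step.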
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Krishna, U.V., Premjith, B., Soman, K.P. (2022). A Comparative Study of Pre-trained Gene Embeddings for COVID-19 mRNA Vaccine Degradation Prediction. In: Giri, D., Raymond Choo, KK., Ponnusamy, S., Meng, W., Akleylek, S., Prasad Maity, S. (eds) Proceedings of the Seventh International Conference on Mathematics and Computing. Advances in Intelligent Systems and Computing, vol 1412. Springer, Singapore. https://doi.org/10.1007/978-981-16-6890-6_22
DOI: https://doi.org/10.1007/978-981-16-6890-6_22
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-6889-0
Online ISBN: 978-981-16-6890-6
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)