Abstract
Examining the genome sequences of the novel coronavirus (COVID-19) strains is critical to properly understanding this disease and how it functions. In bioinformatics, alignment-free (AF) sequence analysis methods offer a natural framework for investigating the patterns and inherent properties of biological sequences. AF methods are therefore used in this paper to analyze and compare COVID-19 genome sequences. First, frequent patterns of nucleotide base(s) in COVID-19 genome sequences are extracted. Second, the similarity/dissimilarity between COVID-19 genome sequences is measured with different AF methods. This makes it possible to compare sequences and to evaluate the performance of the various distance measures employed in AF methods. Lastly, the phylogeny of the COVID-19 genome sequences is constructed with various AF methods, along with a consensus tree that shows the level of support (agreement) among the phylogenetic trees built by these methods. The results show that AF methods can be used efficiently for the analysis of COVID-19 genome sequences.
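To make the idea of an AF distance measure concrete, the following minimal sketch compares sequences by their k-mer (word) frequency profiles and takes the Euclidean distance between those profiles. It is an illustration only: the abstract does not name the specific measures used in the paper, so the choice of Euclidean distance, the value k = 3, the function names, and the toy sequences are all assumptions, not the authors' implementation or data.

```python
from collections import Counter
from itertools import product
import math


def kmer_counts(seq, k):
    """Count overlapping k-mers (words of length k) in a nucleotide sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_frequencies(seq, k):
    """Return relative k-mer frequencies, ordered over all 4**k ACGT words."""
    counts = kmer_counts(seq, k)
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in words)
    return [counts[w] / total if total else 0.0 for w in words]


def euclidean_distance(seq_a, seq_b, k=3):
    """Euclidean distance between two k-mer frequency profiles (no alignment)."""
    fa = kmer_frequencies(seq_a, k)
    fb = kmer_frequencies(seq_b, k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))


# Toy sequences (hypothetical, NOT real genome data).
toy = {
    "s1": "ATGCGATACGCTTGA",
    "s2": "ATGCGATACGCTTGC",
    "s3": "TTTTTTAAAAAACCC",
}

# Print the pairwise distance matrix, one row per sequence.
names = sorted(toy)
for a in names:
    print(a, [round(euclidean_distance(toy[a], toy[b]), 3) for b in names])
```

A pairwise distance matrix of this kind is the usual input to distance-based tree-building methods, which is how a word-based measure can feed into phylogeny construction.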
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Nawaz, M.S., Fournier-Viger, P., Niu, X., Wu, Y., Lin, J.CW. (2021). COVID-19 Genome Analysis Using Alignment-Free Methods. In: Fujita, H., Selamat, A., Lin, J.CW., Ali, M. (eds) Advances and Trends in Artificial Intelligence. Artificial Intelligence Practices. IEA/AIE 2021. Lecture Notes in Computer Science, vol 12798. Springer, Cham. https://doi.org/10.1007/978-3-030-79457-6_28
DOI: https://doi.org/10.1007/978-3-030-79457-6_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79456-9
Online ISBN: 978-3-030-79457-6
eBook Packages: Computer Science (R0)