
Unsupervised Anomaly Detection Algorithms Unveil Relevant Temporal and Spatial Patterns in the SARS COV2 Codon Usage in México

  • Conference paper
Advances in Soft Computing (MICAI 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 15247)

Abstract

Genomes are complex biological structures that encode information that can be interpreted at several levels, such as genes and proteins. Identifying relevant patterns in genomes is of paramount importance, as they may indicate states of biological or medical relevance. Among the patterns that can be detected, anomalies are especially relevant. Anomalies are instances that, under a given metric, do not resemble the rest of the observations under study. Detecting them matters because their presence may indicate a systematic error at some stage of the analyzed process or structure, or that the studied system or phenomenon is undergoing a phase transition or another relevant drift in its dynamics. Here, we applied unsupervised anomaly detection algorithms to the codon usage of thousands of SARS-CoV-2 genomes isolated in Mexico. Codon usage summarizes the relative frequency of nucleotide triplets, or codons, which code for amino acids, the building blocks of proteins. By applying several algorithms, we detected patterns of epidemiological relevance: genomes that are anomalous with respect to their codon usage. These anomalous patterns are relevant not only because they have not previously been detected in data from Mexico, but also because they allow identification of one of the possible sources of the anomalies. Most of the anomalies were identified in two neighboring Mexican states, Puebla and Tlaxcala. In addition, almost all anomalies come from subjects who were treated in the same laboratory. Based on this evidence, we conclude that anomaly detection algorithms are a relevant tool for the surveillance of epidemics.
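To make the two central notions of the abstract concrete, the following is a minimal sketch, not the authors' pipeline. It computes a codon usage vector (the relative frequency of each of the 64 codons) for a coding sequence, then assigns each genome a simple anomaly score. The score used here, the Euclidean distance from the component-wise median, is an illustrative stand-in chosen for brevity; the function names `codon_usage` and `anomaly_scores` and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product
from statistics import median

# All 64 codons in a fixed order, so every genome maps to a vector of equal length.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_usage(seq):
    """Relative frequency of each codon in an in-frame coding sequence."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    triplets = (seq[i:i + 3] for i in range(0, usable, 3))
    # Triplets with ambiguous bases (for example N) are skipped.
    counts = Counter(t for t in triplets if t in CODON_SET)
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CODONS]


def anomaly_scores(vectors):
    """Illustrative score: distance of each vector from the component-wise median."""
    center = [median(column) for column in zip(*vectors)]
    return [
        sum((x - m) ** 2 for x, m in zip(vector, center)) ** 0.5
        for vector in vectors
    ]


# Toy sequences (hypothetical, not real SARS-CoV-2 data).
sequences = ["ATGGCTGCTAAA", "ATGGCTGCTAAG", "ATGGCTGCCAAA", "ATGCGGCGGCGG"]
scores = anomaly_scores([codon_usage(s) for s in sequences])
most_anomalous = scores.index(max(scores))
```

In this toy input, the last sequence uses a codon (CGG) that the others never use, so it receives the largest score. With real data, the same usage vectors could be passed to any unsupervised detector.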

Acknowledgments

AN thanks PAPIIT under project TA101323 for financial support. SM and BS received CONAHCYT scholarships for their postgraduate studies.

Author information

Corresponding author

Correspondence to Antonio Neme.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Martínez, S., Salas, B., Pérez, N., Neme, A. (2025). Unsupervised Anomaly Detection Algorithms Unveil Relevant Temporal and Spatial Patterns in the SARS COV2 Codon Usage in México. In: Martínez-Villaseñor, L., Ochoa-Ruiz, G. (eds) Advances in Soft Computing. MICAI 2024. Lecture Notes in Computer Science, vol 15247. Springer, Cham. https://doi.org/10.1007/978-3-031-75543-9_3

  • DOI: https://doi.org/10.1007/978-3-031-75543-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-75542-2

  • Online ISBN: 978-3-031-75543-9

  • eBook Packages: Computer Science, Computer Science (R0)
