Convolutional Neural Networks for Biological Sequence Taxonomic Classification: A Comparative Study

  • Conference paper
Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2019 (AISI 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1058)

Abstract

Biological sequence classification is a key task in bioinformatics. For research labs today, classifying unknown biological sequences is essential for identifying, grouping and studying organisms and their evolution. This paper compares three of the most recent deep learning works on the 16S rRNA barcode dataset for taxonomic classification. Three different CNN architectures are compared in combination with three different feature representations, namely k-mer spectral representation, Frequency Chaos Game Representation (FCGR) and character-level integer encoding. Experimental results and comparisons show that representations that retain positional information about the nucleotides in a sequence perform much better, with accuracies reaching 91.6% on the most fine-grained classification task.
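
The three feature representations named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the default k, the integer codes, the zero padding, and the FCGR corner assignment and bit order are all assumptions chosen for illustration, and the CNN architectures themselves are not shown.

```python
from itertools import product

import numpy as np

ALPHABET = "ACGT"


def kmer_spectrum(seq, k=3):
    """Return a vector of counts for every possible k-mer over A, C, G, T."""
    index = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}
    counts = np.zeros(len(index), dtype=np.int64)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing ambiguous bases such as N
            counts[j] += 1
    return counts


def fcgr(seq, k=3):
    """Return k-mer counts arranged on a 2^k x 2^k grid (FCGR-style image).

    Each base picks one quadrant per level. The corner assignment and the
    choice of the first base as the coarsest level are assumed conventions.
    """
    corner = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}
    size = 2 ** k
    grid = np.zeros((size, size), dtype=np.int64)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in corner for base in kmer):
            continue  # skip k-mers containing ambiguous bases
        row = col = 0
        for base in kmer:
            r, c = corner[base]
            row = 2 * row + r
            col = 2 * col + c
        grid[row, col] += 1
    return grid


def integer_encode(seq, length=None):
    """Map each character to an integer code, keeping the original order.

    Unknown characters map to 0; if length is given, the result is
    truncated or right-padded with 0 to exactly that length.
    """
    codes = {"A": 1, "C": 2, "G": 3, "T": 4}
    encoded = [codes.get(base, 0) for base in seq]
    if length is not None:
        encoded = (encoded + [0] * length)[:length]
    return np.array(encoded, dtype=np.int64)
```

For a sequence such as `"ACGTACGT"`, `kmer_spectrum(seq, k=2)` gives a length-16 count vector, `fcgr(seq, k=2)` gives a 4 x 4 count image suitable as 2D CNN input, and `integer_encode(seq)` gives one integer per nucleotide in sequence order.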

Corresponding author

Correspondence to Sherine Rady.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Helaly, M.A., Rady, S., Aref, M.M. (2020). Convolutional Neural Networks for Biological Sequence Taxonomic Classification: A Comparative Study. In: Hassanien, A., Shaalan, K., Tolba, M. (eds) Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2019. AISI 2019. Advances in Intelligent Systems and Computing, vol 1058. Springer, Cham. https://doi.org/10.1007/978-3-030-31129-2_48