SpliceSCANNER: An Accurate and Interpretable Deep Learning-Based Method for Splice Site Prediction

  • Conference paper
Advanced Intelligent Computing Technology and Applications (ICIC 2023)
  • The original version of this chapter was revised: the link in the abstract section is not valid anymore. This has been corrected. The correction to this chapter is available at https://doi.org/10.1007/978-981-99-4749-2_69

Abstract

The identification of splice sites is essential for delineating gene structure and for understanding the complicated alternative mechanisms underlying gene transcriptional regulation. Currently, most existing approaches predict splice sites using deep learning-based strategies. However, they may fail to assign high weights to the important segments of a sequence and thus miss distinctive features. Moreover, they often apply the neural network only as a ‘black box’, drawing criticism for the scarce reasoning behind their decisions. To address these issues, we present a novel method, SpliceSCANNER, which predicts canonical splice sites by integrating an attention mechanism with a convolutional neural network (CNN). Furthermore, we adopt gradient-weighted class activation mapping (Grad-CAM) to interpret the results derived from the models. We trained ten models, covering donor and acceptor sites for each of five species. Experiments demonstrate that SpliceSCANNER outperforms state-of-the-art methods on most of the datasets. On human data, for instance, it achieves accuracies of 96.36% and 95.77% for donor and acceptor sites, respectively. Finally, cross-organism validation shows that it generalizes well, indicating a strong ability to annotate canonical splice sites in poorly studied species. We anticipate that it can mine potential splicing patterns and bring new advancements to the bioinformatics community. SpliceSCANNER is freely available as a web server at http://www.bioinfo-zhanglab.com/SpliceSCANNER/.
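To make the interpretation step concrete, the following is a minimal sketch of how a Grad-CAM heatmap can be computed over a one-dimensional sequence. It is not the authors' implementation: the function name `grad_cam_1d`, the toy feature maps, and the toy gradients are all assumptions chosen for illustration, and in practice the gradients would come from backpropagation through a trained CNN. It follows the general Grad-CAM recipe: average each channel's gradient over positions to obtain a channel weight, take the weighted sum of the feature maps, and apply a ReLU.

```python
from typing import List


def grad_cam_1d(activations: List[List[float]],
                gradients: List[List[float]]) -> List[float]:
    """Return a Grad-CAM relevance score for every sequence position.

    activations[k][i]: value of feature map k at position i (hypothetical input)
    gradients[k][i]:   d(class score) / d(activations[k][i]) (hypothetical input)
    """
    n_pos = len(activations[0])
    # Channel weight: gradient of the class score averaged over all positions.
    alphas = [sum(grad_row) / n_pos for grad_row in gradients]
    # Weighted sum of feature maps per position, followed by a ReLU.
    cam = [
        max(0.0, sum(alpha * fmap[i] for alpha, fmap in zip(alphas, activations)))
        for i in range(n_pos)
    ]
    # Scale to [0, 1] so heatmaps are comparable across sequences.
    peak = max(cam)
    return [value / peak for value in cam] if peak > 0 else cam


# Toy example: 2 feature maps over 4 sequence positions (illustrative values only).
toy_activations = [[0.0, 1.0, 2.0, 0.0],
                   [1.0, 0.0, 0.0, 1.0]]
toy_gradients = [[0.5, 0.5, 0.5, 0.5],
                 [-0.5, -0.5, -0.5, -0.5]]
heatmap = grad_cam_1d(toy_activations, toy_gradients)
print(heatmap)
```

In this toy case the third position receives the highest relevance because the positively weighted feature map is most active there, while positions dominated by the negatively weighted map are clipped to zero by the ReLU.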

Change history

  • 02 September 2023

    A correction has been published.

Acknowledgements

This work is supported by the Guangxi Key Laboratory of Image and Graphic Intelligent Processing (GIIP2004), the National Natural Science Foundation of China (61862017), and the Innovation Project of GUET (Guilin University of Electronic Technology) Graduate Education (2022YCXS063).

Author information

Corresponding author

Correspondence to Yanju Zhang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wang, R., Xu, J., Huang, X., Qi, W., Zhang, Y. (2023). SpliceSCANNER: An Accurate and Interpretable Deep Learning-Based Method for Splice Site Prediction. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14088. Springer, Singapore. https://doi.org/10.1007/978-981-99-4749-2_38

  • DOI: https://doi.org/10.1007/978-981-99-4749-2_38

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4748-5

  • Online ISBN: 978-981-99-4749-2

  • eBook Packages: Computer Science, Computer Science (R0)
