IGUANER - DIfferential Gene Expression and fUnctionAl aNalyzER

  • Conference paper
Big Data Analytics in Astronomy, Science, and Engineering (BDA 2023)

Abstract

In the past fifteen years, the advent of Next-Generation Sequencing technologies, characterized by high efficiency and reduced costs, has marked a turning point for research across fields including molecular biology, genetics, and molecular medicine. Projects that previously required extensive timeframes and significant investment can now be completed swiftly at a fraction of the cost. A direct consequence of the proliferation of these systems is the exponential increase in data generated by RNA-Seq experiments. Much of this data originates from biological samples (cells, tissues, mucus, etc.) of organisms with absent or incomplete genomic annotations. Compounding this issue, the surge in data has not been matched by the development of adequate software tools capable of analyzing RNA-Seq data for such organisms. Currently available tools have several limitations: a) they operate in silos, each supporting only certain types of analysis, which complicates the biological interpretation of results; b) they are often executable only via Web interfaces, forgoing the parallelism and efficiency offered by modern supercomputers; c) functional analysis tools rely on outdated functional annotations or support only a limited set of organisms with genomic annotation; d) only one comparison (between two experimental conditions) can be tested per run. To overcome these limitations, we present IGUANER (DIfferential Gene expression and fUnctionAl aNalyzER), a software tool designed to provide integrated and up-to-date analysis of RNA-Seq data from any organism, regardless of its level of genomic annotation.

Notes

  1. https://www.mongodb.com/

Acknowledgments

Part of this research is based on the Cooperative Research Project at the Research Center for Biomedical Engineering CRP-BE-2057.

Author information

Corresponding author

Correspondence to Paolo Bottoni.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Pinna, V., Di Martino, J., Liberati, F., Bottoni, P., Castrignanò, T. (2024). IGUANER - DIfferential Gene Expression and fUnctionAl aNalyzER. In: Sachdeva, S., Watanobe, Y. (eds) Big Data Analytics in Astronomy, Science, and Engineering. BDA 2023. Lecture Notes in Computer Science, vol 14516. Springer, Cham. https://doi.org/10.1007/978-3-031-58502-9_5

  • DOI: https://doi.org/10.1007/978-3-031-58502-9_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-58501-2

  • Online ISBN: 978-3-031-58502-9

  • eBook Packages: Computer Science (R0)