It’s DE-licious: A Recipe for Differential Expression Analyses of RNA-seq Experiments Using Quasi-Likelihood Methods in edgeR

  • Protocol
Statistical Genomics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1418))

Abstract

RNA sequencing (RNA-seq) is widely used to profile transcriptional activity in biological systems. Here we present an analysis pipeline for differential expression analysis of RNA-seq experiments using the Rsubread and edgeR software packages. The basic pipeline includes read alignment and counting, filtering and normalization, modelling of biological variability and hypothesis testing. For hypothesis testing, we describe particularly the quasi-likelihood features of edgeR. Some more advanced downstream analysis steps are also covered, including complex comparisons, gene ontology enrichment analyses and gene set testing. The code required to run each step is described, along with an outline of the underlying theory. The chapter includes a case study in which the pipeline is used to study the expression profiles of mammary gland cells in virgin, pregnant and lactating mice.
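
To make the filtering step of such a pipeline concrete, the following is a minimal sketch in plain Python of filtering genes by counts per million (CPM). It is not edgeR or Rsubread code and is not taken from the chapter: the function names, the toy count table, and the thresholds (`min_cpm`, `min_samples`) are all hypothetical and chosen only for illustration.

```python
# Illustrative only: plain-Python counts-per-million (CPM) filtering.
# Gene names, counts, and thresholds are made up for this example.

def counts_per_million(counts):
    """Convert a gene -> per-sample count table into CPM values."""
    n_samples = len(next(iter(counts.values())))
    # Library size = total count in each sample (column sum).
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [1e6 * c / lib_sizes[j] for j, c in enumerate(row)]
        for gene, row in counts.items()
    }

def filter_by_cpm(counts, min_cpm, min_samples):
    """Keep genes whose CPM exceeds min_cpm in at least min_samples samples."""
    cpm = counts_per_million(counts)
    return {
        gene: row
        for gene, row in counts.items()
        if sum(v > min_cpm for v in cpm[gene]) >= min_samples
    }

# Toy table: three genes measured in four samples.
counts = {
    "geneA": [500, 450, 700, 650],
    "geneB": [0, 1, 0, 0],
    "geneC": [499500, 549549, 299300, 349350],
}

kept = filter_by_cpm(counts, min_cpm=10.0, min_samples=2)
print(sorted(kept))
```

Expressing the threshold in CPM rather than raw counts keeps the filter comparable across samples sequenced to different depths. In practice the cutoff and the minimum number of samples would depend on the library sizes and the experimental design.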

Acknowledgements

This work was funded by the University of Melbourne (Elizabeth and Vernon Puzey Scholarship to Aaron T.L. Lun), by the National Health and Medical Research Council (NHMRC) (Fellowship 1058892 and Program 1054618 to Gordon K. Smyth), by the NHMRC Independent Research Institutes Infrastructure Support (IRIIS) Scheme, and by a Victorian State Government Operational Infrastructure Support (OIS) Grant.

Copyright information

© 2016 Springer Science+Business Media New York

About this protocol

Cite this protocol

Lun, A.T.L., Chen, Y., Smyth, G.K. (2016). It’s DE-licious: A Recipe for Differential Expression Analyses of RNA-seq Experiments Using Quasi-Likelihood Methods in edgeR. In: Mathé, E., Davis, S. (eds) Statistical Genomics. Methods in Molecular Biology, vol 1418. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3578-9_19

  • DOI: https://doi.org/10.1007/978-1-4939-3578-9_19

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-3576-5

  • Online ISBN: 978-1-4939-3578-9

  • eBook Packages: Springer Protocols