NGS-QC Generator: A Quality Control System for ChIP-Seq and Related Deep Sequencing-Generated Datasets

  • Protocol
Statistical Genomics

Abstract

The combination of massively parallel sequencing with a variety of modern DNA/RNA enrichment technologies provides the means to interrogate functional protein–genome interactions (ChIP-seq), genome-wide transcriptional activity (RNA-seq; GRO-seq), chromatin accessibility (DNase-seq, FAIRE-seq, MNase-seq), and, more recently, the three-dimensional organization of chromatin (Hi-C, ChIA-PET). In systems biology-based approaches, several of these readouts are generally combined with the aim of describing living systems through a reconstitution of the genome-regulatory functions. However, an often underestimated issue is that conclusions drawn from such multidimensional analyses of NGS-derived datasets critically depend on the quality of the compared datasets. To address this problem, we have developed the NGS-QC Generator, a quality control system that infers quality descriptors for any kind of ChIP-sequencing and related datasets. In this chapter we provide a detailed protocol for (1) assessing quality descriptors with the NGS-QC Generator; (2) interpreting the generated reports; and (3) exploring the database of QC indicators (www.ngs-qc.org) for >21,000 publicly available datasets.

Acknowledgements

This work was supported by funds from SATT/Conectus, the Fondation pour la Recherche Médicale (FRM), the Alliance Nationale pour les Sciences de la Vie et de la Santé–Institut Thématique Multi-organismes Cancer–Institut National du Cancer (INCa) grant “Epigenomics of breast cancer” and “EpiPCa,” and the Ligue Nationale Contre le Cancer (to H.G.; Equipe Labellisée).

Author information

Corresponding author

Correspondence to Marco Antonio Mendoza-Parra.

Copyright information

© 2016 Springer Science+Business Media New York

About this protocol

Cite this protocol

Mendoza-Parra, M.A., Saleem, MA.M., Blum, M., Cholley, PE., Gronemeyer, H. (2016). NGS-QC Generator: A Quality Control System for ChIP-Seq and Related Deep Sequencing-Generated Datasets. In: Mathé, E., Davis, S. (eds) Statistical Genomics. Methods in Molecular Biology, vol 1418. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3578-9_13

  • DOI: https://doi.org/10.1007/978-1-4939-3578-9_13

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-3576-5

  • Online ISBN: 978-1-4939-3578-9

  • eBook Packages: Springer Protocols
