The Poisson Margin Test for Normalisation Free Significance Analysis of NGS Data

Kowalczyk, Adam; Bedo, Justin; Conway, Thomas; Beresford-Smith, Bryan

doi:10.1007/978-3-642-12683-3_19

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 6044)

Included in the following conference series:

Annual International Conference on Research in Computational Molecular Biology

Abstract

Motivation: The current methods for the determination of the statistical significance of peaks and regions in NGS data require an explicit normalisation step to compensate for (global or local) imbalances in the sizes of sequenced and mapped libraries. There are no canonical methods for performing such compensations, hence a number of different procedures serving this goal in different ways can be found in the literature. Unfortunately, the normalisation has a significant impact on the final results. Different methods yield very different numbers of detected “significant peaks” even in the simplest scenario of ChIP-Seq experiments which compare the enrichment in a single sample relative to a matching control. This becomes an even more acute issue in the more general case of the comparison of multiple samples, where a number of arbitrary design choices will be required in the data analysis stage, each option resulting in possibly (significantly) different outcomes.

Results: In this paper we investigate a principled statistical procedure which eliminates the need for a normalisation step. We outline its basic properties, in particular the scaling upon depth of sequencing. For the sake of illustration and comparison we report the results of re-analysing a ChIP-Seq experiment for transcription factor binding site detection. In order to quantify the differences between outcomes we use a novel method based on the accuracy of in silico prediction by SVM-models trained on part of the genome and tested on the remainder.

Availability: The supplementary material is available at [1].

References

1. Kowalczyk, A., Bedo, J., Conway, T., Beresford-Smith, B.: Poisson Margin Test for Normalisation Free Significance Analysis of NGS Data - Supplementary Materials (2009), http://www.genomics.csse.unimelb.edu.au/peakfiltsup
2. Rozowsky, J., Euskirchen, G., Auerbach, R., Zhang, Z., Gibson, T., Bjornson, R., Carriero, N., Snyder, M., Gerstein, M.: PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls. Nature Biotechnology 27, 66–75 (2009)
3. Nix, D., Courdy, S., Boucher, K.: Empirical methods for controlling false positives and estimating confidence in ChIP-Seq peaks. BMC Bioinformatics 9, 523 (2008)
4. Robertson, G., Hirst, M., Bainbridge, M., Bilenky, M., Zhao, Y., Zeng, T., Euskirchen, G., Bernier, B., Varhol, R., Delaney, A., et al.: Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nature Methods 4, 651–657 (2007)
5. Kowalczyk, A.: Some Formal Results for Significance of Short Read Concentrations (2009), http://www.genomics.csse.unimelb.edu.au/shortreadtheory
6. Baggerly, K.A., Deng, L., Morris, J.S., Aldaz, C.M.: Differential expression in SAGE: accounting for normal between-library variation. Bioinformatics 19, 1477–1483 (2003)
7. Robinson, M., Smyth, G.: Moderated statistical tests for assessing differences in tag abundance. Bioinformatics 23(21), 2881–2887 (2007)
8. Bloushtain-Qimron, N., Yao, J., Snyder, E.: Cell type-specific DNA methylation patterns in the human breast. PNAS 105, 14076–14081 (2008)
9. Zang, C., Schones, D.E., Zeng, C., Cui, K., Zhao, K., Peng, W.: A clustering approach for identification of enriched domains from histone modification ChIP-Seq data. Bioinformatics 25, 1952–1958 (2009)
10. Keeping, E.: Introduction to Statistical Inference. Dover, New York (1995) ISBN 0-486-68502-0; Reprint of 1962 edition by D. Van Nostrand Co., Princeton, New Jersey
11. Zhang, Y., Liu, T., Meyer, C., Eeckhoute, J., Johnson, D., Bernstein, B., Nussbaum, C., Myers, R., Brown, M., Li, W., Liu, X.S.: Model-based analysis of ChIP-Seq (MACS). Genome Biology 9(9), R137 (2008)
12. Ji, H., Jiang, H., Ma, W., Johnson, D., Myers, R., Wong, W.: An integrated software system for analyzing ChIP-chip and ChIP-seq data. Nature Biotechnology 26, 1293–1300 (2008)
13. Sonnenburg, S., Zien, A., Rätsch, G.: ARTS: accurate recognition of transcription starts in human. Bioinformatics 22, e423–e480 (2006)
14. Abeel, T., Van de Peer, Y., Saeys, Y.: Toward a gold standard for promoter prediction evaluation. Bioinformatics 25, i313–i320 (2009)
15. Bedo, J., MacIntyre, G., Haviv, I., Kowalczyk, A.: Simple SVM based whole-genome segmentation (2009). Available from Nature Precedings, http://dx.doi.org/10.1038/npre.2009.3811.1

Author information

Authors and Affiliations

Victoria Research Laboratory, NICTA,
Adam Kowalczyk, Justin Bedo, Thomas Conway & Bryan Beresford-Smith
Department of Electrical and Electronic Engineering,
Adam Kowalczyk, Justin Bedo & Bryan Beresford-Smith
Department of Computer Science and Software Engineering, The University of Melbourne, Parkville, VIC, 3010, Australia
Thomas Conway

Editor information

Editors and Affiliations

Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, 02139, Cambridge, MA, USA
Bonnie Berger

About this paper

Cite this paper

Kowalczyk, A., Bedo, J., Conway, T., Beresford-Smith, B. (2010). The Poisson Margin Test for Normalisation Free Significance Analysis of NGS Data. In: Berger, B. (ed.) Research in Computational Molecular Biology. RECOMB 2010. Lecture Notes in Computer Science, vol 6044. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12683-3_19

DOI: https://doi.org/10.1007/978-3-642-12683-3_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12682-6
Online ISBN: 978-3-642-12683-3
eBook Packages: Computer Science, Computer Science (R0)
