Abstract
Polyadenylation is an essential post-transcriptional processing step in the maturation of eukaryotic mRNA. The coming flood of next-generation sequencing (NGS) data creates new opportunities for the intensive study of polyadenylation. We present an automated workflow called PATMAP that identifies polyadenylation sites (poly(A) sites) by integrating NGS data cleaning, processing, mapping, normalization and clustering. First, ambiguous regions were introduced to parse the genome annotation. Then a series of Perl scripts was seamlessly integrated to iteratively map the single-end or paired-end sequences to the reference genome. After mapping, the poly(A) tags (PATs) at the same coordinate were grouped into one cleavage site, and internal priming artifacts were removed. Finally, the cleavage sites from different samples were normalized by an MA-based method and clustered into poly(A) clusters (PACs) by an empirical Bayesian method. The effectiveness of PATMAP was demonstrated by identifying thousands of reliable PACs from millions of NGS sequences in Arabidopsis and yeast.
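The grouping and filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not the PATMAP implementation (which the abstract says is a set of Perl scripts); the function names, the input layout and the internal priming rule (at least 6 A's in the 10 genomic bases downstream of a site) are all assumptions made here for illustration only.

```python
from collections import Counter


def group_pats(pats):
    """Collapse PATs that share (chromosome, strand, coordinate) into
    cleavage sites, keeping the number of supporting tags per site."""
    return Counter(pats)


def is_internal_priming(downstream, min_as=6, window=10):
    """Hypothetical rule (an assumption, not taken from the paper): flag a
    site when the first `window` genomic bases downstream of it contain at
    least `min_as` adenines."""
    return downstream[:window].upper().count("A") >= min_as


def filter_sites(sites, downstream_of):
    """Keep only the cleavage sites that are not flagged as internal priming."""
    return {
        site: count
        for site, count in sites.items()
        if not is_internal_priming(downstream_of[site])
    }


# Toy input: mapped PATs as (chromosome, strand, coordinate) tuples.
pats = [
    ("chr1", "+", 100),
    ("chr1", "+", 100),
    ("chr1", "+", 250),
    ("chr2", "-", 40),
]

# Toy genomic sequence immediately downstream of each candidate site.
downstream_of = {
    ("chr1", "+", 100): "TTGCATGCAT",
    ("chr1", "+", 250): "AAAAAAGAAA",  # A-rich, treated as internal priming
    ("chr2", "-", 40): "GCGTTACGTA",
}

sites = group_pats(pats)
kept = filter_sites(sites, downstream_of)
print(kept)
```

In this toy input the two tags at chr1:100 collapse into one site supported by two PATs, and the site at chr1:250 is discarded because its downstream sequence is A-rich. The normalization and clustering steps are left out because the abstract does not give enough detail to sketch them faithfully.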
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wu, X., Tang, M., Yao, J., Lin, S., Xiang, Z., Ji, G. (2012). PATMAP: Polyadenylation Site Identification from Next-Generation Sequencing Data. In: Corchado, E., Snášel, V., Abraham, A., Woźniak, M., Graña, M., Cho, SB. (eds) Hybrid Artificial Intelligent Systems. HAIS 2012. Lecture Notes in Computer Science, vol 7208. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28942-2_44
DOI: https://doi.org/10.1007/978-3-642-28942-2_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28941-5
Online ISBN: 978-3-642-28942-2
eBook Packages: Computer Science, Computer Science (R0)