Characterization and prediction of mRNA polyadenylation sites in human genes

  • Original Article
  • Medical & Biological Engineering & Computing

Abstract

The accurate identification of potential poly(A) sites has contributed to many studies of alternative polyadenylation. The aim of this study was to develop a machine-learning methodology that helps to discriminate real polyadenylation signals from randomly occurring signals in genomic sequences. Since previous studies have revealed that RNA secondary structure has a significant impact on polyadenylation in certain genes, the authors tried to computationally pinpoint common structural patterns around poly(A) sites and to investigate how RNA secondary structure may influence polyadenylation. An initial study of the impact of RNA structure, carried out with motif search tools, found that hairpin structures might be important. Thus, it was proposed that, in addition to the sequence pattern around poly(A) sites, there exists a widespread structural pattern that is also employed during human mRNA polyadenylation. In this study, the authors present a computational model that uses support vector machines to predict human poly(A) sites. The results show that the performance of this predictive model is comparable to that of the current prediction tool. In addition, common structural patterns associated with polyadenylation were identified using several motif-finding programs, which provides new insight into the role that RNA secondary structure plays in polyadenylation.
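
The abstract does not list the features used by the model, so the following is only a minimal sketch of the kind of input such a classifier could take. It assumes DNA-alphabet sequences, the canonical AATAAA hexamer plus the common ATTAAA variant as candidate signals, dinucleotide frequencies in 20-nt flanks as sequence features, and a crude perfect-stem hairpin flag as a stand-in for a structural feature. All function names, window sizes, and thresholds are hypothetical and are not taken from the article.

```python
from itertools import product

# Assumption: canonical signal and its most common variant, written as DNA.
SIGNALS = ("AATAAA", "ATTAAA")
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_candidate_signals(seq, signals=SIGNALS):
    """Return sorted (position, hexamer) pairs for every signal occurrence."""
    seq = seq.upper()
    hits = []
    for signal in signals:
        start = seq.find(signal)
        while start != -1:
            hits.append((start, signal))
            start = seq.find(signal, start + 1)  # allow overlapping hits
    return sorted(hits)


def kmer_frequencies(window, k=2):
    """Relative frequency of each k-mer over ACGT, in a fixed order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(window) - k + 1):
        kmer = window[i:i + k]
        if kmer in counts:  # skip k-mers containing N or other symbols
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def has_hairpin(window, stem=4, min_loop=3, max_loop=8):
    """Crude flag: a perfect Watson-Crick stem separated by a short loop.

    This ignores G-U pairs, mismatches, and folding energy; it is a
    placeholder for a real structure predictor, not a substitute for one.
    """
    n = len(window)
    for i in range(n - 2 * stem - min_loop + 1):
        left = window[i:i + stem]
        if any(base not in COMPLEMENT for base in left):
            continue
        target = "".join(COMPLEMENT[base] for base in reversed(left))
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if window[j:j + stem] == target:
                return True
    return False


def site_features(seq, position, flank=20, k=2):
    """Feature vector for one candidate hexamer starting at `position`."""
    seq = seq.upper()
    upstream = seq[max(0, position - flank):position]
    downstream = seq[position + 6:position + 6 + flank]
    return (kmer_frequencies(upstream, k)
            + kmer_frequencies(downstream, k)
            + [1.0 if has_hairpin(downstream) else 0.0])


if __name__ == "__main__":
    demo = "GGGCCC" + "AATAAA" + "GCGCAAAAGCGC" + "ATTAAA" + "TTTT"
    for pos, hexamer in find_candidate_signals(demo):
        print(pos, hexamer, len(site_features(demo, pos)))
```

Vectors of this form, labelled as real or randomly occurring signals, could then be passed to any support vector machine implementation for training. The feature design here is illustrative only; a realistic model would likely replace the hairpin flag with output from a dedicated RNA folding or motif-search tool.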


Corresponding author

Correspondence to Jorng-Tzong Horng.

Cite this article

Chang, TH., Wu, LC., Chen, YT. et al. Characterization and prediction of mRNA polyadenylation sites in human genes. Med Biol Eng Comput 49, 463–472 (2011). https://doi.org/10.1007/s11517-011-0732-4

  • DOI: https://doi.org/10.1007/s11517-011-0732-4
