Input encoding method for identifying transcription start sites in RNA polymerase II promoters by neural networks

Soft Computing

Abstract

We present a novel approach to encoding inputs to neural networks for the recognition of transcription start sites in RNA polymerase II promoter regions. The approach is based on Markov models that represent the TATA-box and Inr transcription binding sites, which characterize a promoter. The Markovian parameters are used as inputs to three neural networks, which learn potential distant relationships between the nucleotides in promoter regions. This approach allows the biological contextual information of the promoter sites to be incorporated into neural network systems and allows higher-order Markov models of the promoters to be implemented. Our experiments on a human promoter data set, available at [19], showed an increased correlation coefficient of 0.69 on average, which improves on the best previously reported rate of 0.65, obtained by the NNPP 2.1 method.
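The encoding idea in the abstract can be illustrated with a minimal sketch. It assumes a first-order, position-specific Markov model fitted to aligned motif windows; the model order, window lengths, and network inputs actually used in the paper are not given in this preview. The function names `fit_positional_markov` and `encode_sequence`, the pseudocount, and the toy training windows are hypothetical and chosen only for illustration.

```python
ALPHABET = "ACGT"


def fit_positional_markov(seqs, pseudocount=1.0):
    """Estimate P(x_0) and P(x_i | x_{i-1}) for each position i from
    equal-length aligned sequences over the alphabet A, C, G, T.

    Assumption: a first-order model with additive smoothing; this is an
    illustrative choice, not necessarily the model used in the paper.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all training sequences must have the same length")

    # first[b]: smoothed count of nucleotide b at position 0
    first = {b: pseudocount for b in ALPHABET}
    # trans[i][a][b]: smoothed count of b at position i following a at i-1
    trans = [
        {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
        for _ in range(length)
    ]
    for s in seqs:
        first[s[0]] += 1
        for i in range(1, length):
            trans[i][s[i - 1]][s[i]] += 1

    # Normalize counts into probabilities.
    total = sum(first.values())
    first = {b: c / total for b, c in first.items()}
    for i in range(1, length):
        for a in ALPHABET:
            row_total = sum(trans[i][a].values())
            trans[i][a] = {b: c / row_total for b, c in trans[i][a].items()}
    return first, trans


def encode_sequence(seq, model):
    """Map a sequence to a vector of Markov parameters, one per position.

    The resulting real-valued vector (rather than a one-hot code) would be
    the input to a neural network in this kind of scheme.
    """
    first, trans = model
    vec = [first[seq[0]]]
    for i in range(1, len(seq)):
        vec.append(trans[i][seq[i - 1]][seq[i]])
    return vec


# Hypothetical toy training windows resembling a TATA-box motif.
training = ["TATAAA", "TATATA", "TATAAA"]
model = fit_positional_markov(training)
features = encode_sequence("TATAAA", model)
```

Each entry of `features` is the probability the fitted model assigns to the observed nucleotide given its predecessor, so a window that resembles the training motif yields larger input values than an unrelated one.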

References

  1. Bajic VB, Seah SH, Chong A, Krishnan SPT, Koh JLY, Brusic V (2003) Computer model for recognition of functional transcription start sites in RNA polymerase II promoters of vertebrates. J Mol Graph Modelling 21:323–332

  2. Burge C, Karlin S (1997) Prediction of complete gene structures in human genomic DNA. J Mol Biol 268:78–94

  3. Burset M, Guigo R (1996) Evaluation of gene structure prediction programs. Genomics 34:353–367

  4. Corne D, Meade A, Sibly R (2001) Evolving core promoter signal motifs. In: Proceedings of the 2001 congress on evolutionary computation (1999). IEEE press 1162–1169

  5. Fickett JW, Hatzigeorgiou AG (1997) Eukaryotic promoter recognition. Genome Res 861–878

  6. Haykin S (1999) Neural networks: a comprehensive foundation, 2nd edn.

  7. Ho SL, Rajapakse JC (2003) Splice site detection with a higher-order Markov model implemented on a neural network. Genome Inf 14:64–72

  8. Howard D, Benson K (2003) Evolutionary computation method for promoter site prediction in DNA. Genetic and evolutionary computation conference, Chicago 1690–1701

  9. Nguyen D, Widrow B (1990) Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights. International joint conference on neural networks, San Diego 3:21–26

  10. Ohler U, Harbeck S, Niemann H, Nöth E, Rubin GM (2001) Joint modeling of DNA sequence and physical properties to improve eukaryotic promoter recognition. Bioinformatics 17:199–206

  11. Ohler U, Niemann H, Liao G, Reese MG (1999) Interpolated Markov chains for eukaryotic promoter recognition. Bioinformatics 15:362–369

  12. Pedersen AG, Baldi P, Chauvin Y, Brunak S (1999) The biology of eukaryotic promoter prediction – a review. Comput Chem 23:191–207

  13. Plagianakos VP, Magoulas GD, Vrahatis MN (2000) Learning rate adaptation in stochastic gradient descent. In: Advances in convex analysis and global optimization, chap 2, pp 15–26

  14. Pinkus A (1999) Approximation theory of the MLP model in neural networks. Acta Numerica 143–195

  15. Reese MG (2001) Application of a time-delay neural network to promoter annotation in the Drosophila melanogaster genome. Comput Chem 26:51–56

  16. Salzberg SL, Delcher AL, Fasman K, Henderson J (1998) A decision tree system for finding genes in DNA. J Comput Biol 5:667–680

  17. Scherf M, Klingenhoff A, Werner T (2000) Highly specific localization of promoter regions in large genomic sequences by PromoterInspector: a novel analysis approach. J Mol Biol 297:599–606

  18. Zhang MQ (2002) Computational methods for promoter prediction. In: Current topics in computational molecular biology, chap 10, pp 249–267

  19. Promoter dataset: http://www.fruitfly.org/seq_tools/datasets/Human/promoter/

  20. Genie dataset: http://www.fruitfly.org/seq_tools/datasets/Human/

Author information

Corresponding author

Correspondence to J. C. Rajapakse.

About this article

Cite this article

Ho, L., Rajapakse, J. Input encoding method for identifying transcription start sites in RNA polymerase II promoters by neural networks. Soft Comput 10, 331–337 (2006). https://doi.org/10.1007/s00500-005-0491-y

  • DOI: https://doi.org/10.1007/s00500-005-0491-y
