Abstract
Genomic studies have become a central part of bioinformatics because they provide important information about an organism’s genome once it has been sequenced. Gene finding and promoter prediction are common strategies in modern bioinformatics that help supply this genomic information. Much work has been carried out on promoter prediction, and many prediction tools are therefore available. However, demand for new prediction tools remains high because existing tools show low accuracy and sensitivity, which are the key properties of a good prediction tool. In this paper, we present a new algorithm, Novel Approach to Promoter Prediction (NAPPR), implemented in Python, to predict eukaryotic promoter regions; it can meet this demand to some extent. We derived parameters for nucleotide singlets (\(4^{1}\)) through nonaplets (\(4^{9}\)) to analyze short-range interactions between the four nucleotide bases in DNA sequences. Using these parameters, the NAPPR tool was developed and then compared with other known prediction tools in terms of accuracy, sensitivity and specificity. An accuracy of 74 % and a specificity of 78 % were achieved on test sequences from the Eukaryotic Promoter Database (EPD). There is no limit on the length of the input DNA sequence, so the tool can be used to predict promoters even across the whole human genome. Overall, NAPPR was found to predict eukaryotic promoters with a high level of accuracy and sensitivity.
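The singlet-to-nonaplet idea and the reported evaluation measures can be illustrated with a minimal Python sketch. It is not the NAPPR implementation: the abstract does not say how the parameters are derived, so the code only counts overlapping k-mers for k = 1 to 9 and computes accuracy, sensitivity and specificity from a confusion matrix. The function names `kmer_counts`, `kmer_frequencies` and `evaluation_metrics`, and the example sequence, are assumptions made for illustration.

```python
from collections import Counter

BASES = set("ACGT")


def kmer_counts(sequence, k):
    """Count overlapping k-mers that consist only of A, C, G and T."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= BASES:  # skip windows containing N or other symbols
            counts[kmer] += 1
    return counts


def kmer_frequencies(sequence, k):
    """Relative frequency of each observed k-mer (empty dict if none)."""
    counts = kmer_counts(sequence, k)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def evaluation_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix measures for a binary predictor."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from EPD.
    example = "GCGCTATAAAAGGCGCGCCATTGACGT"
    for k in range(1, 10):
        observed = kmer_counts(example, k)
        # 4**k is the number of possible k-mers (4 for singlets, 262144 for 9-mers).
        print(f"k={k}: {4 ** k} possible, {len(observed)} distinct observed")
```

Because the number of possible k-mers grows as 4^k, the sketch stores only the k-mers that actually occur instead of allocating a full table for every k.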
Acknowledgments
Glory be to God Almighty for making this project possible. We also thank Karunya University for its support and for providing facilities for this research project.
© 2014 Springer India
Mugilan, A., Nartey, A. (2014). Novel Approach to Predict Promoter Region Based on Short Range Interaction Between DNA Sequences. In: Babu, B., et al. Proceedings of the Second International Conference on Soft Computing for Problem Solving (SocProS 2012), December 28-30, 2012. Advances in Intelligent Systems and Computing, vol 236. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1602-5_103
DOI: https://doi.org/10.1007/978-81-322-1602-5_103
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1601-8
Online ISBN: 978-81-322-1602-5
eBook Packages: Engineering, Engineering (R0)