Novel Approach to Predict Promoter Region Based on Short Range Interaction Between DNA Sequences

Mugilan, Arul; Nartey, Abraham

doi:10.1007/978-81-322-1602-5_103

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 236)

Abstract

Genomic studies have become a valuable part of bioinformatics because they provide important information about an organism’s genome once it has been sequenced. Gene finding and promoter prediction are common strategies in modern bioinformatics that help supply this genomic information. Much work has been carried out on promoter prediction, and many prediction tools are therefore available. However, novel prediction tools remain in high demand because existing tools show low accuracy and sensitivity, which are the key measures of a good prediction tool. In this paper, we present a new algorithm, Novel Approach to Promoter Prediction (NAPPR), implemented in Python, to predict eukaryotic promoter regions; it can meet today’s demand to some extent. We derived parameters for singlets (\(4^{1}\)) through nonaplets (\(4^{9}\)) by analyzing short-range interactions among the four nucleotide bases in DNA sequences. Using these parameters, the NAPPR tool was developed to predict promoters with a high level of accuracy, sensitivity and specificity, as judged by comparison with other known prediction tools. On test sequences from the EPD database, it achieved an accuracy of 74 % and a specificity of 78 %. The length of the input DNA sequence is not limited, so the tool can be used to predict promoters even across the whole human genome. We conclude that NAPPR can predict eukaryotic promoters with a high level of accuracy and sensitivity.
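
The abstract does not say how the singlet-to-nonaplet parameters are computed or combined into a score, so the following is only a minimal Python sketch of the two generic ingredients it mentions: counting overlapping k-mers for k = 1 to 9, and the standard confusion-matrix definitions of accuracy, sensitivity and specificity. The function names `kmer_counts`, `kmer_profile` and `evaluation_metrics` are hypothetical and are not taken from NAPPR, and the metric definitions are the conventional ones, which may differ from those used in the paper.

```python
from collections import Counter

BASES = set("ACGT")


def kmer_counts(sequence, k):
    """Count overlapping k-mers, skipping windows with non-ACGT symbols."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= BASES:
            counts[kmer] += 1
    return counts


def kmer_profile(sequence, max_k=9):
    """Return {k: Counter} for k = 1 (singlets) up to max_k (nonaplets)."""
    return {k: kmer_counts(sequence, k) for k in range(1, max_k + 1)}


def evaluation_metrics(tp, tn, fp, fn):
    """Conventional definitions (assumed, not confirmed by the paper)."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


if __name__ == "__main__":
    profile = kmer_profile("TATAAAAGGCGCGCCTATA")
    print(profile[2].most_common(3))
    print(evaluation_metrics(tp=6, tn=9, fp=1, fn=4))
```

Because a k-mer table has \(4^{k}\) possible entries, a sparse `Counter` keeps memory proportional to the k-mers actually observed, which matters at k = 9 (262,144 possible nonaplets).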

Acknowledgments

Glory be to God Almighty for making this project possible. We also thank Karunya University for its support and for providing facilities for this research project.

Author information

Authors and Affiliations

Department of Bioinformatics, School of Health Science and Biotechnology, Karunya University, Coimbatore, India
Arul Mugilan
Department of Theoretical and Applied Biology, Kwame Nkrumah University of Science and Technology, College of Science, Kumasi, Ghana
Abraham Nartey

Corresponding author

Correspondence to Arul Mugilan.

Editor information

Editors and Affiliations

Institute of Engineering and Technology, JK Lakshmipat University, Jaipur, Rajasthan, India
B. V. Babu
Department of Computer Science, Liverpool Hope University, Liverpool, United Kingdom
Atulya Nagar
Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Kusum Deep
Department of Paper Technology, Indian Institute of Technology Roorkee, Roorkee, India
Millie Pant
Department of Applied Mathematics, South Asian University, New Delhi, India
Jagdish Chand Bansal
Institute of Engineering and Technology, JK Lakshmipat University, Jaipur, Rajasthan, India
Kanad Ray
Institute of Engineering and Technology, JK Lakshmipat University, Jaipur, Rajasthan, India
Umesh Gupta

About this paper

Cite this paper

Mugilan, A., Nartey, A. (2014). Novel Approach to Predict Promoter Region Based on Short Range Interaction Between DNA Sequences. In: Babu, B., et al. Proceedings of the Second International Conference on Soft Computing for Problem Solving (SocProS 2012), December 28-30, 2012. Advances in Intelligent Systems and Computing, vol 236. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1602-5_103

DOI: https://doi.org/10.1007/978-81-322-1602-5_103
Published: 26 February 2014
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1601-8
Online ISBN: 978-81-322-1602-5
eBook Packages: Engineering, Engineering (R0)
