Abstract
Locating promoters within a stretch of DNA is a difficult and important task in DNA analysis. We document an approach that treats DNA as a complex molecule, using several models of its physico-chemical properties. A support vector machine is trained to recognise promoters by their distinctive physical and chemical properties. We demonstrate that combining models improves on the classification accuracy obtained with a single model. We also show that examining how the predictive accuracy of these properties varies across the promoter lets us reduce the number of attributes needed. Finally, we apply the method to a real-world problem. The results demonstrate that such an approach has significant merit in its own right. Furthermore, they suggest that a planned combined approach to promoter prediction, using both physico-chemical and sequence-based techniques, should give better results.
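As a rough illustration of the idea of describing a sequence by physico-chemical models and combining several of them, the sketch below turns a DNA string into a numeric feature vector of the kind a support vector machine could be trained on. It is not the authors' implementation. The function names, the dinucleotide-step encoding, the sliding-window smoothing, and both toy scales are assumptions made for this example, and the scale values are placeholders, not published measurements.

```python
from itertools import product


def dinucleotide_steps(seq):
    """Return the overlapping dinucleotide steps of a DNA sequence."""
    seq = seq.upper()
    return [seq[i:i + 2] for i in range(len(seq) - 1)]


def property_profile(seq, scale, window=3):
    """Look up a property value for each dinucleotide step, then smooth
    the values with a sliding-window mean of the given width."""
    values = [scale[step] for step in dinucleotide_steps(seq)]
    if len(values) < window:
        raise ValueError("sequence too short for the chosen window")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def combined_features(seq, scales, window=3):
    """Concatenate the profiles from several property models into one vector."""
    features = []
    for scale in scales:
        features.extend(property_profile(seq, scale, window))
    return features


# PLACEHOLDER scales, invented for illustration only (NOT published values).
# Scale A counts G/C bases in the step; scale B flags homodimer steps.
toy_scale_a = {a + b: float((a + b).count("G") + (a + b).count("C"))
               for a, b in product("ACGT", repeat=2)}
toy_scale_b = {a + b: 1.0 if a == b else 0.0
               for a, b in product("ACGT", repeat=2)}

# A 10-base sequence has 9 steps, so each profile has 7 windowed values.
features = combined_features("TATAAAGGCC", [toy_scale_a, toy_scale_b], window=3)
print(len(features))
```

Under these assumptions, each position in the vector corresponds to one window of one property model, so dropping the positions where a property discriminates poorly is one way the attribute count could be reduced. Characters outside A, C, G and T raise a `KeyError` in this sketch.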
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Uren, P., Cameron-Jones, R.M., Sale, A. (2006). Promoter Prediction Using Physico-Chemical Properties of DNA. In: Berthold, M.R., Glen, R.C., Fischer, I. (eds) Computational Life Sciences II. CompLife 2006. Lecture Notes in Computer Science, vol 4216. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11875741_3
DOI: https://doi.org/10.1007/11875741_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45767-1
Online ISBN: 978-3-540-45768-8
eBook Packages: Computer Science, Computer Science (R0)