Abstract
The large-scale determination of complete genome sequences has led to the development of various tools for genome comparison. Our approach is to compare genomes according to the typical genomic distributions of a mathematical function that reflects a certain biological function. In this study we used a comprehensive genome analysis of DNA curvature distributions before the starts and after the ends of prokaryotic genes to evaluate how useful mathematical and statistical procedures are for such comparisons. Owing to the extensive amount of data, we were able to identify the factors influencing the curvature distribution in promoter and terminator regions. Two clustering methods, K-means and PAM (partitioning around medoids), were applied and produced very similar clusterings that reflect genomic attributes and the environmental conditions of the species’ habitats.
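As an illustration of the comparison described in the abstract, the following minimal sketch clusters a small synthetic data set with K-means and with a simplified PAM-style swap search, then measures how closely the two partitions agree using the Rand index. Everything here is an assumption made for illustration: the data are random numbers standing in for binned curvature profiles, the farthest-first seeding, the function names, and the choice of three clusters are hypothetical, and the swap search omits PAM's BUILD phase. It is not the procedure or the data used in the chapter.

```python
import numpy as np

def farthest_first(X, k):
    """Deterministic seeding: start at row 0, then add the row farthest from all chosen rows."""
    idx = [0]
    for _ in range(k - 1):
        d = np.min(np.linalg.norm(X[:, None] - X[idx][None], axis=2), axis=1)
        idx.append(int(np.argmax(d)))
    return idx

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    centers = X[farthest_first(X, k)].copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = np.argmin(dist, axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def pam(X, k):
    """Simplified PAM-style search: accept any medoid/non-medoid swap that lowers total cost."""
    D = np.linalg.norm(X[:, None] - X[None], axis=2)  # pairwise distances
    medoids = farthest_first(X, k)
    cost = D[:, medoids].min(axis=1).sum()
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for h in range(len(X)):
                if h in medoids:
                    continue
                trial = medoids.copy()
                trial[i] = h
                c = D[:, trial].min(axis=1).sum()
                if c < cost - 1e-12:
                    medoids, cost, improved = trial, c, True
    return np.argmin(D[:, medoids], axis=1)

def rand_index(a, b):
    """Fraction of point pairs on which two partitions agree (same vs. different cluster)."""
    same_a = a[:, None] == a[None]
    same_b = b[:, None] == b[None]
    iu = np.triu_indices(len(a), k=1)
    return float(np.mean(same_a[iu] == same_b[iu]))

# Synthetic stand-in for curvature profiles: 3 groups of 10 "genomes", 5 bins each.
rng = np.random.default_rng(0)
group_means = np.array([0.0, 5.0, 10.0])
X = np.vstack([m + 0.3 * rng.standard_normal((10, 5)) for m in group_means])

km_labels = kmeans(X, 3)
pam_labels = pam(X, 3)
agreement = rand_index(km_labels, pam_labels)
print("Rand index between K-means and PAM partitions:", agreement)
```

A Rand index close to 1 indicates that the two methods group the same items together, which is one way to quantify a statement such as "very similar clusterings". On real data the groups are less cleanly separated than in this synthetic example, so the agreement would typically be lower and would depend on the seeding.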
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Kozobay-Avraham, L., Bolshoy, A., Volkovich, Z. (2006). On prokaryotes’ clustering based on curvature distribution. In: Last, M., Szczepaniak, P.S., Volkovich, Z., Kandel, A. (eds) Advances in Web Intelligence and Data Mining. Studies in Computational Intelligence, vol 23. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-33880-2_28
DOI: https://doi.org/10.1007/3-540-33880-2_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33879-6
Online ISBN: 978-3-540-33880-2
eBook Packages: Engineering, Engineering (R0)