A Markovian Approach for the Segmentation of Chimpanzee Genome

  • Conference paper
Bioinformatics Research and Development (BIRD 2007)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4414)

Abstract

Hidden Markov models (HMMs) are effective tools for detecting series of statistically homogeneous structures, but they are not well suited to analysing complex structures such as DNA sequences. Numerous methodological difficulties arise when using HMMs to model non-geometric distributions such as exon lengths, to segregate genes from transposons or retroviruses, or to determine the isochore classes of genes. The aim of this paper is to suggest new tools for the exploration of genome data. We show that HMMs can be used to analyse complex gene structures with bell-shaped length distributions by introducing macro-states. Our HMM methods take into account many biological properties and were developed to model the isochore organisation of the chimpanzee genome, which is considered a fundamental level of genome organisation. A clear isochore structure in the chimpanzee genome, correlated with gene density and guanine-cytosine content, has been identified.
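
The length issue mentioned in the abstract can be illustrated with a small sketch. In a plain HMM, the time spent in a single state with self-transition probability p follows a geometric distribution, whose most likely length is always 1. One common way to obtain a bell-shaped length distribution is to chain several sub-states that share the same emissions, so that the total length is a sum of geometric variables (a negative binomial). The abstract does not describe the topology of the authors' macro-states, so the chain below, the function names and the parameter values are assumptions chosen for illustration only.

```python
from math import comb


def macro_state_duration_pmf(k, p, max_len):
    """Return P(D = d) for d = 1..max_len.

    Assumed model (not taken from the paper): a macro-state made of k
    chained sub-states, each with self-transition probability p.
    The total duration D is then negative binomial:
        P(D = d) = C(d - 1, k - 1) * p**(d - k) * (1 - p)**k,  d >= k.
    With k = 1 this reduces to the geometric law of an ordinary HMM state.
    """
    q = 1.0 - p
    return [
        comb(d - 1, k - 1) * p ** (d - k) * q ** k if d >= k else 0.0
        for d in range(1, max_len + 1)
    ]


def mode(pmf):
    """Most probable duration (1-based index of the largest probability)."""
    return max(range(len(pmf)), key=pmf.__getitem__) + 1


# Both settings have the same mean duration, k / (1 - p) = 100.
single = macro_state_duration_pmf(k=1, p=0.99, max_len=2000)
macro = macro_state_duration_pmf(k=5, p=0.95, max_len=2000)

print("mode of single state:", mode(single))
print("mode of 5-sub-state macro-state:", mode(macro))
```

With the same mean length, the single state puts its highest probability on length 1, while the chained macro-state has its peak well away from 1, which is the qualitative behaviour needed for features such as exons.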

Editor information

Sepp Hochreiter Roland Wagner

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Melodelima, C., Gautier, C. (2007). A Markovian Approach for the Segmentation of Chimpanzee Genome. In: Hochreiter, S., Wagner, R. (eds) Bioinformatics Research and Development. BIRD 2007. Lecture Notes in Computer Science, vol 4414. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71233-6_20

  • DOI: https://doi.org/10.1007/978-3-540-71233-6_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71232-9

  • Online ISBN: 978-3-540-71233-6

  • eBook Packages: Computer Science, Computer Science (R0)
