Part of the book series: Advances in Soft Computing ((AINSC,volume 49))

Summary

The current wealth of available genomic data provides an unprecedented opportunity to compare and contrast the evolutionary histories of closely and distantly related organisms. The focus of this dissertation is on developing novel algorithms and software for efficient global and local comparison of multiple genomes, and on applying these methods to a biologically relevant case study. The thesis research is organized into three successive phases: (1) multiple genome alignment of closely related species, (2) local multiple alignment of interspersed repeats, and (3) a comparative genomics case study of Neisseria. In Phase 1, we develop an efficient algorithm and data structure for maximal unique match search in multiple genome sequences. We implement these contributions in an interactive multiple genome comparison and alignment tool, M-GCAT, that can efficiently construct multiple genome comparison frameworks in closely related species. In Phase 2, we present a novel computational method for local multiple alignment of interspersed repeats. It features a new procedure for gapped extension of chained seed matches that joins global multiple alignment with a homology test based on a hidden Markov model (HMM). In Phase 3, using the results of the previous two phases, we perform a case study of neisserial genomes, tracking the propagation of repeat sequence elements in an attempt to understand why the important pathogens of the neisserial group exchange DNA sexually by natural transformation. In conclusion, the contributions of this dissertation focus on comparing and contrasting the evolutionary histories of related organisms via multiple alignment of genomes.
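To make the notion of a maximal unique match in multiple sequences concrete, the following is a minimal brute-force sketch. It is not the algorithm or data structure developed in the dissertation, which the summary does not describe, and it would be far too slow for real genomes. The function names `count_occurrences` and `naive_multi_mums`, the `min_len` parameter, and the toy sequences are all assumptions made for illustration. A match is reported when it occurs exactly once in every sequence and cannot be extended to the left or right without introducing a mismatch in at least one sequence.

```python
def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def naive_multi_mums(seqs, min_len=3):
    """Brute-force search for maximal unique matches shared by all sequences.

    Illustrative only (hypothetical helper, cubic time): returns a list of
    (match, positions) pairs sorted by position in the first sequence, where
    positions[k] is the 0-based start of the match in seqs[k].
    """
    ref = seqs[0]
    found = {}
    for i in range(len(ref)):
        for j in range(i + min_len, len(ref) + 1):
            sub = ref[i:j]
            # Unique: exactly one occurrence in every sequence.
            if any(count_occurrences(s, sub) != 1 for s in seqs):
                continue
            pos = [s.find(sub) for s in seqs]
            # Flanking characters (None marks a sequence boundary).
            left = {s[p - 1] if p > 0 else None for s, p in zip(seqs, pos)}
            right = {
                s[p + len(sub)] if p + len(sub) < len(s) else None
                for s, p in zip(seqs, pos)
            }
            # Maximal: the flanks disagree or a sequence ends on each side.
            left_maximal = None in left or len(left) > 1
            right_maximal = None in right or len(right) > 1
            if left_maximal and right_maximal:
                found[sub] = tuple(pos)
    return sorted(found.items(), key=lambda item: item[1][0])


# Toy input: three short sequences sharing one unique, maximal substring.
toy = ["AAGATTACACC", "TGATTACAGT", "CGATTACATG"]
for match, positions in naive_multi_mums(toy):
    print(match, positions)
```

On the toy input, the shared substring GATTACA is reported because its flanking characters differ between the sequences on both sides; its shorter substrings are unique too, but they are not maximal and are therefore filtered out.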

Editor information

Juan M. Corchado, Juan F. De Paz, Miguel P. Rocha, Florentino Fernández Riverola

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Treangen, T.J., Messeguer, X. (2009). Novel Computational Methods for Large Scale Genome Comparison. In: Corchado, J.M., De Paz, J.F., Rocha, M.P., Fernández Riverola, F. (eds) 2nd International Workshop on Practical Applications of Computational Biology and Bioinformatics (IWPACBB 2008). Advances in Soft Computing, vol 49. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85861-4_9

  • DOI: https://doi.org/10.1007/978-3-540-85861-4_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85860-7

  • Online ISBN: 978-3-540-85861-4

  • eBook Packages: Engineering, Engineering (R0)
