Multiple Genome Alignment by Clustering Pairwise Matches

  • Conference paper
Comparative Genomics (RCG 2004)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3388)

Abstract

We have developed a multiple genome alignment algorithm that uses a sequence clustering algorithm to combine local pairwise genome sequence matches produced by pairwise genome alignment tools, e.g., BLASTZ. Sequence clustering algorithms often generate clusters of sequences such that a common shared region exists among all sequences in each cluster. To use a sequence clustering algorithm for genome alignment, it is necessary to handle numerous local alignments between a pair of genomes. We propose a multiple genome alignment method that converts the multiple genome alignment problem into the sequence clustering problem. This method does not need a guide tree to determine the order of multiple alignment, and it accurately detects multiple homologous regions. As a result, our multiple genome alignment algorithm performs competitively with existing algorithms. We show this in an experiment that compares the performance of TBA, MultiPipMaker (MPM), and our algorithm in aligning 12 groups of 56 microbial genomes, evaluated by the number of common COGs detected.
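The abstract does not specify the clustering algorithm, so the following is only a minimal sketch of the general idea of turning pairwise matches into multi-genome clusters without a guide tree. It assumes plain single-linkage clustering with union-find, which is a stand-in and not the authors' method. The names `Segment` and `cluster_matches` and the coordinate convention (half-open intervals) are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Segment(NamedTuple):
    """A region of one genome (hypothetical representation)."""
    genome: str
    start: int  # inclusive
    end: int    # exclusive


def cluster_matches(matches):
    """Group pairwise matches into clusters by single linkage.

    `matches` is a list of (Segment, Segment) pairs, e.g. parsed from
    the output of a pairwise aligner. This is an illustrative
    assumption, not the clustering algorithm used in the paper.
    """
    segments = sorted({seg for pair in matches for seg in pair})
    parent = {seg: seg for seg in segments}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # A pairwise match links its two aligned segments.
    for a, b in matches:
        union(a, b)

    # Overlapping segments on the same genome are treated as one region.
    for i, a in enumerate(segments):
        for b in segments[i + 1:]:
            if b.genome != a.genome or b.start >= a.end:
                break  # sorted order: no later segment can overlap a
            union(a, b)

    clusters = defaultdict(set)
    for seg in segments:
        clusters[find(seg)].add(seg)
    return list(clusters.values())
```

Under these assumptions, a match between genomes A and B and a match between B and C end up in one cluster whenever their B segments overlap, which yields a candidate region shared by all three genomes without fixing an alignment order in advance.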


Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Choi, JH., Choi, K., Cho, HG., Kim, S. (2005). Multiple Genome Alignment by Clustering Pairwise Matches. In: Lagergren, J. (eds) Comparative Genomics. RCG 2004. Lecture Notes in Computer Science, vol 3388. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-32290-0_3

  • DOI: https://doi.org/10.1007/978-3-540-32290-0_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24455-4

  • Online ISBN: 978-3-540-32290-0

  • eBook Packages: Computer Science, Computer Science (R0)
