Abstract
Aligning long DNA sequences is a fundamental and common task in molecular biology. Although dynamic programming algorithms solve this problem, their space and time requirements remain a challenge. In this paper we present the Parallel Linear Space Alignment (PLSA) algorithm for long sequence alignment to meet this challenge. The algorithm uses local start points and a grid cache to partition the whole sequence alignment problem into several smaller independent subproblems. A novel dynamic load balancing approach then solves these subproblems efficiently in parallel, which provides more parallelism in the trace-back phase. Furthermore, PLSA helps to find k near-optimal non-intersecting alignments. Our experiments show that the proposed algorithm scales well as the number of processors increases, and that it exhibits almost linear speedup for large-scale sequences.
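To make the idea of local start points concrete, the following is a minimal sketch and not the PLSA algorithm itself. It shows a generic Smith-Waterman scoring pass that keeps only two rows of the dynamic programming matrix and carries, with each score, the cell where that local alignment began. The function name `local_alignment_endpoints`, the linear gap model, and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen for illustration; the abstract does not specify them.

```python
def local_alignment_endpoints(a, b, match=2, mismatch=-1, gap=-1):
    """Find the best local alignment score and its start/end cells in O(len(b)) space.

    Returns (score, (start_i, start_j), (end_i, end_j)). Coordinates are
    half-open slice bounds: the alignment covers a[start_i:end_i] and
    b[start_j:end_j]. Scoring parameters are illustrative assumptions.
    """
    n = len(b)
    prev_score = [0] * (n + 1)
    # An empty alignment ending at cell (0, j) also starts there.
    prev_start = [(0, j) for j in range(n + 1)]
    best = (0, (0, 0), (0, 0))

    for i in range(1, len(a) + 1):
        cur_score = [0] * (n + 1)
        cur_start = [(i, 0)] * (n + 1)
        for j in range(1, n + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Each candidate carries the start point of the path it extends.
            candidates = [
                (prev_score[j - 1] + sub, prev_start[j - 1]),  # diagonal
                (prev_score[j] + gap, prev_start[j]),          # gap in b
                (cur_score[j - 1] + gap, cur_start[j - 1]),    # gap in a
            ]
            score, start = max(candidates, key=lambda c: c[0])
            if score <= 0:
                # Local alignment restarts here with an empty alignment.
                score, start = 0, (i, j)
            cur_score[j] = score
            cur_start[j] = start
            if score > best[0]:
                best = (score, start, (i, j))
        prev_score, prev_start = cur_score, cur_start

    return best


if __name__ == "__main__":
    score, (si, sj), (ei, ej) = local_alignment_endpoints("TTACGTAA", "GGACGTCC")
    print(score, "TTACGTAA"[si:ei], "GGACGTCC"[sj:ej])
```

Once the start and end cells are known, the local alignment reduces to a global alignment on the bounded rectangle between them, which is the kind of smaller, independent subproblem the abstract refers to. How PLSA combines this with its grid cache and load balancing is not described here and is not reproduced by this sketch.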
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, E., Xu, C., Wang, T., Jin, L., Zhang, Y. (2005). Parallel Linear Space Algorithm for Large-Scale Sequence Alignment. In: Cunha, J.C., Medeiros, P.D. (eds) Euro-Par 2005 Parallel Processing. Euro-Par 2005. Lecture Notes in Computer Science, vol 3648. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11549468_132
DOI: https://doi.org/10.1007/11549468_132
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28700-1
Online ISBN: 978-3-540-31925-2
eBook Packages: Computer Science (R0)