Abstract
In this paper, we propose and evaluate Fickett-MM, a parallel strategy that combines the Fickett and Myers-Miller algorithms. It splits a pairwise sequence comparison into multiple comparisons of subsequences (blocks) and calculates an appropriate Fickett band for each block. With this approach, we potentially reduce the number of cells calculated in the dynamic programming matrix when compared to Fickett, which uses a single band for the whole comparison. Our adjustable multi-block strategy was integrated into stage 4 of CUDAlign, a state-of-the-art parallel tool for optimal biological sequence comparison. Fickett-MM was used to compare real DNA sequences whose sizes ranged from 10 KBP (thousands of base pairs) to 47 MBP (millions of base pairs), reaching a speedup of 59.60\(\times\) over CUDAlign stage 4 in the 10 MBP \(\times\) 10 MBP comparison.
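The potential saving in computed cells can be illustrated with a small counting sketch. It is not the implementation integrated into CUDAlign: it assumes that each band is a fixed-width strip around the diagonals joining a block's corners, and that the block boundaries (`crosspoints`) and the per-block half-widths (`widths`) are already known. The function names and the example values are hypothetical.

```python
def band_cells(rows, cols, k):
    """Count DP cells within a band of half-width k (assumed band shape).

    The band keeps every diagonal d = j - i between the diagonal through
    (0, 0) and the diagonal through (rows, cols), widened by k on each side.
    """
    lo = min(0, cols - rows) - k  # lowest diagonal index kept
    hi = max(0, cols - rows) + k  # highest diagonal index kept
    count = 0
    for i in range(rows + 1):
        first = max(0, i + lo)
        last = min(cols, i + hi)
        if first <= last:
            count += last - first + 1
    return count


def multi_block_cells(crosspoints, widths):
    """Sum the band cells of the blocks between consecutive crosspoints.

    Cells on shared block boundaries are counted once per block, so the
    total is a slight overestimate.
    """
    total = 0
    for (i0, j0), (i1, j1), k in zip(crosspoints, crosspoints[1:], widths):
        total += band_cells(i1 - i0, j1 - j0, k)
    return total


# Hypothetical 300 x 300 comparison in which only the middle block
# needs a wide band (half-width 40); the outer blocks need only 2.
crosspoints = [(0, 0), (100, 100), (200, 200), (300, 300)]
widths = [2, 40, 2]

single_band = band_cells(300, 300, max(widths))  # one band for everything
multi_band = multi_block_cells(crosspoints, widths)
print(single_band, multi_band)
```

In this hypothetical setting, the single band must be as wide as the most divergent region requires, whereas the per-block bands stay narrow wherever the sequences are similar, so the multi-block total is considerably smaller.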
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Silva, G.H.G., Sandes, E.F.O., Teodoro, G., Melo, A.C.M.A. (2017). Parallel Biological Sequence Comparison in Linear Space with Multiple Adjustable Bands. In: Figueiredo, D., Martín-Vide, C., Pratas, D., Vega-Rodríguez, M. (eds) Algorithms for Computational Biology. AlCoB 2017. Lecture Notes in Computer Science(), vol 10252. Springer, Cham. https://doi.org/10.1007/978-3-319-58163-7_12
DOI: https://doi.org/10.1007/978-3-319-58163-7_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58162-0
Online ISBN: 978-3-319-58163-7
eBook Packages: Computer Science, Computer Science (R0)