Abstract
The BLAST search engine was published and released in 1990. It is a heuristic that uses the idea of a neighborhood to find seed matches, which are then extended. This approach came out of work that this author was doing to leverage these ideas into a deterministic algorithm with a characterized and superior time complexity. The resulting \(O(en^{\operatorname{pow}(e/p)} \log n)\) expected-time algorithm for finding all e-matches to a string of length p in a text of length n was completed in 1991. The function \(\operatorname{pow}(\epsilon)\), with \(\epsilon = e/p\), is 0 for \(\epsilon = 0\) and concave increasing, so the algorithm is truly sublinear: its running time is \(O(n^{c})\) with \(c < 1\) when \(\epsilon\) is sufficiently small. This paper reviews the history and the unfolding of the basic concepts, and it attempts to describe intuitively the deeper result, whose time complexity, to this author’s knowledge, has yet to be improved upon.
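The seed-finding step mentioned above can be illustrated with a minimal sketch. It is not the algorithm of the chapter or of BLAST: it assumes a neighborhood defined by Hamming distance (substitutions only) over a DNA alphabet, whereas BLAST defines neighborhoods through a scoring threshold, and it omits the extension step entirely. The function names `kmer_index`, `hamming_neighborhood`, and `seed_hits` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from itertools import combinations, product


def kmer_index(text, k):
    """Map every k-mer of text to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(text) - k + 1):
        index[text[i:i + k]].append(i)
    return index


def hamming_neighborhood(word, d, alphabet="ACGT"):
    """All strings within Hamming distance d of word.

    Assumption: substitutions only; the real neighborhood definition differs.
    """
    neighbors = {word}
    for r in range(1, d + 1):
        for positions in combinations(range(len(word)), r):
            for letters in product(alphabet, repeat=r):
                candidate = list(word)
                for pos, ch in zip(positions, letters):
                    candidate[pos] = ch
                neighbors.add("".join(candidate))
    return neighbors


def seed_hits(query, text, k, d):
    """Sorted (query_offset, text_offset) pairs where some neighbor of a
    query k-mer occurs exactly in text. Each pair is a candidate seed."""
    index = kmer_index(text, k)
    hits = set()
    for q in range(len(query) - k + 1):
        for neighbor in hamming_neighborhood(query[q:q + k], d):
            for t in index.get(neighbor, ()):
                hits.add((q, t))
    return sorted(hits)


if __name__ == "__main__":
    print(seed_hits("ACGT", "TTACCTGG", k=4, d=1))
```

Indexing the text once and enumerating the neighborhood of each query word turns approximate seed matching into exact lookups. A full search would then extend each seed into a longer alignment.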
Copyright information
© 2013 Springer-Verlag London
About this chapter
Cite this chapter
Myers, G. (2013). What’s Behind Blast. In: Chauve, C., El-Mabrouk, N., Tannier, E. (eds) Models and Algorithms for Genome Evolution. Computational Biology, vol 19. Springer, London. https://doi.org/10.1007/978-1-4471-5298-9_1
DOI: https://doi.org/10.1007/978-1-4471-5298-9_1
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5297-2
Online ISBN: 978-1-4471-5298-9
eBook Packages: Computer Science, Computer Science (R0)