Years and Authors of Summarized Original Work
- 2000; Li, Ma, Wang
- 2003; Deng, et al.
- 2008; Marx
- 2009; Ma, Sun
- 2011; Chen, Wang
- 2012; Chen, Ma, Wang
Problem Definition
The problem of finding a center string that is “close” to every given string arises in computational molecular biology [4, 5, 9–11, 18, 19] and coding theory [1, 6, 7].
The problem has two versions. The first comes from coding theory, where one looks for a code that is not too far away from any code in a given set.
Problem 1 (The closest string problem)
Input: a set of strings \(\mathcal{S} =\{ s_{1},s_{2},\ldots ,s_{n}\}\), each of length m.
Output: the smallest d and a string s of length m that is within Hamming distance d of each \(s_{i} \in \mathcal{S}\).
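To make the definition concrete, the following is a minimal brute-force sketch of Problem 1, not one of the algorithms from the summarized work. It tries every candidate string of length m over the letters that occur in the input and keeps the one with the smallest maximum Hamming distance. The function names `hamming`, `radius`, and `closest_string_bruteforce` are illustrative choices, and the running time is exponential in m, so it is only suitable for tiny instances.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def radius(center: str, strings: list[str]) -> int:
    """Largest Hamming distance from center to any string in the set."""
    return max(hamming(center, s) for s in strings)


def closest_string_bruteforce(strings: list[str]) -> tuple[int, str]:
    """Return (d, s): the smallest d and a center s within distance d of every input.

    Assumption: restricting candidates to letters that occur in the input
    is enough, since swapping an unused letter for a used one never
    increases any distance.
    """
    if not strings:
        raise ValueError("need at least one string")
    m = len(strings[0])
    alphabet = sorted(set("".join(strings)))
    best_d, best_s = m + 1, ""
    # Enumerate all |alphabet|^m candidates in lexicographic order.
    for letters in product(alphabet, repeat=m):
        candidate = "".join(letters)
        d = radius(candidate, strings)
        if d < best_d:
            best_d, best_s = d, candidate
    return best_d, best_s
```

For example, calling `closest_string_bruteforce(["ACGT", "ACGA", "TCGT"])` searches all 4-letter strings over the letters A, C, G, T and returns the smallest achievable d together with one center string that attains it.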
The second problem is much more elusive than the closest string problem. It arises from applications in finding conserved regions, genetic drug target identification, and genetic probes in molecular biology.
Problem 2 (The...
Recommended Reading
Ben-Dor A, Lancia G, Perone J, Ravi R (1997) Banishing bias from consensus sequences. In: Proceedings of the 8th annual symposium on combinatorial pattern matching conference, Aarhus, pp 247–261
Chen Z, Wang L (2011) Fast exact algorithms for the closest string and substring problems with application to the planted (L, d)-motif model. IEEE/ACM Trans Comput Biol Bioinform 8(5):1400–1410
Chen Z-Z, Ma B, Wang L (2012) A three-string approach to the closest string problem. J Comput Syst Sci 78(1):164–178
Deng X, Li G, Li Z, Ma B, Wang L (2003) Genetic design of drugs without side-effects. SIAM J Comput 32(4):1073–1090
Dopazo J, Rodríguez A, Sáiz JC, Sobrino F (1993) Design of primers for PCR amplification of highly variable genomes. CABIOS 9:123–125
Frances M, Litman A (1997) On covering problems of codes. Theor Comput Syst 30:113–119
Gasieniec L, Jansson J, Lingas A (1999) Efficient approximation algorithms for the Hamming center problem. In: Proceedings of the 10th ACM-SIAM symposium on discrete algorithms, Baltimore, pp S905–S906
Gramm J, Niedermeier R, Rossmanith P (2003) Fixed-parameter algorithms for closest string and related problems. Algorithmica 37(1):25–42
Hertz G, Stormo G (1995) Identification of consensus patterns in unaligned DNA and protein sequences: a large-deviation statistical basis for penalizing gaps. In: Proceedings of the 3rd international conference on bioinformatics and genome research, Tallahassee, pp 201–216
Lanctot K, Li M, Ma B, Wang S, Zhang L (1999) Distinguishing string selection problems. In: Proceedings of the 10th ACM-SIAM symposium on discrete algorithms, Baltimore, pp 633–642
Lawrence C, Reilly A (1990) An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences. Proteins 7:41–51
Li M, Ma B, Wang L (2002) Finding similar regions in many sequences. J Comput Syst Sci 65(1):73–96
Li M, Ma B, Wang L (1999) Finding similar regions in many strings. In: Proceedings of the thirty-first annual ACM symposium on theory of computing, Atlanta, pp 473–482
Li M, Ma B, Wang L (2002) On the closest string and substring problems. J ACM 49(2):157–171
Ma B (2000) A polynomial time approximation scheme for the closest substring problem. In: Proceedings of the 11th annual symposium on combinatorial pattern matching, Montreal, pp 99–107
Ma B, Sun X (2009) More efficient algorithms for closest string and substring problems. SIAM J Comput 39(4):1432–1443
Marx D (2008) Closest substring problems with small distances. SIAM J Comput 38(4):1382–1410
Stormo G (1990) Consensus patterns in DNA. In: Doolittle RF (ed) Molecular evolution: computer analysis of protein and nucleic acid sequences. Methods Enzymol 183:211–221
Stormo G, Hartzell GW III (1991) Identifying protein-binding sites from unaligned DNA fragments. Proc Natl Acad Sci USA 88:5699–5703
Wang L, Zhu B (2009) Efficient algorithms for the closest string and distinguishing string selection problems. In: Proceedings of 3rd international workshop on frontiers in algorithms, Hefei. Lecture notes in computer science, vol 5598, pp 261–270
© 2016 Springer Science+Business Media New York
About this entry
Cite this entry
Wang, L., Li, M., Ma, B. (2016). Closest String and Substring Problems. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2864-4_73
DOI: https://doi.org/10.1007/978-1-4939-2864-4_73
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2863-7
Online ISBN: 978-1-4939-2864-4
eBook Packages: Computer Science; Reference Module Computer Science and Engineering