Randomized and Parameterized Algorithms for the Closest String Problem

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8486)

Abstract

Given a set $S = \{s_1, s_2, \ldots, s_n\}$ of strings of equal length $L$ and an integer $d$, the closest string problem (CSP) asks for a string $s$ of length $L$ such that $d(s, s_i) \le d$ for each $s_i \in S$, where $d(s, s_i)$ is the Hamming distance between $s$ and $s_i$. The problem is NP-hard and has been extensively studied in the context of approximation algorithms and parameterized algorithms. Parameterized algorithms provide the most practical solutions to its real-life applications in bioinformatics. In this paper we develop the first randomized parameterized algorithms for CSP. Not only are the randomized algorithms much simpler than their deterministic counterparts, but their expected-time complexities are also significantly better than those of the best previously known deterministic algorithms.
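
To make the problem statement concrete, the following is a minimal Python sketch of the definition only. It is not the randomized algorithm developed in the paper: it checks the condition $d(s, s_i) \le d$ directly and finds a solution by exhaustive search, which takes exponential time and is only usable on tiny instances. The function names `hamming`, `is_center`, and `brute_force_closest_string` are hypothetical and chosen for illustration.

```python
from itertools import product


def hamming(a, b):
    """Hamming distance between two strings of equal length."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_center(s, strings, d):
    """True if s is within Hamming distance d of every string in the set."""
    return all(hamming(s, t) <= d for t in strings)


def brute_force_closest_string(strings, d):
    """Exhaustive search for a string within distance d of all inputs.

    Illustrative baseline only (an assumption of this sketch, not the
    paper's method). At position j it tries only characters that occur
    in column j of the input: swapping any other character for one from
    the column can never increase a distance.
    """
    length = len(strings[0])
    columns = [sorted({t[j] for t in strings}) for j in range(length)]
    for candidate in product(*columns):
        s = "".join(candidate)
        if is_center(s, strings, d):
            return s
    return None  # no string of this length is within distance d of all inputs


if __name__ == "__main__":
    S = ["ACGT", "ACGA", "TCGT"]
    print(brute_force_closest_string(S, 1))
    print(brute_force_closest_string(S, 0))
```

The search space is the product of the per-column alphabets, so the running time grows exponentially in $L$; the parameterized algorithms discussed in the abstract instead bound the exponential part by a function of $d$.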

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Chen, ZZ., Ma, B., Wang, L. (2014). Randomized and Parameterized Algorithms for the Closest String Problem. In: Kulikov, A.S., Kuznetsov, S.O., Pevzner, P. (eds) Combinatorial Pattern Matching. CPM 2014. Lecture Notes in Computer Science, vol 8486. Springer, Cham. https://doi.org/10.1007/978-3-319-07566-2_11

  • DOI: https://doi.org/10.1007/978-3-319-07566-2_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07565-5

  • Online ISBN: 978-3-319-07566-2

  • eBook Packages: Computer Science, Computer Science (R0)
