Closest Substring

2005; Marx

  • Reference work entry
Encyclopedia of Algorithms

Keywords and Synonyms

Common approximate substring

Problem Definition

Closest Substring is a core problem in consensus string analysis, with applications in computational biology in particular. Its decision version is defined as follows.

Closest Substring

Input: k strings \( { s_1, s_2, \dots , s_k } \) over alphabet Σ and non-negative integers d and L.

Question: Is there a string s of length L and, for all \( { i = 1, \dots, k } \), a length-L substring \( { s^{\prime}_i } \) of \( { s_i } \) such that \( { d_H(s,s^{\prime}_i)\leq d } \)?

Here \( { d_H(s, s_i^{\prime}) } \) denotes the Hamming distance between s and \( { s_i^{\prime} } \), i.e., the number of positions in which s and \( { s_i^{\prime} } \) differ. Following the notation used in [7], m denotes the average length of the input strings and n the total size of the problem input.
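The decision version can be illustrated with a minimal brute-force sketch. It is not an algorithm from the literature cited here: it simply tries every candidate string of length L over the alphabet, so its running time is exponential in L and it is only meant to make the definition concrete. The function names `hamming`, `windows`, and `closest_substring_decision` are illustrative assumptions.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions in which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def windows(s: str, L: int) -> list[str]:
    """All length-L substrings of s (empty list if s is shorter than L)."""
    return [s[i:i + L] for i in range(len(s) - L + 1)]


def closest_substring_decision(strings, alphabet, L, d):
    """Return a witness string s of length L, or None if none exists.

    Brute force over all len(alphabet)**L candidates (illustrative only).
    """
    for cand in product(alphabet, repeat=L):
        s = "".join(cand)
        # s is a solution if every input string has a window within distance d.
        if all(any(hamming(s, w) <= d for w in windows(t, L)) for t in strings):
            return s
    return None
```

For example, with the hypothetical input strings `"ACGTA"`, `"TCGAA"`, `"GGCGT"`, L = 3 and d = 1, the string `"CGT"` is a valid answer, since it matches windows of the first and third strings exactly and differs from the window `"CGA"` of the second in one position. With d = 0 no solution exists.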

The optimization version of Closest Substring asks for the minimum value of the distance parameter d for which the input strings still allow a solution.
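As a hedged sketch of the optimization version, the following brute-force routine computes the smallest d directly: for each candidate string it takes the worst case over the input strings of the best-matching window, and keeps the minimum over all candidates. The function name `min_closest_substring_distance` is an assumption made for illustration, and the exhaustive search over all candidates is exponential in L.

```python
from itertools import product


def min_closest_substring_distance(strings, alphabet, L):
    """Return (d_min, witness) for the optimization version, or None.

    None is returned if some input string is shorter than L, since it
    then has no length-L substring. Exhaustive search, illustrative only.
    """
    best = None
    for cand in product(alphabet, repeat=L):
        worst = 0
        for t in strings:
            # Distance from the candidate to its nearest length-L window of t.
            nearest = min(
                (sum(x != y for x, y in zip(cand, t[i:i + L]))
                 for i in range(len(t) - L + 1)),
                default=None,
            )
            if nearest is None:
                return None
            worst = max(worst, nearest)
        if best is None or worst < best[0]:
            best = (worst, "".join(cand))
    return best
```

On the hypothetical strings `"ACGTA"`, `"TCGAA"`, `"GGCGT"` with L = 3, the minimum distance is 1: the three strings share no common length-3 substring, while `"CGT"` is within distance 1 of a window of each.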

Key...

Recommended Reading

  1. Buhler, J., Tompa, M.: Finding motifs using random projections. J. Comput. Biol. 9(2), 225–242 (2002)

  2. Evans, P.A., Smith, A.D., Wareham, H.T.: On the complexity of finding common approximate substrings. Theor. Comput. Sci. 306(1–3), 407–430 (2003)

  3. Fellows, M.R., Gramm, J., Niedermeier, R.: On the parameterized intractability of motif search problems. Combinatorica 26(2), 141–167 (2006)

  4. Frances, M., Litman, A.: On covering problems of codes. Theor. Comput. Syst. 30, 113–119 (1997)

  5. Lanctot, J.K., Li, M., Ma, B., Wang, S., Zhang, L.: Distinguishing string selection problems. Inf. Comput. 185, 41–55 (2003)

  6. Li, M., Ma, B., Wang, L.: On the Closest String and Substring Problems. J. ACM 49(2), 157–171 (2002)

  7. Marx, D.: The Closest Substring problem with small distances. In: Proceedings of the 46th FOCS, pp. 63–72. IEEE Press (2005)

  8. Pevzner, P.A., Sze, S.H.: Combinatorial approaches to finding subtle signals in DNA sequences. In: Proc. of 8th ISMB, pp. 269–278. AAAI Press (2000)

  9. Sagot, M.F.: Spelling approximate repeated or common motifs using a suffix tree. In: Proc. of the 3rd LATIN, vol. 1380 in LNCS, pp. 111–127. Springer (1998)

  10. Wang, J., Huang, M., Cheng, J.: A Lower Bound on Approximation Algorithms for the Closest Substring Problem. In: Proceedings COCOA 2007, vol. 4616 in LNCS, pp. 291–300 (2007)

Copyright information

© 2008 Springer-Verlag

Cite this entry

Gramm, J. (2008). Closest Substring. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30162-4_74
