Tight Bounds for String Reconstruction Using Substring Queries

  • Conference paper
Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques (APPROX 2005, RANDOM 2005)

Abstract

We resolve two open problems presented in [8]. First, we consider the problem of reconstructing an unknown string T over a fixed alphabet using queries of the form “does the string S appear in T?” for some query string S. We show that \(\Omega(\epsilon^{-1/2} n^{2})\) queries are needed in order to reconstruct a \(1-\epsilon\) fraction of the strings of length n. This lower bound is asymptotically optimal since it is known that \(O(\epsilon^{-1/2} n^{2})\) queries are sufficient. The second problem is reconstructing a string using queries of the form “does a string from \(\mathcal{S}\) appear in T?”, where \(\mathcal{S}\) is a set of strings. We show that a \(1-\epsilon\) fraction of the strings of length n can be reconstructed using O(n) queries, where the maximum length of a string in the queries is \(2\log_{\sigma}n+\log_{\sigma}\frac{1}{\epsilon}+O(1)\). This construction is optimal up to constants.
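
To make the first query model concrete, the following is a minimal sketch of a substring oracle together with a naive greedy reconstruction. It is not the construction from the paper and makes no attempt to match the bounds stated above. It assumes that queries may be chosen adaptively, which the abstract does not state, and the names `make_substring_oracle` and `reconstruct` are hypothetical, introduced only for illustration.

```python
from typing import Callable, List, Tuple


def make_substring_oracle(target: str) -> Tuple[Callable[[str], bool], List[int]]:
    """Return (query, counter); query(s) answers "does s appear in target?".

    counter[0] holds the number of queries asked so far.
    """
    counter = [0]

    def query(s: str) -> bool:
        counter[0] += 1
        return s in target

    return query, counter


def reconstruct(query: Callable[[str], bool], alphabet: str) -> str:
    """Naive adaptive reconstruction (illustrative assumption, not the paper's method).

    Phase 1 grows a known substring to the right until no character extends it,
    at which point it can only occur as a suffix of the hidden string.
    Phase 2 grows it to the left until no character extends it, at which point
    it must be the whole hidden string.
    """
    s = ""
    # Phase 1: extend to the right one character at a time.
    extended = True
    while extended:
        extended = False
        for c in alphabet:
            if query(s + c):
                s += c
                extended = True
                break
    # Phase 2: extend to the left one character at a time.
    extended = True
    while extended:
        extended = False
        for c in alphabet:
            if query(c + s):
                s = c + s
                extended = True
                break
    return s


if __name__ == "__main__":
    query, counter = make_substring_oracle("ACGTTGCA")
    print(reconstruct(query, "ACGT"), counter[0])
```

Each successful extension costs at most σ queries (σ being the alphabet size) and each phase ends with one failed round of σ queries, so this sketch asks at most σ(n + 2) queries of length up to n + 1. The second model in the abstract would replace `query` with a function that takes a set of strings and reports whether any of them appears in T.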

References

  1. Arratia, R., Martin, D., Reinert, G., Waterman, M.S.: Poisson process approximation for sequence repeats, and sequencing by hybridization. J. of Computational Biology 3(3), 425–463 (1996)

  2. Bains, W., Smith, G.C.: A novel method for nucleic acid sequence determination. J. Theor. Biology 135, 303–307 (1988)

  3. Dyer, M.E., Frieze, A.M., Suen, S.: The probability of unique solutions of sequencing by hybridization. J. of Computational Biology 1, 105–110 (1994)

  4. Frieze, A., Preparata, F., Upfal, E.: Optimal reconstruction of a sequence from its probes. J. of Computational Biology 6, 361–368 (1999)

  5. Halperin, E., Halperin, S., Hartman, T., Shamir, R.: Handling long targets and errors in sequencing by hybridization. In: Proc. 6th Annual International Conference on Computational Molecular Biology (RECOMB 2002), pp. 176–185 (2002)

  6. Margaritis, D., Skiena, S.: Reconstructing strings from substrings in rounds. In: Proc. 36th Symposium on Foundations of Computer Science (FOCS 1995), pp. 613–620 (1995)

  7. Pevzner, P.A., Lysov, Y.P., Khrapko, K.R., Belyavsky, A.V., Florentiev, V.L., Mirzabekov, A.D.: Improved chips for sequencing by hybridization. J. Biomolecular Structure and Dynamics 9, 399–410 (1991)

  8. Pevzner, P.A., Waterman, M.S.: Open combinatorial problems in computational molecular biology. In: Proc. 3rd Israel Symposium on Theory of Computing and Systems (ISTCS 1995), pp. 158–173 (1995)

  9. Preparata, F., Upfal, E.: Sequencing by hybridization at the information theory bound: an optimal algorithm. J. of Computational Biology 7, 621–630 (2000)

  10. Shamir, R., Tsur, D.: Large scale sequencing by hybridization. J. of Computational Biology 9(2), 413–428 (2002)

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tsur, D. (2005). Tight Bounds for String Reconstruction Using Substring Queries. In: Chekuri, C., Jansen, K., Rolim, J.D.P., Trevisan, L. (eds) Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques. APPROX 2005, RANDOM 2005. Lecture Notes in Computer Science, vol 3624. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11538462_38

  • DOI: https://doi.org/10.1007/11538462_38

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28239-6

  • Online ISBN: 978-3-540-31874-3

  • eBook Packages: Computer Science (R0)
