Abstract
We resolve two open problems presented in [8]. First, we consider the problem of reconstructing an unknown string T over a fixed alphabet using queries of the form “does the string S appear in T?” for some query string S. We show that \(\Omega(\epsilon^{-1/2} n^{2})\) queries are needed in order to reconstruct a \(1-\epsilon\) fraction of the strings of length n. This lower bound is asymptotically optimal since it is known that \(O(\epsilon^{-1/2} n^{2})\) queries are sufficient. The second problem is reconstructing a string using queries of the form “does a string from \(\mathcal{S}\) appear in T?”, where \(\mathcal{S}\) is a set of strings. We show that a \(1-\epsilon\) fraction of the strings of length n can be reconstructed using O(n) queries, where the maximum length of a string in the queries is \(2\log_{\sigma}n+\log_{\sigma}\frac{1}{\epsilon}+O(1)\). This construction is optimal up to constants.
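To make the two query models concrete, the following is a minimal sketch of the oracles described above, not the paper's reconstruction algorithm. The function names, the example strings, and the parameter `c` (a stand-in for the unspecified O(1) term) are assumptions introduced here for illustration only.

```python
import math
from typing import Iterable


def substring_query(t: str, s: str) -> bool:
    """First model: answer "does the string s appear in t?"."""
    return s in t


def set_query(t: str, queries: Iterable[str]) -> bool:
    """Second model: answer "does a string from the set appear in t?"."""
    return any(s in t for s in queries)


def max_query_length(n: int, sigma: int, eps: float, c: float = 0.0) -> float:
    """Evaluate 2*log_sigma(n) + log_sigma(1/eps) + c.

    c is a hypothetical placeholder for the O(1) term, which the
    abstract does not specify.
    """
    return 2 * math.log(n, sigma) + math.log(1 / eps, sigma) + c


if __name__ == "__main__":
    t = "ACGTAC"  # hypothetical hidden string over a 4-letter alphabet
    print(substring_query(t, "GTA"))
    print(set_query(t, {"TT", "CG"}))
    print(max_query_length(n=16, sigma=4, eps=0.25))
```

A set query returns a single bit for the whole set, so it does not reveal which member matched; this is what distinguishes the second model from asking each string separately.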
References
Arratia, R., Martin, D., Reinert, G., Waterman, M.S.: Poisson process approximation for sequence repeats, and sequencing by hybridization. J. of Computational Biology 3(3), 425–463 (1996)
Bains, W., Smith, G.C.: A novel method for nucleic acid sequence determination. J. Theor. Biology 135, 303–307 (1988)
Dyer, M.E., Frieze, A.M., Suen, S.: The probability of unique solutions of sequencing by hybridization. J. of Computational Biology 1, 105–110 (1994)
Frieze, A., Preparata, F., Upfal, E.: Optimal reconstruction of a sequence from its probes. J. of Computational Biology 6, 361–368 (1999)
Halperin, E., Halperin, S., Hartman, T., Shamir, R.: Handling long targets and errors in sequencing by hybridization. In: Proc. 6th Annual International Conference on Computational Molecular Biology (RECOMB 2002), pp. 176–185 (2002)
Margaritis, D., Skiena, S.: Reconstructing strings from substrings in rounds. In: Proc. 36th Symposium on Foundations of Computer Science (FOCS 1995), pp. 613–620 (1995)
Pevzner, P.A., Lysov, Y.P., Khrapko, K.R., Belyavsky, A.V., Florentiev, V.L., Mirzabekov, A.D.: Improved chips for sequencing by hybridization. J. Biomolecular Structure and Dynamics 9, 399–410 (1991)
Pevzner, P.A., Waterman, M.S.: Open combinatorial problems in computational molecular biology. In: Proc. 3rd Israel Symposium on Theory of Computing and Systems (ISTCS 1995), pp. 158–173 (1995)
Preparata, F., Upfal, E.: Sequencing by hybridization at the information theory bound: an optimal algorithm. J. of Computational Biology 7, 621–630 (2000)
Shamir, R., Tsur, D.: Large scale sequencing by hybridization. J. of Computational Biology 9(2), 413–428 (2002)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tsur, D. (2005). Tight Bounds for String Reconstruction Using Substring Queries. In: Chekuri, C., Jansen, K., Rolim, J.D.P., Trevisan, L. (eds) Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques. APPROX RANDOM 2005. Lecture Notes in Computer Science, vol 3624. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11538462_38
DOI: https://doi.org/10.1007/11538462_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28239-6
Online ISBN: 978-3-540-31874-3
eBook Packages: Computer Science, Computer Science (R0)