Abstract
Amir et al. (CPM 2017) introduce the approximate string cover problem (ACP) motivated by applications including molecular biology, coding, automata theory, formal language theory and combinatorics. A cover of a string T is a string C for which every letter of T lies within some occurrence of C. The input of the ACP consists of a string T and the goal is to find a string C of length less than the length of T that covers a string \(T'\), which is as close to T as possible (under some predefined distance). Amir et al. study this problem for the Hamming distance and show that it is NP-hard.
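The cover definition and the Hamming distance used above can be made concrete with a minimal sketch. The function names `is_cover` and `hamming` are illustrative assumptions and do not come from the paper.

```python
def is_cover(c: str, t: str) -> bool:
    """Return True if every letter of t lies within some occurrence of c."""
    m, n = len(c), len(t)
    if m == 0 or m > n:
        return False
    covered_up_to = 0  # positions [0, covered_up_to) are covered so far
    for start in range(n - m + 1):
        if start > covered_up_to:
            return False  # position covered_up_to is in no occurrence of c
        if t.startswith(c, start):
            covered_up_to = start + m
    return covered_up_to == n


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))
```

For instance, `aba` covers `abaababa` through its occurrences at positions 0, 3 and 5, whereas it does not cover `abaabab`, whose last letter lies in no occurrence.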
In this paper we continue the work of Amir et al. and show the following results for the cover length relaxation of the ACP. After observing that the NP-hardness proof by Amir et al. (CPM 2017, TCS 2019) suffers from several lapses, we propose an amendment to the proof. We then introduce an approximation algorithm for a variant of the ACP in which the objective to maximize is the length of the input string minus its distance to the string covered by the approximate cover that the algorithm returns. This variant is naturally as hard as the ACP. We prove an asymptotic approximation ratio of \(\mathcal {O}(\sqrt{|T|})\), where |T| is the length of the input string. Finally, we present a fixed-parameter tractable (FPT) algorithm, parameterized by the alphabet size and the size of the cover, based on a dynamic programming framework.
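To illustrate the objective of the maximization variant, here is a brute-force sketch under stated assumptions. It is not the approximation algorithm or the FPT dynamic program of the paper. The helper names `min_distance_to_covered` and `best_cover_brute_force` are hypothetical, the cover length `m` is assumed to be given, and the distance is assumed to be the Hamming distance. For a fixed candidate C, a simple dynamic program over the start position of the last occurrence of C finds the closest string covered by C; the outer loop then tries every candidate of length `m`.

```python
from itertools import product
from typing import Optional, Tuple


def min_distance_to_covered(t: str, c: str) -> Optional[int]:
    """Smallest Hamming distance from t to a string of length len(t) covered by c.

    Returns None when no string of that length is covered by c.
    """
    n, m = len(t), len(c)
    if m == 0 or m > n:
        raise ValueError("need 0 < len(c) <= len(t)")
    # ok_shift[d]: two occurrences of c placed d positions apart agree on their overlap.
    ok_shift = [False] + [c[d:] == c[:m - d] for d in range(1, m + 1)]
    inf = n + 1  # any real distance is at most n
    # best[p]: fewest mismatches on t[:p + m] when the last occurrence of c starts at p.
    best = [inf] * (n - m + 1)
    best[0] = sum(x != y for x, y in zip(t[:m], c))
    for q in range(1, n - m + 1):
        for d in range(1, min(m, q) + 1):
            p = q - d
            if ok_shift[d] and best[p] < inf:
                # The occurrence at q adds the last d letters of c to the covered string.
                added = sum(x != y for x, y in zip(t[p + m:q + m], c[m - d:]))
                best[q] = min(best[q], best[p] + added)
    return best[n - m] if best[n - m] < inf else None


def best_cover_brute_force(t: str, alphabet: str, m: int) -> Tuple[Optional[str], Optional[int]]:
    """Try every string of length m over alphabet; return (cover, distance) with minimum distance."""
    best_c, best_d = None, None
    for letters in product(alphabet, repeat=m):
        c = "".join(letters)
        d = min_distance_to_covered(t, c)
        if d is not None and (best_d is None or d < best_d):
            best_c, best_d = c, d
    return best_c, best_d
```

With a returned pair `(c, d)`, the value of the maximization objective is `len(t) - d`. The enumeration visits every string of length `m` over the alphabet, so the sketch is only practical for very small alphabets and covers.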
This work was supported by a grant of the Ministry of Research, Innovation and Digitization, CNCS - UEFISCDI, project number PN-III-P1-1.1-TE-2021-0253, within PNCDI III. We also thank the PHC Brincusi project between Univ. of Bordeaux and Univ. of Bucharest that facilitated the bilateral visits leading to this work.
M. Raffinot—Currently at Apple.
Notes
1. In the original construction \(y_4=x_0\), \(y_3=x_{-1}\), \(y_2=x_{N+1}\) and \(y_1=x_{N+2}\).
References
Amir, A., Levy, A., Lewenstein, M., Lubin, R., Porat, B.: Can we recover the cover? In: 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017, Warsaw, Poland, 4–6 July 2017, pp. 25:1–25:15 (2017)
Amir, A., Levy, A., Lewenstein, M., Lubin, R., Porat, B.: Can we recover the cover? Algorithmica 81 (2019)
Amir, A., Levy, A., Lubin, R., Porat, E.: Approximate cover of strings. In: 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017, Warsaw, Poland, 4–6 July 2017, pp. 26:1–26:14 (2017)
Amir, A., Levy, A., Lubin, R., Porat, E.: Approximate cover of strings. Theor. Comput. Sci. 793, 59–69 (2019)
Amir, A., Levy, A., Porat, E.: Quasi-periodicity under mismatch errors. In: Annual Symposium on Combinatorial Pattern Matching, CPM 2018, Qingdao, China, 2–4 July 2018, pp. 4:1–4:15 (2018)
Antoniou, P., Crochemore, M., Iliopoulos, C.S., Jayasekera, I., Landau, G.M.: Conservative string covering of indeterminate strings. In: Stringology, pp. 108–115 (2008)
Apostolico, A., Breslauer, D.: Of periods, quasiperiods, repetitions and covers. In: Mycielski, J., Rozenberg, G., Salomaa, A. (eds.) Structures in Logic and Computer Science. LNCS, vol. 1261, pp. 236–248. Springer, Heidelberg (1997). https://doi.org/10.1007/3-540-63246-8_14
Apostolico, A., Ehrenfeucht, A.: Efficient detection of quasiperiodicities in strings. Theor. Comput. Sci. 119(2), 247–265 (1993)
Apostolico, A., Farach, M., Iliopoulos, C.S.: Optimal superprimitivity testing for strings. Inf. Process. Lett. 39(1), 17–20 (1991)
Bacciotti, A., Rosier, L.: Liapunov Functions and Stability in Control Theory. Springer, Heidelberg (2006)
Barton, C., Kociumaka, T., Pissis, S.P., Radoszewski, J.: Efficient index for weighted sequences. In: 27th Annual Symposium on Combinatorial Pattern Matching, CPM 2016, Tel Aviv, Israel, 27–29 June 2016, pp. 4:1–4:13 (2016)
Breslauer, D.: An on-line string superprimitivity test. Inf. Process. Lett. 44(6), 345–347 (1992)
Breslauer, D.: Testing string superprimitivity in parallel. Inf. Process. Lett. 49(5), 235–241 (1994)
Brodal, G.S., Pedersen, C.N.S.: Finding maximal quasiperiodicities in strings. In: Giancarlo, R., Sankoff, D. (eds.) CPM 2000. LNCS, vol. 1848, pp. 397–411. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-45123-4_33
Brodzik, A.K.: Quaternionic periodicity transform: an algebraic solution to the tandem repeat detection problem. Bioinformatics 23(6), 694–700 (2007)
Christodoulakis, M., Iliopoulos, C., Mouchard, L., Perdikuri, K., Tsakalidis, A., Tsichlas, K.: Computation of repetitions and regularities of biologically weighted sequences. J. Comput. Biol. 13(6), 1214–1231 (2006)
Cole, R., Iliopoulos, C.S., Mohamed, M., Smyth, W.F., Yang, L.: The complexity of the minimum k-cover problem. J. Autom. Lang. Comb. 10(5–6), 641–653 (2005)
Crochemore, M., Iliopoulos, C.S., Kociumaka, T., Radoszewski, J., Rytter, W., Walen, T.: Covering problems for partial words and for indeterminate strings. Theor. Comput. Sci. 698, 25–39 (2017)
Flouri, T., et al.: Enhanced string covering. Theor. Comput. Sci. 506, 102–114 (2013)
Guo, Q., Zhang, H., Iliopoulos, C.S.: Computing the \(\lambda \)-seeds of a string. In: Cheng, S.-W., Poon, C.K. (eds.) AAIM 2006. LNCS, vol. 4041, pp. 303–313. Springer, Heidelberg (2006). https://doi.org/10.1007/11775096_28
Guo, Q., Zhang, H., Iliopoulos, C.S.: Computing the \(\lambda \)-covers of a string. Inf. Sci. 177(19), 3957–3967 (2007)
Iliopoulos, C.S., Moore, D.W., Park, K.: Covering a string. Algorithmica 16(3), 288–297 (1996)
Katok, A., Hasselblatt, B.: Introduction to the Modern Theory of Dynamical Systems, vol. 54. Cambridge University Press, Cambridge (1997)
Kociumaka, T., Kubica, M., Radoszewski, J., Rytter, W., Waleń, T.: A linear-time algorithm for seeds computation. ACM Trans. Algorithms 16(2) (2020)
Kociumaka, T., Pissis, S.P., Radoszewski, J., Rytter, W., Waleń, T.: Fast algorithm for partial covers in words. Algorithmica 73(1), 217–233 (2015)
Kolpakov, R., Kucherov, G.: Finding approximate repetitions under Hamming distance. Theor. Comput. Sci. 303(1), 135–156 (2003)
Landau, G.M., Schmidt, J.P., Sokol, D.: An algorithm for approximate tandem repeats. J. Comput. Biol. 8(1), 1–18 (2001)
Li, Y., Smyth, W.F.: Computing the cover array in linear time. Algorithmica 32(1), 95–106 (2002)
Li, M., Vitányi, P.M.B.: Kolmogorov complexity and its applications. In: Algorithms and Complexity, pp. 187–254. Elsevier (1990)
Moore, D., Smyth, W.F.: An optimal algorithm to compute all the covers of a string. Inf. Process. Lett. 50(5), 239–246 (1994)
Moore, D., Smyth, W.F.: A correction to “An optimal algorithm to compute all the covers of a string”. Inf. Process. Lett. 54(2), 101–103 (1995)
Muchnik, A., Semenov, A., Ushakov, M.: Almost periodic sequences. Theor. Comput. Sci. 304(1–3), 1–33 (2003)
Sethares, W.A., Staley, T.W.: Periodicity transforms. IEEE Trans. Signal Process. 47(11), 2953–2964 (1999)
Timmermans, M., Heijmans, R., Daniels, H.: Cyclical patterns in risk indicators based on financial market infrastructure transaction data. De Nederlandsche Bank Working Paper (558) (2017)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Blin, G., Popa, A., Raffinot, M., Uricaru, R. (2023). Approximation and Fixed Parameter Algorithms for the Approximate Cover Problem. In: Nardini, F.M., Pisanti, N., Venturini, R. (eds) String Processing and Information Retrieval. SPIRE 2023. Lecture Notes in Computer Science, vol 14240. Springer, Cham. https://doi.org/10.1007/978-3-031-43980-3_7
DOI: https://doi.org/10.1007/978-3-031-43980-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43979-7
Online ISBN: 978-3-031-43980-3
eBook Packages: Computer Science, Computer Science (R0)