Abstract
Multidimensional data are widely used in real-life applications. Intel’s new brand of SSDs, called 3D XPoint, is an example of three-dimensional data. Motivated by a structural analysis of multidimensional data, we introduce the multidimensional period recovery problem, defined as follows. The input is a d-dimensional text array, with dimensions \(n_1 \times n_2 \times \dots \times n_d\), that contains corruptions, while the original text without the corruptions is periodic. The goal is to report the period of the original text. We show that, if the number of corruptions is at most , where \(\epsilon > 0\) and \(p_1 \times \cdots \times p_d\) are the period’s dimensions, then the number of possible period candidates is \(O(\log N)\), where \(N = \prod_{i=1}^{d} n_i\). The independence of this bound from the number of dimensions is a surprising key contribution of this paper. For any constant dimension d, we present an \(O(\prod_{i=1}^{d} n_i \prod_{i=1}^{d} \log n_i)\)-time algorithm (linear time up to logarithmic factors) that reports these candidates. The bound on the number of errors that permits a small candidate set is tight: if the number of errors is equal to , a family of texts with \(\Theta(N)\) period candidates can be constructed for any dimension \(d \ge 2\).
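To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm of the paper. It assumes that a text is periodic with period \(p_1 \times \cdots \times p_d\) when every cell equals the cell whose coordinates are reduced modulo the period dimensions, and that corruptions are symbol substitutions. The names `corruption_count`, `periods_within_budget`, and `budget` are hypothetical; `budget` stands in for the error bound, which the caller must supply.

```python
from collections import Counter, defaultdict
from itertools import product


def corruption_count(text, dims, period):
    """Fewest cell substitutions that make `text` periodic with `period`.

    text:   dict mapping coordinate tuples to symbols (assumed representation)
    dims:   (n_1, ..., n_d), the text dimensions
    period: (p_1, ..., p_d), the candidate period dimensions
    """
    # Group cells by their coordinates modulo the period dimensions.
    classes = defaultdict(Counter)
    for coord in product(*(range(n) for n in dims)):
        residue = tuple(c % p for c, p in zip(coord, period))
        classes[residue][text[coord]] += 1
    # In each class, keep the majority symbol and count the rest as errors.
    return sum(sum(c.values()) - max(c.values()) for c in classes.values())


def periods_within_budget(text, dims, budget):
    """Brute force: every period whose error count is at most budget(period).

    `budget` is a caller-supplied function (an assumption of this sketch).
    """
    hits = []
    for period in product(*(range(1, n + 1) for n in dims)):
        if corruption_count(text, dims, period) <= budget(period):
            hits.append(period)
    return hits


if __name__ == "__main__":
    # A 4 x 6 text built from a 2 x 3 period, with one corrupted cell.
    tile = [["a", "b", "c"], ["d", "e", "f"]]
    dims = (4, 6)
    text = {(i, j): tile[i % 2][j % 3] for i in range(4) for j in range(6)}
    text[(3, 4)] = "x"
    print(corruption_count(text, dims, (2, 3)))
```

The sketch enumerates all \(N\) candidate periods and scans the whole text for each, so it takes roughly \(N^2\) steps. It also shows why the error bound must depend on the period dimensions: the trivial period \(n_1 \times \cdots \times n_d\) always has zero errors.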
A. Amir—Partly supported by ISF grant 1475/18 and BSF grant 2018141.
D. Sokol—Partly supported by BSF grant 2018141.
Notes
- 1. This notion should not be confused with other notions of primitivity in stringology, such as in covers. The difference in the definition of primitivity for covers stems from the fact that the string must end with a complete occurrence of a cover, which is not the case for a period.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Amir, A., Butman, A., Kondratovsky, E., Levy, A., Sokol, D. (2020). Multidimensional Period Recovery. In: Boucher, C., Thankachan, S.V. (eds) String Processing and Information Retrieval. SPIRE 2020. Lecture Notes in Computer Science, vol 12303. Springer, Cham. https://doi.org/10.1007/978-3-030-59212-7_9
DOI: https://doi.org/10.1007/978-3-030-59212-7_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59211-0
Online ISBN: 978-3-030-59212-7
eBook Packages: Computer Science (R0)