Abstract
A suffix tree is a fundamental data structure for string processing and information retrieval; however, its structure is still not well understood. The suffix tree reverse engineering problem, whose study aims to reduce this gap, is the following: given an ordered rooted tree T with unlabeled edges, determine whether there exists a string w such that the unlabeled-edges suffix tree of w is isomorphic to T. Previous studies of this problem consider the relaxation in which the suffix links are also given, and they assume a binary alphabet. This paper is the first to consider the suffix tree detection problem, in which the relaxation of having suffix links as input is removed. We study suffix tree detection in two scenarios that are interesting per se. First, we provide a suffix tree detection algorithm for periodic strings over a general alphabet. Given an ordered tree T with n leaves, our detection algorithm takes \(O(n+|\varSigma |^p)\) time, where p is the length, unknown in advance, of a period that repeats at least 3 times in a string S whose suffix tree structure is identical to T, if such an S exists. Therefore, it is a polynomial-time algorithm if p is a constant, and a linear-time algorithm if, in addition, the alphabet has sub-linear size. Second, we show some necessary (but insufficient) conditions for suffix tree detection for general strings over a binary alphabet. With this, we take another step towards understanding the structure of suffix trees.
Partly supported by ISF grant 1475/18 and BSF grant 2018141.
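The detection problem stated in the abstract can be made concrete with a small brute-force sketch. This is not the paper's algorithm: it simply enumerates every candidate string and so takes exponential time. The encoding of an ordered tree as nested tuples (a leaf is `()`), and the names `suffix_tree_shape` and `detect_suffix_tree`, are assumptions made for illustration only. The sketch also assumes the text is terminated by a unique `$` that sorts before every other character.

```python
from itertools import product


def _common_prefix_len(strings):
    """Length of the longest prefix shared by all strings."""
    k = 0
    for chars in zip(*strings):
        if len(set(chars)) > 1:
            break
        k += 1
    return k


def _shape(suffixes):
    """Unlabeled shape of the compacted trie of the given suffixes."""
    groups = {}
    for s in suffixes:
        groups.setdefault(s[0], []).append(s)
    children = []
    # Children are ordered by first character, with '$' first.
    for c in sorted(groups, key=lambda ch: (ch != "$", ch)):
        group = groups[c]
        if len(group) == 1:
            children.append(())  # a single suffix ends in a leaf
        else:
            # Skip the shared edge label, then branch on what remains.
            k = _common_prefix_len(group)
            children.append(_shape([s[k:] for s in group]))
    return tuple(children)


def suffix_tree_shape(w):
    """Ordered, unlabeled suffix tree shape of w + '$' ('$' must not occur in w)."""
    text = w + "$"
    return _shape([text[i:] for i in range(len(text))])


def count_leaves(tree):
    return 1 if not tree else sum(count_leaves(child) for child in tree)


def detect_suffix_tree(tree, alphabet):
    """Return some w whose suffix tree shape equals tree, or None if none exists."""
    n = count_leaves(tree)  # w + '$' has exactly n suffixes
    for letters in product(sorted(alphabet), repeat=n - 1):
        w = "".join(letters)
        if suffix_tree_shape(w) == tree:
            return w
    return None
```

For instance, `detect_suffix_tree(((), (), ()), "ab")` finds a witness string, while a tree whose leftmost root child is not a leaf has none, because the `$` edge always leads directly to a leaf.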
Notes
- 1.
- 2.
We assume that the basic terms (rooted tree, internal tree node, and tree leaf) are known.
- 3.
- 4.
- 5.
The characters \(\sigma \in \varSigma \) are assigned to the edges of T’s root from left to right according to the order of \(\varSigma \), where the first edge is labeled \(\$\).
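The labeling rule in the last note can be sketched as follows. The function name `label_root_edges` and its parameters are hypothetical, introduced only for this illustration, and the sketch assumes the alphabet is given as an iterable of single characters whose natural sort order is the order of \(\varSigma \).

```python
def label_root_edges(num_root_children, alphabet):
    """First characters of the root's outgoing edges, left to right.

    The leftmost edge is '$'; the remaining edges take the characters
    of the alphabet in increasing order.
    """
    ordered = sorted(alphabet)
    if num_root_children < 1:
        raise ValueError("the root needs at least the '$' edge")
    if num_root_children - 1 > len(ordered):
        raise ValueError("the root has more edges than available characters")
    return ["$"] + ordered[: num_root_children - 1]
```

A root with three children over the alphabet {a, b} would get the labels `$`, `a`, `b` in that order.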
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Amir, A., Kondratovsky, E., Levy, A. (2023). On Suffix Tree Detection. In: Nardini, F.M., Pisanti, N., Venturini, R. (eds) String Processing and Information Retrieval. SPIRE 2023. Lecture Notes in Computer Science, vol 14240. Springer, Cham. https://doi.org/10.1007/978-3-031-43980-3_2
DOI: https://doi.org/10.1007/978-3-031-43980-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43979-7
Online ISBN: 978-3-031-43980-3
eBook Packages: Computer Science, Computer Science (R0)