Abstract
A suffix tree can efficiently locate a pattern in an indexed string, but it cannot, in general, locate the most recent occurrence of the pattern in an online stream, which some applications require. We study the most general version of the problem of locating a most recent match: supporting queries for arbitrary patterns at each step of processing an online stream. We present augmentations to Ukkonen’s suffix tree construction algorithm that support optimal-time queries while keeping indexing time within a logarithmic factor in the size of the indexed string. We show that the algorithm is applicable to sliding-window indexing, and sketch a possible optimization for the special case of Lempel-Ziv compression.
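To make the query concrete, the following is a minimal brute-force sketch of what a most recent match query returns on an online stream. It is not the augmented suffix tree described in the abstract and has none of its time bounds: each query rescans the whole text. The class name `NaiveRecentMatchIndex` and its methods are hypothetical, chosen only for illustration.

```python
class NaiveRecentMatchIndex:
    """Brute-force stand-in for an online index (NOT a suffix tree).

    Hypothetical illustration: it only specifies the expected answers
    of a most recent match query, not an efficient way to compute them.
    """

    def __init__(self) -> None:
        self._chars: list[str] = []  # characters received so far

    def append(self, ch: str) -> None:
        """Process one more character of the online stream."""
        self._chars.append(ch)

    def most_recent(self, pattern: str) -> int:
        """Return the start position of the rightmost occurrence of
        `pattern` in the text seen so far, or -1 if it does not occur."""
        return "".join(self._chars).rfind(pattern)


index = NaiveRecentMatchIndex()
for ch in "abcab":
    index.append(ch)
early_answer = index.most_recent("ab")   # queried mid-stream

for ch in "xabc":
    index.append(ch)
late_answer = index.most_recent("ab")    # same pattern, later in the stream
```

The same pattern can yield different answers at different steps of the stream, which is why queries must be supported after every appended character rather than only once indexing is complete.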
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Larsson, N.J. (2014). Most Recent Match Queries in On-Line Suffix Trees. In: Kulikov, A.S., Kuznetsov, S.O., Pevzner, P. (eds) Combinatorial Pattern Matching. CPM 2014. Lecture Notes in Computer Science, vol 8486. Springer, Cham. https://doi.org/10.1007/978-3-319-07566-2_26
DOI: https://doi.org/10.1007/978-3-319-07566-2_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07565-5
Online ISBN: 978-3-319-07566-2
eBook Packages: Computer Science (R0)