Abstract
We present a self-adjusting layout scheme for suffix trees in secondary storage that achieves an optimal number of disk accesses for a sequence of string or substring queries. This has been an open problem since Sleator and Tarjan presented their splaying technique for creating self-adjusting binary search trees in 1985. Beyond resolving this open problem, our scheme offers two further advantages: 1) the partitions are readjusted slowly, requiring fewer disk accesses than splaying methods, and 2) the initial state of the layout is balanced, making it useful even when the sequence of queries is not highly skewed. Our method is also applicable to PATRICIA trees, and potentially to other data structures.
References
Barbay, J., Golynski, A., Munro, J.I., Rao, S.S.: Adaptive searching in succinctly encoded binary relations and tree-structured documents. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 24–35. Springer, Heidelberg (2006)
Bedathur, S., Haritsa, J.: Search-optimized suffix-tree storage for biological applications. In: Proc. 12th IEEE International Conference on High Performance Computing, pp. 29–39. IEEE Computer Society Press, Los Alamitos (2005)
Bell, J., Gupta, G.: An evaluation of self-adjusting binary search tree techniques. Software - Practice and Experience 23(4), 369–382 (1993)
Brodal, G.S., Fagerberg, R.: Cache-oblivious string dictionaries. In: Proc. 17th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 581–590. ACM Press, New York (2006)
Ciriani, V., Ferragina, P., Luccio, F., Muthukrishnan, S.: Static optimality theorem for external memory string access. In: Proc. 43rd Annual Symposium on Foundations of Computer Science, pp. 219–227 (2002)
Clark, D.R., Munro, J.I.: Efficient suffix trees on secondary storage. In: Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 383–391. ACM Press, New York (1996)
Farach-Colton, M., Ferragina, P., Muthukrishnan, S.: On the sorting-complexity of suffix tree construction. Journal of the ACM 47(6), 987–1011 (2000)
Grossi, R., Vitter, J.S.: Compressed suffix arrays and suffix trees with applications to text indexing and string matching. In: Proc. 32nd Annual ACM Symposium on Theory of Computing, pp. 397–406. ACM Press, New York (2000)
Ko, P., Aluru, S.: Obtaining provably good performance from suffix trees in secondary storage. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 72–83. Springer, Heidelberg (2006)
Kurtz, S.: Reducing the space requirement of suffix trees. Software - Practice and Experience 29(13), 1149–1171 (1999)
Munro, J.I., Raman, V., Rao, S.S.: Space efficient suffix trees. J. Algorithms 39(2), 205–222 (2001)
Sleator, D.D., Tarjan, R.E.: Self-adjusting binary search trees. Journal of the ACM 32(3), 652–686 (1985)
Williams, H.E., Zobel, J., Heinz, S.: Self-adjusting trees in practice for large text collections. Software - Practice and Experience 31(10), 925–939 (2001)
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ko, P., Aluru, S. (2007). Optimal Self-adjusting Trees for Dynamic String Data in Secondary Storage. In: Ziviani, N., Baeza-Yates, R. (eds) String Processing and Information Retrieval. SPIRE 2007. Lecture Notes in Computer Science, vol 4726. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75530-2_17
DOI: https://doi.org/10.1007/978-3-540-75530-2_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75529-6
Online ISBN: 978-3-540-75530-2
eBook Packages: Computer Science (R0)