ABSTRACT
We present the first fully persistent external-memory search tree achieving amortized I/O bounds matching those of the classic (ephemeral) B-tree by Bayer and McCreight. Inserting or deleting a value in any version requires amortized O(log_B N_v) I/Os, and a range reporting query in any version requires worst-case O(log_B N_v + K/B) I/Os, where K is the number of values reported, N_v is the number of values in the version v of the tree queried or updated, and B is the external-memory block size. The data structure requires space linear in the total number of updates. Compared to the previous best bounds for fully persistent B-trees [Brodal, Sioutas, Tsakalidis, and Tsichlas, SODA 2012], this paper eliminates from the update bound an additive term of O(log_2 B) I/Os. This result matches the previous best bounds for the restricted case of partially persistent B-trees [Arge, Danner and Teh, JEA 2003]. Central to our approach is to consider the problem as a dynamic set of two-dimensional rectangles that can be merged and split.
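To make the term "fully persistent" concrete, the following minimal Python sketch shows only the interface semantics described above: every update may be applied to any existing version and yields a new version, and range queries may be asked of any version. It is not the data structure of the paper. It copies the whole set on each update, so it uses O(N_v) time and space per update and ignores the I/O model entirely. The class name `NaiveFullyPersistentSet` and its method names are hypothetical, chosen for illustration.

```python
from bisect import bisect_left, bisect_right


class NaiveFullyPersistentSet:
    """Naive fully persistent sorted set (illustration only, not I/O-efficient)."""

    def __init__(self):
        # Version 0 is the empty set. Each version is an immutable sorted tuple.
        self._versions = [()]
        # Parent pointers record the version tree (which version each one was derived from).
        self._parent = [None]

    def _derive(self, version, values):
        # Store a new version derived from `version` and return its id.
        self._versions.append(tuple(values))
        self._parent.append(version)
        return len(self._versions) - 1

    def insert(self, version, value):
        # Copy the chosen version, insert the value if absent, return the new version id.
        values = list(self._versions[version])
        i = bisect_left(values, value)
        if i == len(values) or values[i] != value:
            values.insert(i, value)
        return self._derive(version, values)

    def delete(self, version, value):
        # Copy the chosen version, remove the value if present, return the new version id.
        values = list(self._versions[version])
        i = bisect_left(values, value)
        if i < len(values) and values[i] == value:
            values.pop(i)
        return self._derive(version, values)

    def range_query(self, version, lo, hi):
        # Report all values x with lo <= x <= hi in the given version.
        values = self._versions[version]
        return list(values[bisect_left(values, lo):bisect_right(values, hi)])
```

In this sketch, updating an old version creates a branch in the version tree rather than overwriting anything, which is what distinguishes full persistence from partial persistence, where only the newest version can be updated.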
REFERENCES
- Georgy M. Adelson-Velsky and Evgenii M. Landis. 1962. An algorithm for the organization of information. Proceedings of the USSR Academy of Sciences (in Russian), 146 (1962), 263–266. English translation by Myron J. Ricci in Soviet Mathematics - Doklady, 3:1259–1263, 1962.
- Alok Aggarwal and Jeffrey Scott Vitter. 1988. The Input/Output Complexity of Sorting and Related Problems. Commun. ACM, 31, 9 (1988), 1116–1127. https://doi.org/10.1145/48529.48535
- Lars Arge, Gerth Stølting Brodal, and S. Srinivasa Rao. 2012. External Memory Planar Point Location with Logarithmic Updates. Algorithmica, 63, 1 (2012), 457–475. issn:0178-4617 https://doi.org/10.1007/s00453-011-9541-2
- Lars Arge, Andrew Danner, and Sha-Mayn Teh. 2003. I/O-efficient point location using persistent B-trees. ACM Journal of Experimental Algorithmics, 8 (2003), 22 pages. https://doi.org/10.1145/996546.996549
- Lars Arge and Jeffrey Vitter. 2003. Optimal External Memory Interval Management. SIAM J. Comput., 32 (2003), 09, 1488–1508. https://doi.org/10.1137/S009753970240481X
- Rudolf Bayer and Edward M. McCreight. 1972. Organization and Maintenance of Large Ordered Indices. Acta Informatica, 1 (1972), 173–189. https://doi.org/10.1007/BF00288683
- Bruno Becker, Stephan Gschwind, Thomas Ohler, Bernhard Seeger, and Peter Widmayer. 1996. An Asymptotically Optimal Multiversion B-Tree. The VLDB Journal, 5, 4 (1996), 264–275. https://doi.org/10.1007/s007780050028
- Michael A. Bender, Rathish Das, Martin Farach-Colton, Rob Johnson, and William Kuszmaul. 2020. Flushing Without Cascades. In Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, Salt Lake City, UT, USA, January 5-8, 2020, Shuchi Chawla (Ed.). SIAM, 650–669. https://doi.org/10.1137/1.9781611975994.40
- Michael A. Bender, Martín Farach-Colton, Rob Johnson, Simon Mauras, Tyler Mayer, Cynthia A. Phillips, and Helen Xu. 2017. Write-Optimized Skip Lists. In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS ’17). Association for Computing Machinery, New York, NY, USA. 69–78. isbn:9781450341981 https://doi.org/10.1145/3034786.3056117
- Gerth Stølting Brodal and Rolf Fagerberg. 2003. On the Limits of Cache-Obliviousness. In Proceedings of the Thirty-Fifth Annual ACM Symposium on Theory of Computing (STOC ’03). Association for Computing Machinery, New York, NY, USA. 307–315. isbn:1581136749 https://doi.org/10.1145/780542.780589
- Gerth Stølting Brodal, Spyros Sioutas, Konstantinos Tsakalidis, and Kostas Tsichlas. 2012. Fully Persistent B-trees. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2012, Kyoto, Japan, January 17-19, 2012. SIAM, 602–614. isbn:978-1-611972-11-5 issn:1557-9468 https://doi.org/10.1137/1.9781611973099.51
- Gerth Stølting Brodal, Spyros Sioutas, Konstantinos Tsakalidis, and Kostas Tsichlas. 2020. Fully persistent B-trees. Theoretical Computer Science, 841 (2020), 10–26. https://doi.org/10.1016/j.tcs.2020.06.027
- Bernard Chazelle. 1986. Filtering Search: A New Approach to Query-Answering. SIAM J. Comput., 15, 3 (1986), 703–724. https://doi.org/10.1137/0215051
- Bernard Chazelle and Leonidas J. Guibas. 1986. Fractional Cascading: I. A Data Structuring Technique. Algorithmica, 1, 2 (1986), 133–162. https://doi.org/10.1007/BF01840440
- Rathish Das, John Iacono, and Yakov Nekrich. 2022. External-memory dictionaries with worst-case update cost. arXiv:2211.06044.
- Erik D. Demaine, John Iacono, and Stefan Langerman. 2007. Retroactive Data Structures. ACM Transactions on Algorithms, 3, 2 (2007), Article 13, May, 20 pages. issn:1549-6325 https://doi.org/10.1145/1240233.1240236
- P. Dietz and D. Sleator. 1987. Two Algorithms for Maintaining Order in a List. In Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing (STOC ’87). ACM, New York, NY, USA. 365–372. isbn:0-89791-221-7 https://doi.org/10.1145/28395.28434
- James R. Driscoll, Neil Sarnak, Daniel Dominic Sleator, and Robert Endre Tarjan. 1989. Making Data Structures Persistent. J. Comput. System Sci., 38, 1 (1989), 86–124. https://doi.org/10.1016/0022-0000(89)90034-2
- Yoav Giora and Haim Kaplan. 2009. Optimal Dynamic Vertical Ray Shooting in Rectilinear Planar Subdivisions. ACM Transactions on Algorithms, 5, 3 (2009), Article 28, July, 51 pages. issn:1549-6325 https://doi.org/10.1145/1541885.1541889
- Leonidas J. Guibas and Robert Sedgewick. 1978. A Dichromatic Framework for Balanced Trees. In 19th Annual Symposium on Foundations of Computer Science, Ann Arbor, Michigan, USA, 16-18 October 1978. IEEE Computer Society, 8–21. https://doi.org/10.1109/SFCS.1978.3
- Scott Huddleston and Kurt Mehlhorn. 1982. A New Data Structure for Representing Sorted Lists. Acta Informatica, 17 (1982), 157–184. https://doi.org/10.1007/BF00288968
- Sitaram Lanka and Eric Mays. 1991. Fully Persistent B+-trees. SIGMOD Record, 20, 2 (1991), April, 426–435. issn:0163-5808 https://doi.org/10.1145/119995.115861
- David B. Lomet and Betty Salzberg. 1993. Exploiting A History Database for Backup. In Proceedings of the 19th International Conference on Very Large Data Bases (VLDB ’93). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA. 380–390. isbn:1-55860-152-X https://dl.acm.org/doi/10.5555/645919.672672
- J. Ian Munro and Yakov Nekrich. 2019. Dynamic Planar Point Location in External Memory. In 35th International Symposium on Computational Geometry, SoCG 2019, June 18-21, 2019, Portland, Oregon, USA, Gill Barequet and Yusu Wang (Eds.) (LIPIcs, Vol. 129). Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 52:1–52:15. https://doi.org/10.4230/LIPIcs.SoCG.2019.52
- Neil Sarnak and Robert Endre Tarjan. 1986. Planar Point Location Using Persistent Search Trees. Commun. ACM, 29, 7 (1986), 669–679. https://doi.org/10.1145/6138.6151
- Peter J. Varman and Rakesh M. Verma. 1997. An Efficient Multiversion Access Structure. IEEE Transactions on Knowledge and Data Engineering, 9, 3 (1997), 391–409. https://doi.org/10.1109/69.599929
- Dan E. Willard. 1985. New Data Structures for Orthogonal Range Queries. SIAM J. Comput., 14, 1 (1985), 232–253. https://doi.org/10.1137/0214019
Index Terms
- External Memory Fully Persistent Search Trees