ABSTRACT
In this paper, we present lower bounds for permuting and sorting in the cache-oblivious model. We prove that (1) I/O-optimal cache-oblivious comparison-based sorting is not possible without a tall cache assumption, and (2) there does not exist an I/O-optimal cache-oblivious algorithm for permuting, not even in the presence of a tall cache assumption. Our results for sorting show the existence of an inherent trade-off in the cache-oblivious model between the strength of the tall cache assumption and the overhead for the case M ≫ B, and show that Funnelsort and recursive binary mergesort are optimal algorithms in the sense that they attain this trade-off.
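For orientation, the following is a minimal sketch of recursive binary mergesort, one of the two algorithms named above. It is not taken from the paper; the function names `merge` and `binary_mergesort` are illustrative assumptions. The point it shows is that the code never refers to the memory size M or the block size B, which is what makes the algorithm cache-oblivious.

```python
def merge(left, right):
    """Merge two sorted lists with a single sequential pass over each."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Taking from the left on ties keeps the sort stable.
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    # At most one of these two tails is non-empty.
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def binary_mergesort(a):
    """Recursive binary mergesort (hypothetical helper name).

    No parameter depends on M or B: the recursion simply halves the
    input until subproblems are trivially sorted.
    """
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    return merge(binary_mergesort(a[:mid]), binary_mergesort(a[mid:]))


if __name__ == "__main__":
    print(binary_mergesort([5, 2, 9, 1, 5, 6]))
```

Python lists do not model block transfers, so this sketch illustrates only the structure of the recursion. In the ideal-cache model, a binary mergesort of this shape is generally analysed as using O((N/B) log_2(N/M)) I/Os, compared with the Θ((N/B) log_{M/B}(N/B)) sorting bound of the I/O model; the gap between the two is the overhead for M ≫ B that the abstract refers to.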
Index Terms
- On the limits of cache-obliviousness