ABSTRACT
Parallel disks provide a cost-effective way of speeding up I/O in applications that work with large amounts of data. The main challenge is to achieve as much parallelism as possible, using prefetching to avoid bottlenecks in disk access. Efficient algorithms have been developed for some particular patterns of disk-block accesses. In this paper, we consider general request sequences. When the request sequence consists of unique block requests, the problem is called prefetching and is well solved for arbitrary request sequences. When the reference sequence can contain repeated references to the same block, an effective caching policy must be devised as well. While optimal offline algorithms for the problem have recently been designed, no effective online algorithm was previously known. Our main contribution is a deterministic online algorithm, threshold-LRU, which achieves an O((MD/L)^(2/3)) competitive ratio, and a randomized online algorithm, threshold-MARK, which achieves an O(√(MD/L) log(MD/L)) competitive ratio for the caching/prefetching problem on the parallel disk model (PDM), where D is the number of disks, M is the size of the fast memory buffer, and M + L is the amount of lookahead available in the request sequence. The best-known lower bound on the competitive ratio is Ω(√(MD/L)) for lookahead L ≥ M in both the deterministic and randomized models. We also show that if the deterministic online algorithm is allowed twice the memory of the offline algorithm, then a tight competitive ratio of Θ(√(MD/L)) can be achieved. This problem generalizes the well-known paging problem on a single disk to the parallel disk model.
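To make the quoted bounds concrete, the following sketch evaluates them for sample PDM parameters. This is illustrative arithmetic only, not an implementation of threshold-LRU or threshold-MARK; the function names and the sample values of M, D, and L are our own choices, not taken from the paper.

```python
import math

# M = fast-memory (cache) size in blocks, D = number of disks,
# M + L = lookahead into the request sequence, as in the abstract.

def det_ratio(M, D, L):
    """Deterministic upper bound quoted for threshold-LRU: O((MD/L)^(2/3))."""
    return (M * D / L) ** (2 / 3)

def rand_ratio(M, D, L):
    """Randomized upper bound quoted for threshold-MARK: O(sqrt(MD/L) log(MD/L))."""
    x = M * D / L
    return math.sqrt(x) * math.log(x)

def lower_bound(M, D, L):
    """Best-known lower bound: Omega(sqrt(MD/L)); tight when the online
    algorithm has twice the offline memory."""
    return math.sqrt(M * D / L)

if __name__ == "__main__":
    # Hypothetical setting: M = 2**18 cached blocks, D = 16 disks,
    # lookahead L = 2**16 blocks, so MD/L = 64.
    M, D, L = 2**18, 16, 2**16
    print(det_ratio(M, D, L))    # 64^(2/3) = 16
    print(rand_ratio(M, D, L))   # 8 * ln 64 ≈ 33.3
    print(lower_bound(M, D, L))  # sqrt(64) = 8
```

The gap between the deterministic upper bound, 16, and the lower bound, 8, in this example reflects the (MD/L)^(1/6) gap the abstract leaves open for deterministic algorithms with equal memory.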