Abstract
External memory models, most notably the I-O model [3], capture the effects of the memory hierarchy and aid in algorithm design. More than a decade of architectural advancements have led to new features not captured in the I-O model, most notably the prefetching capability. We propose a relatively simple Prefetch model that incorporates data prefetching into the traditional I-O models and show how to design algorithms that can attain close to peak memory bandwidth. Unlike (the inverse of) memory latency, memory bandwidth is much closer to the processing speed; intelligent use of prefetching can therefore considerably mitigate the I-O bottleneck. For some fundamental problems, our algorithms attain running times approaching those of idealized Random Access Machines under reasonable assumptions. Our work also explains the significantly superior performance of I-O efficient algorithms on systems that support prefetching compared to ones that do not.
References
Aggarwal, A., Alpern, B., Chandra, A., Snir, M.: A model for hierarchical memory. In: Proceedings of ACM Symposium on Theory of Computing (1987)
Aggarwal, A., Chandra, A., Snir, M.: Hierarchical memory with block transfer. In: Proceedings of IEEE Foundations of Computer Science, pp. 204–216 (1987)
Aggarwal, A., Vitter, J.: The input/output complexity of sorting and related problems. Communications of the ACM 31(9), 1116–1127 (1988)
Alpern, B., Carter, L., Feig, E., Selker, T.: The uniform memory hierarchy model of computation. Algorithmica 12(2), 72–109 (1994)
Brodal, G.S., Fagerberg, R.: On the limits of cache-obliviousness. In: Proceedings of STOC, pp. 307–315 (2003)
Chaudhry, G., Cormen, T.H.: Getting more for out-of-core columnsort. In: Mount, D.M., Stein, C. (eds.) ALENEX 2002. LNCS, vol. 2409, p. 143. Springer, Heidelberg (2002)
Chen, T., Baer, J.: Effective hardware-based data prefetching for high-performance processors. IEEE Transactions on Computers 44(5), 609–623 (1995)
Cormen, T.H., Sundquist, T., Wisniewski, L.F.: Asymptotically tight bounds for performing BMMC permutations on parallel disk systems. SIAM Journal on Computing 28(1), 105–136 (1999)
Dementiev, R., Sanders, P.: Asynchronous parallel disk sorting. In: Proceedings of SPAA (2003)
Adiga, N.R., et al.: An overview of the BlueGene/L supercomputer. In: Proceedings of Supercomputing (SC) (2002)
Floyd, R.: Permuting information in idealized two-level storage. Complexity of Computer Computations, 105–109 (1972)
Frigo, M., Leiserson, C.E., Prokop, H., Ramachandran, S.: Cache-oblivious algorithms. In: Proceedings of FOCS (1999)
Worthington, B., Ganger, G., Patt, Y.: The DiskSim simulation environment (version 2.0). Available at: http://www.ece.cmu.edu/~ganger/disksim/
Hong, J.-W., Kung, H.T.: I/O complexity: The red-blue pebble game. In: Proceedings of the 13th Symposium on the Theory of Computing (May 1981)
Iyer, S., Druschel, P.: Anticipatory scheduling: A disk scheduling framework to overcome deceptive idleness in synchronous I/O. In: Proceedings of SOSP (2001)
Kallahalla, M., Varman, P.J.: Optimal read-once parallel disk scheduling. In: Proceedings of IOPADS, pp. 68–77 (1999)
Lund, K., Goebel, V.: Adaptive disk scheduling in a multimedia DBMS. In: Proceedings of ACM Multimedia (2003)
Meyer, U., Zeh, N.: I/O-efficient undirected shortest paths. In: Di Battista, G., Zwick, U. (eds.) ESA 2003. LNCS, vol. 2832, pp. 434–445. Springer, Heidelberg (2003)
Nesbit, K.J., Smith, J.E.: Data cache prefetching using a global history buffer. In: Proceedings of HPCA, pp. 96–105 (2004)
Sen, S., Chatterjee, S., Dumir, N.: Towards a theory of cache-efficient algorithms. Journal of the ACM (2002)
Verma, A., Sen, S.: Model and algorithms for prefetching in memory hierarchy. Working draft (2005). Available at: http://www.research.ibm.com/people/a/akshat_verma/akshat_verma.wip.html/FILE/prefetch_main.ps
Vishkin, U.: Can parallel algorithms enhance serial implementation? Communications of the ACM (1996)
Vitter, J., Shriver, E.: Algorithms for parallel memory I: Two-level memories. Algorithmica 12(2), 110–147 (1994)
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Verma, A., Sen, S. (2006). Algorithmic Ramifications of Prefetching in Memory Hierarchy. In: Robert, Y., Parashar, M., Badrinath, R., Prasanna, V.K. (eds) High Performance Computing - HiPC 2006. HiPC 2006. Lecture Notes in Computer Science, vol 4297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11945918_8
DOI: https://doi.org/10.1007/11945918_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68039-0
Online ISBN: 978-3-540-68040-6
eBook Packages: Computer Science, Computer Science (R0)