The influence of caches on the performance of heaps

ACM Journal of Experimental Algorithmics, Volume 1, pp. 4–es, https://doi.org/10.1145/235141.235145

Published: 01 January 1996

Abstract

As memory access times grow larger relative to processor cycle times, the cache performance of algorithms has an increasingly large impact on overall performance. Unfortunately, most commonly used algorithms were not designed with cache performance in mind. This paper investigates the cache performance of implicit heaps. We present optimizations that significantly reduce the cache misses that heaps incur and improve their overall performance. We present an analytical model, called collective analysis, that allows cache performance to be predicted as a function of both cache configuration and algorithm configuration. As part of our investigation, we perform an approximate analysis of the cache performance of both traditional heaps and our improved heaps in our model. In addition, empirical data are given for five architectures to show the impact our optimizations have on overall performance. We also revisit a priority queue study originally performed by Jones [25]. Due to the increases in cache miss penalties, the relative performance results we obtain on today's machines differ greatly from those obtained on the machines of only ten years ago. We compare the performance of implicit heaps, skew heaps, and splay trees and discuss the difference between our results and Jones's.
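For readers unfamiliar with the term, an implicit heap stores the tree in a flat array and computes parent and child positions from indices instead of following pointers. The Python sketch below is only an illustration of that layout and is not taken from the paper or its software suite. The class name `ImplicitHeap` and the fanout parameter `d` are assumptions made for this example; the abstract does not say which optimizations or parameters the authors use.

```python
class ImplicitHeap:
    """Min-heap kept in a flat list; tree links are computed, not stored.

    The fanout d is a hypothetical knob for illustration only.
    """

    def __init__(self, d=2):
        if d < 2:
            raise ValueError("fanout must be at least 2")
        self.d = d
        self.a = []

    def __len__(self):
        return len(self.a)

    def push(self, key):
        a, d = self.a, self.d
        a.append(key)
        i = len(a) - 1
        # Sift up: the parent of slot i is slot (i - 1) // d.
        while i > 0:
            p = (i - 1) // d
            if a[p] <= a[i]:
                break
            a[p], a[i] = a[i], a[p]
            i = p

    def pop_min(self):
        a, d = self.a, self.d
        if not a:
            raise IndexError("pop from empty heap")
        top = a[0]
        last = a.pop()
        if a:
            a[0] = last
            i, n = 0, len(a)
            # Sift down: the children of slot i are slots d*i + 1 .. d*i + d.
            while True:
                first = d * i + 1
                if first >= n:
                    break
                c = min(range(first, min(first + d, n)), key=a.__getitem__)
                if a[i] <= a[c]:
                    break
                a[i], a[c] = a[c], a[i]
                i = c
        return top
```

Because the children of a node occupy consecutive array slots, the choice of fanout and of where the array starts in memory determines how many cache blocks a sift-down touches. That is the kind of interaction between algorithm configuration and cache configuration the abstract refers to, although Python lists of objects do not expose it the way a C array would.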

Supplemental Material

Available for download:

- p4-lamarca.tar (136 KB): The software suite accompanying the article.
- vol1nbr4.ps (389.5 KB)
- vol1nbr4.tex.tar (900 KB)

References

[1] A. Agarwal, M. Horowitz, and J. Hennessy. An analytical cache model. ACM Transactions on Computer Systems, 7:2:184-215, 1989.
[2] R. Agarwal, F. Gustavson, and M. Zubair. Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms. IBM Journal of Research and Development, 38:5:563-576, Sep 1994.
[3] A. Aggarwal, K. Chandra, and M. Snir. A model for hierarchical memory. In 19th Annual ACM Symposium on Theory of Computing, pages 305-314, 1987.
[4] A. Aho, J. Hopcroft, and J. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, Massachusetts, 1974.
[5] B. Alpern, L. Carter, E. Feig, and T. Selker. The uniform memory hierarchy model of computation. Algorithmica, 12:2-3:72-109, 1994.
[6] J. Anderson and M. Lam. Global optimizations for parallelism and locality on scalable parallel machines. In Proceedings of the 1993 ACM Symposium on Programming Languages Design and Implementation, pages 112-125. ACM, 1993.
[7] S. Carlsson. An optimal algorithm for deleting the root of a heap. Information Processing Letters, 37:2:117-120, 1991.
[8] S. Carr, K. McKinley, and C. W. Tseng. Compiler optimizations for improving data locality. In Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 252-262, 1994.
[9] M. Cierniak and Wei Li. Unifying data and control transformations for distributed shared-memory machines. In Proceedings of the 1995 ACM Symposium on Programming Languages Design and Implementation, pages 205-217. ACM, 1995.
[10] D. Clark. Cache performance of the VAX-11/780. ACM Transactions on Computer Systems, 1:1:24-37, 1983.
[11] E. Coffman and P. Denning. Operating Systems Theory. Prentice-Hall, Englewood Cliffs, NJ, 1973.
[12] T. Cormen, C. Leiserson, and R. Rivest. Introduction to Algorithms. The MIT Press, Cambridge, MA, 1990.
[13] J. de Graaf and W. Kosters. Expected heights in heaps. BIT, 32:4:570-579, 1992.
[14] E. Doberkat. Inserting a new element into a heap. BIT, 21:225-269, 1981.
[15] E. Doberkat. Deleting the root of a heap. Acta Informatica, 17:245-265, 1982.
[16] J. Dongarra, O. Brewer, J. Kohl, and S. Fineberg. A tool to aid in the design, implementation, and understanding of matrix algorithms for parallel processors. Journal of Parallel and Distributed Computing, 9:2:185-202, June 1990.
[17] M. Farrens, G. Tyson, and A. Pleszkun. A study of single-chip processor/cache organizations for large numbers of transistors. In Proceedings of the 21st Annual International Symposium on Computer Architecture, pages 338-347, 1994.
[18] D. Fenwick, D. Foley, W. Gist, S. VanDoren, and D. Wissell. The AlphaServer 8000 series: High-end server platform development. Digital Technical Journal, 7:1:43-65, 1995.
[19] Robert W. Floyd. Treesort 3. Communications of the ACM, 7:12:701, 1964.
[20] D. Gannon, W. Jalby, and K. Gallivan. Strategies for cache and local memory management by global program transformation. Journal of Parallel and Distributed Computing, 5:5:587-616, Oct 1988.
[21] G. Gonnet and J. Munro. Heaps on heaps. SIAM Journal on Computing, 15:4:964-971, 1986.
[22] D. Grunwald, B. Zorn, and R. Henderson. Improving the cache locality of memory allocation. In Proceedings of the 1993 ACM Symposium on Programming Languages Design and Implementation, pages 177-186. ACM, 1993.
[23] J. Hennessy and D. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann Publishers, Inc., San Mateo, CA, 1990.
[24] D.B. Johnson. Priority queues with update and finding minimum spanning trees. Information Processing Letters, 4, 1975.
[25] D. Jones. An empirical comparison of priority-queue and event-set implementations. Communications of the ACM, 29:4:300-311, 1986.
[26] K. Kennedy and K. McKinley. Optimizing for parallelism and data locality. In Proceedings of the 1992 International Conference on Supercomputing, pages 323-334, 1992.
[27] D.E. Knuth. The Art of Computer Programming, vol III-Sorting and Searching. Addison-Wesley, Reading, MA, 1973.
[28] A. LaMarca. Caches and algorithms. Ph.D. Dissertation, University of Washington, May 1996.
[29] A. LaMarca and R.E. Ladner. The influence of caches on the performance of sorting. Technical Report 96-10-01, University of Washington, Department of Computer Science and Engineering, 1992. Also appears in the Proceedings of the Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, January 1997.
[30] A. Lebeck and D. Wood. Cache profiling and the SPEC benchmarks: a case study. Computer, 27:10:15-26, Oct 1994.
[31] M. Martonosi, A. Gupta, and T. Anderson. Memspy: analyzing memory system bottlenecks in programs. In Proceedings of the 1992 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, pages 1-12, 1992.
[32] D. Naor, C. Martel, and N. Matloff. Performance of priority queue structures in a virtual memory environment. Computer Journal, 34:5:428-437, Oct 1991.
[33] G. Rao. Performance analysis of cache memories. Journal of the ACM, 25:3:378-395, 1978.
[34] R. Sedgewick. Algorithms. Addison-Wesley, Reading, MA, 1988.
[35] J.P. Singh, H.S. Stone, and D.F. Thiebaut. A model of workloads and its use in miss-rate prediction for fully associative caches. IEEE Transactions on Computers, 41:7:811-825, 1992.
[36] D. Sleator and R. Tarjan. Self-adjusting binary search trees. Journal of the ACM, 32:3:652-686, 1985.
[37] Amitabh Srivastava and Alan Eustace. ATOM: A system for building customized program analysis tools. In Proceedings of the 1994 ACM Symposium on Programming Languages Design and Implementation, pages 196-205. ACM, 1994.
[38] O. Temam, C. Fricker, and W. Jalby. Cache interference phenomena. In Proceedings of the 1994 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, pages 261-271, 1994.
[39] R. Uhlig, D. Nagle, T. Stanley, T. Mudge, S. Sechrest, and R. Brown. Design tradeoffs for software-managed TLBs. ACM Transactions on Computer Systems, 12:3:175-205, 1994.
[40] M. Weiss. Data structures and algorithm analysis. Benjamin/Cummings Pub. Co., Redwood City, CA, 1995.
[41] H. Wen and J. L. Baer. Efficient trace-driven simulation methods for cache performance analysis. ACM Transactions on Computer Systems, 9:3:222-241, 1991.
[42] J. W. Williams. Heapsort. Communications of the ACM, 7:6:347-348, 1964.
[43] M. Wolf and M. Lam. A data locality optimizing algorithm. In Proceedings of the 1991 ACM Symposium on Programming Languages Design and Implementation, pages 30-44. ACM, 1991.

Published in

ACM Journal of Experimental Algorithmics, Volume 1 (1996), 104 pages
ISSN: 1084-6654
EISSN: 1084-6654
DOI: 10.1145/235141
Editor: Bernard M. E. Moret, Univ. of New Mexico, Albuquerque

Copyright © 1996 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States