A Memory Bandwidth Effective Cache Store Miss Policy

  • Conference paper
Advances in Computer Systems Architecture (ACSAC 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3740)

Abstract

Memory bandwidth is becoming increasingly important as chips approach the 10-billion-transistor era. This paper discusses and implements a memory-bandwidth-effective cache store miss policy. Although the write-allocate policy is adopted, we find that the full cache block need not be loaded from the lower level of the memory hierarchy when a cache store miss occurs, provided the block is fully modified before any load instruction accesses its unmodified data. This cache store miss policy partly relieves the pressure on memory bandwidth and improves the cache hit rate. We provide a hardware mechanism, the Store Merge Buffer, to implement the policy in the Godson-2 processor. Our experiments show encouraging results: memory bandwidth improved by almost 50% (measured with the STREAM benchmark), and IPC on SPEC CPU2000 improved by 9.4% on average.
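
The policy described in the abstract can be illustrated with a small software model. The sketch below is not the Godson-2 hardware design; it is a toy simulation under stated assumptions: a 32-byte block size, accesses that never cross a block boundary, no evictions, and loads that touch only already-written bytes being served from the buffer. The class name `StoreMergeBuffer` follows the abstract, but every field and method name is hypothetical.

```python
BLOCK_SIZE = 32  # bytes per cache block (assumed value, not from the paper)


class StoreMergeBuffer:
    """Toy model: tracks which bytes of a missing block have been written."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.full_mask = (1 << block_size) - 1
        self.pending = {}       # block number -> bitmask of written bytes
        self.cached = set()     # blocks resident in the cache
        self.block_fetches = 0  # block reads from the lower memory level

    def _split(self, addr):
        return addr // self.block_size, addr % self.block_size

    def _mask(self, offset, size):
        # One bit per byte; assumes offset + size <= block_size.
        return ((1 << size) - 1) << offset

    def store(self, addr, size):
        block, offset = self._split(addr)
        if block in self.cached:
            return  # store hit, nothing to merge
        mask = self.pending.get(block, 0) | self._mask(offset, size)
        if mask == self.full_mask:
            # Every byte was overwritten: install the block without a fetch.
            self.pending.pop(block, None)
            self.cached.add(block)
        else:
            self.pending[block] = mask

    def load(self, addr, size):
        block, offset = self._split(addr)
        if block in self.cached:
            return  # load hit
        wanted = self._mask(offset, size)
        written = self.pending.get(block, 0)
        if wanted & ~written == 0:
            return  # only written bytes are read (forwarding is assumed)
        # The load touches unmodified bytes: fetch the block and merge.
        self.block_fetches += 1
        self.pending.pop(block, None)
        self.cached.add(block)


# Block 0 is fully overwritten by four 8-byte stores: no fetch is needed.
smb = StoreMergeBuffer()
for addr in range(0, BLOCK_SIZE, 8):
    smb.store(addr, 8)
fetches_after_full_write = smb.block_fetches

# Block 1 is partly written, then a load reads unwritten bytes: one fetch.
smb.store(32, 8)
smb.load(48, 8)
fetches_after_partial_write = smb.block_fetches
```

In this model a conventional write-allocate cache would fetch both blocks, while the merged policy fetches only the second one, which is the bandwidth saving the abstract refers to for fully modified blocks.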

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rui, H., Zhang, F., Hu, W. (2005). A Memory Bandwidth Effective Cache Store Miss Policy. In: Srikanthan, T., Xue, J., Chang, CH. (eds) Advances in Computer Systems Architecture. ACSAC 2005. Lecture Notes in Computer Science, vol 3740. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11572961_61

  • DOI: https://doi.org/10.1007/11572961_61

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29643-0

  • Online ISBN: 978-3-540-32108-8

  • eBook Packages: Computer Science, Computer Science (R0)
