
Concerning with On-Chip Network Features to Improve Cache Coherence Protocols for CMPs

  • Conference paper
Advances in Computer Systems Architecture (ACSAC 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4697)


Abstract

Chip multiprocessors (CMPs), in which an on-chip network connects the processor cores, are widely accepted as a promising way to use the ever-increasing transistor density of a chip efficiently. Communication in CMPs requires invalidating cached copies of a shared data block, and this coherence traffic incurs increasingly significant overhead as the number of cores grows. Conventional cache coherence protocols do not take the characteristics of the underlying network into account, for reasons of flexibility. In CMPs, however, the processor cores and the on-chip network are tightly integrated, so exposing network features to the cache coherence protocol opens up optimization opportunities. In this paper, we propose a distance-aware protocol and multi-target invalidations, which exploit network characteristics to reduce invalidation traffic overhead at negligible hardware cost. Experimental results on a 16-core CMP simulator show that the two mechanisms reduce invalidation traffic latency by 5% on average and by up to 8%.
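
The abstract does not spell out how the two mechanisms work, so the following is only an illustrative sketch of why network distance can matter for invalidation traffic, not the authors' protocol. It assumes, hypothetically, that the 16 cores form a 4x4 mesh with hop count equal to Manhattan distance, and it compares sending one invalidation per sharer with sending a single message that visits the sharers in greedy nearest-first order. The function names and the greedy ordering are assumptions made for illustration, and acknowledgement traffic is ignored.

```python
from typing import Iterable, Tuple

MESH_WIDTH = 4  # assumption: 16 cores laid out as a 4x4 mesh


def coords(core: int) -> Tuple[int, int]:
    """Map a core id (0..15) to (x, y) mesh coordinates."""
    return core % MESH_WIDTH, core // MESH_WIDTH


def hops(a: int, b: int) -> int:
    """Manhattan distance between two cores, used as the hop count."""
    ax, ay = coords(a)
    bx, by = coords(b)
    return abs(ax - bx) + abs(ay - by)


def unicast_hops(home: int, sharers: Iterable[int]) -> int:
    """Link traversals when the home node sends one invalidation per sharer."""
    return sum(hops(home, s) for s in sharers)


def chained_hops(home: int, sharers: Iterable[int]) -> int:
    """Link traversals for one message that visits sharers nearest-first.

    The greedy ordering is a hypothetical choice, not taken from the paper.
    """
    remaining = set(sharers)
    current, total = home, 0
    while remaining:
        # Pick the closest unvisited sharer; break ties by core id.
        nxt = min(remaining, key=lambda s: (hops(current, s), s))
        total += hops(current, nxt)
        remaining.remove(nxt)
        current = nxt
    return total
```

Under these assumptions, with home node 0 and sharers 3, 7, and 11, the sketch counts 12 link traversals for separate invalidations and 5 for the single chained message.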




Editor information

Lynn Choi, Yunheung Paek, Sangyeun Cho


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zeng, H., Huang, K., Wu, M., Hu, W. (2007). Concerning with On-Chip Network Features to Improve Cache Coherence Protocols for CMPs. In: Choi, L., Paek, Y., Cho, S. (eds) Advances in Computer Systems Architecture. ACSAC 2007. Lecture Notes in Computer Science, vol 4697. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74309-5_29


  • DOI: https://doi.org/10.1007/978-3-540-74309-5_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74308-8

  • Online ISBN: 978-3-540-74309-5

  • eBook Packages: Computer Science, Computer Science (R0)
