Energy-efficient hybrid coherence protocol for multicore processors

Cluster Computing

Abstract

In multicore processors, a cache coherence protocol maintains data coherence among private caches. Traditional snoop-based protocols maintain coherence by broadcasting messages, which results in many tag comparisons in the private caches. These broadcasts and the tag comparisons they cause consume a considerable amount of energy and execution time. In this paper, an energy-efficient hybrid coherence protocol (EEHCP) is proposed that reduces the energy consumed in maintaining the coherence of private caches by reducing the number of broadcast messages and tag comparisons. According to the simulation results, the proposed EEHCP consumes 27%, 18%, and 46% less energy than the snoop filter table (SFT) protocol, the snoop-based protocol with the hybrid write strategy, and the traditional snoop-based protocol with MESI, respectively, and requires 18%, 17%, and 32% less execution time than these three protocols.
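The link between broadcast messages and tag comparisons can be illustrated with a toy model. The sketch below is not the EEHCP design, which the abstract does not describe. It only counts tag lookups for one write under pure broadcast and under a hypothetical sharer-tracking filter. All names (`PrivateCache`, `broadcast_write`, `filtered_write`, `demo`) and the four-cache setup are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PrivateCache:
    """Toy private cache: only the set of block addresses it holds."""
    blocks: set = field(default_factory=set)
    tag_comparisons: int = 0

    def snoop(self, addr: int) -> bool:
        # Every snoop that reaches this cache costs one tag lookup.
        self.tag_comparisons += 1
        return addr in self.blocks


def broadcast_write(caches, writer, addr):
    """Pure broadcast: all other caches are snooped; hits are invalidated."""
    for i, cache in enumerate(caches):
        if i != writer and cache.snoop(addr):
            cache.blocks.discard(addr)
    caches[writer].blocks.add(addr)


def filtered_write(caches, sharers, writer, addr):
    """Hypothetical filter: snoop only caches recorded as sharers of addr."""
    for i in sharers.get(addr, set()) - {writer}:
        if caches[i].snoop(addr):
            caches[i].blocks.discard(addr)
    caches[writer].blocks.add(addr)
    sharers[addr] = {writer}  # the writer is now the only holder


def total_comparisons(caches):
    return sum(cache.tag_comparisons for cache in caches)


def make_caches(n, addr, holders):
    caches = [PrivateCache() for _ in range(n)]
    for i in holders:
        caches[i].blocks.add(addr)
    return caches


def demo():
    """Cache 0 writes a block shared by caches 0 and 1 in a 4-cache system."""
    addr = 0x40
    plain = make_caches(4, addr, holders=[0, 1])
    broadcast_write(plain, writer=0, addr=addr)

    filtered = make_caches(4, addr, holders=[0, 1])
    filtered_write(filtered, {addr: {0, 1}}, writer=0, addr=addr)

    return total_comparisons(plain), total_comparisons(filtered)


if __name__ == "__main__":
    print(demo())  # (3, 1): broadcast snoops three caches, the filter one
```

In this toy setup the broadcast write performs three tag comparisons and the filtered write performs one. The counts say nothing about the energy figures reported in the paper, which come from the authors' own simulations.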

Acknowledgements

This research was supported by the Ministry of Science and Technology, Taiwan (MOST 104-2221-E-035-005- and MOST 105-2221-E-035-062-).

Author information

Corresponding author

Correspondence to Ching-Wen Chen.

About this article

Cite this article

Chen, CW., Hsia, A., Zhan, YW. et al. Energy-efficient hybrid coherence protocol for multicore processors. Cluster Comput 21, 1521–1541 (2018). https://doi.org/10.1007/s10586-018-1947-z

  • DOI: https://doi.org/10.1007/s10586-018-1947-z
