Asymmetrically reliable caches for multicore architectures under performance and energy constraints

Cluster Computing

Abstract

Cache structures in a multicore system are more vulnerable to soft errors because of high transistor density. Protecting all caches indiscriminately imposes a notable overhead on performance and energy consumption. In this study, we propose asymmetrically reliable caches, which meet the reliability needs of the system with only as much additional hardware as the performance and energy constraints allow. In our framework, a chip multiprocessor is composed of a high-reliability core, which has ECC protection, and a set of low-reliability cores, which have no protection on their data caches. Between these two types of cores, there is also a middle-level reliability core, which has only a parity check. Application threads are mapped to cores of different reliability levels based on their critical data usage. The experimental results for selected applications show that, on average, our proposed techniques improve reliability with considerable performance and energy overhead compared to traditional unsafe caches.
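The thread-to-core mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the `critical_fraction` metric, the slot counts, and the greedy "most critical thread gets the most protected core" policy are all assumptions made for illustration. Only the three protection levels (ECC, parity, none) come from the abstract.

```python
from dataclasses import dataclass

# Protection levels named in the abstract, ordered from strongest to weakest.
LEVELS = ("ecc", "parity", "none")


@dataclass
class Thread:
    name: str
    # Hypothetical metric: share of the thread's accesses that touch critical data.
    critical_fraction: float


def map_threads(threads, capacity):
    """Greedily place the most critical threads on the most protected cores.

    capacity maps each protection level to the number of thread slots
    available at that level (an assumed, illustrative parameter).
    """
    ranked = sorted(threads, key=lambda t: t.critical_fraction, reverse=True)
    remaining = dict(capacity)
    mapping = {}
    for thread in ranked:
        for level in LEVELS:
            if remaining.get(level, 0) > 0:
                mapping[thread.name] = level
                remaining[level] -= 1
                break
        else:
            # No slot was left at any protection level.
            raise ValueError("not enough core slots for all threads")
    return mapping


threads = [
    Thread("t0", 0.10),
    Thread("t1", 0.80),
    Thread("t2", 0.45),
    Thread("t3", 0.05),
]
# One ECC core, one parity core, two unprotected cores (assumed configuration).
mapping = map_threads(threads, {"ecc": 1, "parity": 1, "none": 2})
```

With this assumed configuration, the thread with the largest share of critical data lands on the ECC core, the next on the parity core, and the remaining two on unprotected cores. A real policy would also have to weigh the performance and energy constraints that the abstract mentions, which this sketch ignores.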


Acknowledgments

This research was supported by The Scientific and Technological Research Council of Turkey (TUBITAK) with a research grant (Project Number: 113E530).

Author information

Corresponding author

Correspondence to Haluk Rahmi Topcuoglu.

Cite this article

Arslan, S., Topcuoglu, H.R., Kandemir, M.T. et al. Asymmetrically reliable caches for multicore architectures under performance and energy constraints. Cluster Comput 19, 1819–1833 (2016). https://doi.org/10.1007/s10586-016-0641-2

  • DOI: https://doi.org/10.1007/s10586-016-0641-2
