Abstract
Many ground-level and space systems require reliability testing before deployment, since they are increasingly susceptible to transient and permanent faults. Such a process must be accurate, controllable, generic, cheap, and fast. Although gate-level fault injection is often the most appropriate solution when accuracy and controllability are required, it is very time-consuming. This work therefore proposes a hybrid fault injection framework that automatically switches between RTL and gate-level simulation modes. Using a complex 8-issue VLIW processor as a case study, we show that the injection process can be accelerated by more than \(10\times \) for transient faults and by almost \(2\times \) for permanent faults over conventional injectors, while maintaining gate-level accuracy and controllability. The proposed framework is generic, so faults can be injected into any arbitrary circuit; we demonstrate this by also injecting faults into a neural network, achieving a speedup of more than \(30\times \).
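To give a feel for why switching simulation modes can pay off, the following is a minimal back-of-the-envelope sketch, not the model used by the authors. It assumes a simple cost model in which a gate-level cycle costs a fixed multiple of an RTL cycle and only a window of cycles around the fault has to run at gate level. The function name `hybrid_speedup`, its parameters, and all numeric values are hypothetical and are not taken from the paper.

```python
def hybrid_speedup(total_cycles, gate_window_cycles, gate_cost_ratio,
                   switch_overhead=0.0):
    """Estimate the speedup of a hybrid run over a pure gate-level run.

    All costs are expressed in units of "one RTL-simulated cycle".
    This is an illustrative cost model (an assumption), not the paper's.

    total_cycles       -- cycles simulated per injection run
    gate_window_cycles -- cycles that must run in gate-level mode
    gate_cost_ratio    -- cost of one gate-level cycle relative to RTL
    switch_overhead    -- fixed cost of switching between the two modes
    """
    if not 0 <= gate_window_cycles <= total_cycles:
        raise ValueError("gate_window_cycles must lie in [0, total_cycles]")
    if gate_cost_ratio <= 0:
        raise ValueError("gate_cost_ratio must be positive")

    # Conventional injector: every cycle is simulated at gate level.
    pure_gate_cost = total_cycles * gate_cost_ratio

    # Hybrid injector: RTL outside the window, gate level inside it.
    rtl_cycles = total_cycles - gate_window_cycles
    hybrid_cost = (rtl_cycles
                   + gate_window_cycles * gate_cost_ratio
                   + switch_overhead)
    return pure_gate_cost / hybrid_cost


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show the trend.
    short_window = hybrid_speedup(1_000_000, 10_000, 50)   # brief gate-level window
    long_window = hybrid_speedup(1_000_000, 500_000, 50)   # gate level for half the run
    print(f"short gate-level window: {short_window:.2f}x")
    print(f"long gate-level window:  {long_window:.2f}x")
```

Under this assumed model, the gain shrinks quickly as the share of cycles that must stay at gate level grows. That is one plausible reading of why a fault that persists for much of the run would benefit less than a short-lived one, but the actual reasons behind the reported figures are those given in the paper itself.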
Acknowledgements
This study was financed in part by Pronex 16/0472-2 and by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.
About this article
Cite this article
Sartor, A.L., Becker, P.H.E. & Beck, A.C.S. A fast and accurate hybrid fault injection platform for transient and permanent faults. Des Autom Embed Syst 23, 3–19 (2019). https://doi.org/10.1007/s10617-018-9217-0
DOI: https://doi.org/10.1007/s10617-018-9217-0