Prevention from Soft Errors via Architecture Elasticity

  • Regular Paper
Journal of Computer Science and Technology

Abstract

As threshold voltages decrease, feature sizes shrink, and the number of on-chip transistors grows exponentially, modern processors become increasingly vulnerable to soft errors. Traditional soft error mitigation mechanisms, however, act only after an error has been detected. Instead of such passive responses, this paper proposes a novel mechanism that proactively prevents soft errors from occurring via architecture elasticity. Guided by a predictive model, we adapt the processor architecture holistically and dynamically. By leveraging an artificial neural network, the predictive model quickly and accurately predicts the simulation target across different program execution phases on any architecture configuration. Experimental results on the SPEC CPU2000 benchmarks show that our method inherently reduces the soft error rate by 33.2% and improves energy efficiency by 18.3% compared with a statically configured processor.
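The abstract does not give the adaptation procedure, so the following is only a minimal sketch of the general idea of prediction-guided reconfiguration, not the authors' method. It assumes a predictor that maps a program phase and a candidate configuration to a predicted soft error rate and energy efficiency, and picks the configuration with the lowest predicted soft error rate that still meets an efficiency floor. Every name here (`toy_predictor`, `choose_config`, `CONFIGS`, the `phase_ilp` feature, the two-field configuration) is hypothetical, and the formulas inside `toy_predictor` are invented placeholders standing in for a trained neural network.

```python
from itertools import product
from typing import Callable, List, Optional, Tuple

# Hypothetical configuration: (issue_queue_entries, reorder_buffer_entries).
Config = Tuple[int, int]
# Hypothetical prediction: (soft_error_rate, energy_efficiency), arbitrary units.
Prediction = Tuple[float, float]

# Assumed design space for illustration only; materialized as a list so it
# can be scanned once per program phase.
CONFIGS: List[Config] = list(product([16, 32, 64], [32, 64, 128]))


def toy_predictor(phase_ilp: float, config: Config) -> Prediction:
    """Stand-in for a trained neural network; both formulas are invented."""
    iq, rob = config
    size = iq + rob
    # Toy assumption: larger structures hold more vulnerable state.
    ser = size * (1.0 - 0.5 * phase_ilp)
    # Toy assumption: the benefit of size saturates at a phase-dependent cap,
    # while the energy cost keeps growing with size.
    efficiency = min(size, 64 * phase_ilp + 32) / (1.0 + 0.01 * size)
    return ser, efficiency


def choose_config(
    phase_ilp: float,
    configs: List[Config],
    predictor: Callable[[float, Config], Prediction],
    min_efficiency: float,
) -> Optional[Config]:
    """Return the config with the lowest predicted SER that meets the floor."""
    best: Optional[Config] = None
    best_ser = float("inf")
    for cfg in configs:
        ser, eff = predictor(phase_ilp, cfg)
        # Strict '<' keeps the first config seen when predictions tie.
        if eff >= min_efficiency and ser < best_ser:
            best, best_ser = cfg, ser
    return best  # None if no configuration meets the efficiency floor
```

With these toy formulas, a low-parallelism phase such as `choose_config(0.25, CONFIGS, toy_predictor, 30.0)` settles on the smallest configuration, while a phase with more parallelism and a higher floor is pushed toward a larger one. In a real system the predictor would be queried once per detected phase, which is why a cheap learned model is attractive compared with simulating each candidate.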



Author information

Corresponding author

Correspondence to Yi-Xiao Yin.

Additional information

This work is supported by the National Science and Technology Major Project under Grant Nos. 2009ZX01028-002-003 and 2009ZX01029-001-003, the National Natural Science Foundation of China under Grant Nos. 61221062, 61100163, 61133004, 61232009, 61222204, and 61303158, the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No. XDA06010403, and the Ten Thousand Talent Program of China.

Electronic supplementary material

ESM 1 (DOC, 29 KB)

About this article

Cite this article

Yin, YX., Chen, YJ., Guo, Q. et al. Prevention from Soft Errors via Architecture Elasticity. J. Comput. Sci. Technol. 29, 247–254 (2014). https://doi.org/10.1007/s11390-014-1427-8
