DOI: 10.1145/3316781.3323487
research-article

Software Approaches for In-time Resilience

Published: 02 June 2019

ABSTRACT

Advances in semiconductor technology have enabled unprecedented growth in safety-critical applications. However, due to unabated scaling, the unreliability of the underlying hardware is only getting worse. For many applications, merely recovering from errors is not enough: the latency between the occurrence of a fault and its detection and recovery, i.e., in-time error resilience, is of vital importance. This is especially true for real-time applications, where the timing of application events is a crucial part of the correctness of the application. Software techniques for resilience are highly desirable because they can be flexibly applied, but achieving reliable, in-time software resilience remains an elusive goal. A new class of recent techniques has started to tackle this problem. This paper presents a succinct overview of existing software resilience techniques from the point of view of in-time resilience, and points out future challenges.


      Published in

      DAC '19: Proceedings of the 56th Annual Design Automation Conference 2019
      June 2019
      1378 pages
      ISBN: 9781450367257
      DOI: 10.1145/3316781

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 1,770 of 5,499 submissions, 32%

