Abstract
There is growing concern about the increasing rate of defects in computing substrates. Traditional redundancy solutions are too expensive for commodity microprocessor systems. Modern microprocessors feature multiple execution units to exploit instruction-level parallelism, but most workloads do not exhibit the level of instruction-level parallelism that a typical microprocessor is provisioned for. This offers an opportunity to reexecute instructions on idle execution units. Relying solely on idle resources, however, does not provide full instruction coverage, so other approaches must be explored. To that end, we propose and evaluate two instruction replay schemes within the same core for online testing of the execution units. One scheme (RER) reexecutes only the retired instructions, while the other (REI) reexecutes all the issued instructions. The complete proposed solution requires a comparator and minor modifications to the control logic, resulting in negligible hardware overhead. Both soft and hard error detection are considered, and the performance and energy impact of both schemes is evaluated and compared against previously proposed redundant execution schemes. Results show that although the proposed schemes incur a small performance penalty compared to previous work, the energy overhead is significantly reduced.
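The difference between the two replay schemes can be illustrated with a toy model. The sketch below is not the paper's implementation: the names `Instr`, `make_unit`, and `replay` are hypothetical, the fault is modeled as a single result bit stuck at 1, and replaying on a second, fault-free unit is an assumption made so that a hard fault can surface in the comparison.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

@dataclass
class Instr:
    op: str
    a: int
    b: int
    retired: bool  # False models an instruction that was issued but later squashed

# Tiny hypothetical ALU operation table (not taken from the paper).
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "and": lambda a, b: a & b,
}

def make_unit(stuck_bit: Optional[int] = None) -> Callable[[Instr], int]:
    """Build an execution unit; if stuck_bit is given, that result bit is stuck at 1."""
    def unit(instr: Instr) -> int:
        result = OPS[instr.op](instr.a, instr.b) & 0xFFFFFFFF
        if stuck_bit is not None:
            result |= 1 << stuck_bit
        return result
    return unit

def replay(stream: List[Instr],
           primary: Callable[[Instr], int],
           checker: Callable[[Instr], int],
           scheme: str) -> Tuple[int, List[int]]:
    """Return (number of replays, indices whose replayed result mismatched).

    scheme "REI" replays every issued instruction;
    scheme "RER" replays only instructions that retire.
    """
    replays = 0
    mismatches = []
    for i, instr in enumerate(stream):
        first = primary(instr)
        if scheme == "RER" and not instr.retired:
            continue  # squashed instructions are never replayed under RER
        replays += 1
        if checker(instr) != first:  # the comparator
            mismatches.append(i)
    return replays, mismatches

stream = [
    Instr("add", 1, 2, retired=True),   # result 3: bit 0 already set, fault is masked
    Instr("sub", 5, 3, retired=False),  # result 2: fault visible, but instruction is squashed
    Instr("and", 6, 3, retired=True),   # result 2: fault visible
]
faulty = make_unit(stuck_bit=0)
good = make_unit()

print(replay(stream, faulty, good, "REI"))  # (3, [1, 2])
print(replay(stream, faulty, good, "RER"))  # (2, [2])
```

In this toy run REI performs more replays and exercises the faulty unit on the squashed instruction as well, while RER performs fewer replays and checks only the results that reach architectural state. The sketch says nothing about the relative performance or energy cost of the two schemes.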
Index Terms
- A low-power instruction replay mechanism for design of resilient microprocessors