Error Detection Enhancement in COTS Superscalar Processors with Performance Monitoring Features

Journal of Electronic Testing

Abstract

The increasing use of commercial off-the-shelf (COTS) superscalar processors in industrial, embedded, and real-time systems calls for error detection mechanisms suited to such systems. This paper presents an error detection scheme called Committed Instructions Counting (CIC) to improve error detection in these systems. The scheme uses the processor's internal Performance Monitoring features together with an external watchdog processor (WDP); the Performance Monitoring features make it possible to count the number of committed instructions in a program. The scheme is experimentally evaluated on a 32-bit Pentium® processor using software-implemented fault injection (SWIFI). A total of 8181 errors were injected into the Pentium® processor. The results show that the error detection coverage varies between 90.92% and 98.41% across different workloads. To verify the experimental results, an analytical evaluation of the coverage is also performed.
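The abstract does not spell out how CIC works internally, so the following is only a minimal sketch of the general idea it describes: a watchdog compares an observed committed-instruction count against a count expected for a fault-free run. The names `Checkpoint`, `Watchdog`, and `detection_coverage`, the per-checkpoint structure, and all numbers are hypothetical illustrations, not the authors' implementation, and nothing here reads a real hardware counter.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    """A hypothetical program point where the instruction counter is read."""
    name: str
    expected_committed: int  # count assumed to be known from a fault-free run


class Watchdog:
    """Toy stand-in for the external watchdog processor (WDP)."""

    def __init__(self, checkpoints):
        # Map each checkpoint name to its expected committed-instruction count.
        self._expected = {cp.name: cp.expected_committed for cp in checkpoints}

    def check(self, name, observed_committed):
        """Return True when the observed count matches the expected one.

        An unknown checkpoint name is treated as a mismatch.
        """
        return self._expected.get(name) == observed_committed


def detection_coverage(detected, injected):
    """Coverage as the fraction of injected errors that were detected."""
    if injected <= 0:
        raise ValueError("injected must be positive")
    return detected / injected


if __name__ == "__main__":
    wdp = Watchdog([Checkpoint("block_a", 12)])  # illustrative values only
    print(wdp.check("block_a", 12))  # counts agree: no error flagged
    print(wdp.check("block_a", 9))   # counts differ: error flagged
```

In this sketch a mismatch between the two counts is what signals an error; how the real scheme delivers counts to the WDP and where it places its checks is described in the full paper, not here.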

Cite this article

Rajabzadeh, A., Miremadi, S.G. & Mohandespour, M. Error Detection Enhancement in COTS Superscalar Processors with Performance Monitoring Features. Journal of Electronic Testing 20, 553–567 (2004). https://doi.org/10.1023/B:JETT.0000042519.31454.1b

  • DOI: https://doi.org/10.1023/B:JETT.0000042519.31454.1b