Self-testing fault-tolerant real-time systems

Rooholamini, M.; Hosseini, S. H.

doi:10.1007/3-540-64359-1_738

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1388)

Included in the following conference series:

International Parallel Processing Symposium

Abstract

We propose a periodic diagnostic algorithm for real-time systems, based on the testing model of computation. The diagnostic task runs on every processor of the system. When the task starts, all processors are synchronized and perform the same operation at every step of the algorithm. Each processor tests itself and generates a token containing the test result. The token is then passed to some neighboring processors, which check it to determine whether a failure has occurred. In our model, a faulty processor does not necessarily stop functioning; it may behave erratically when checking the token of a processor it is assigned to test. We give the conditions under which all processor failures are detected in a torus interconnection network in which each processor is tested by a minimum number of processors.

Author information

Authors and Affiliations

Electrical Engineering and Computer Science Department, University of Wisconsin - Milwaukee, 53201, Milwaukee, WI, USA
M. Rooholamini & S. H. Hosseini

Editor information

José Rolim

About this paper

Cite this paper

Rooholamini, M., Hosseini, S.H. (1998). Self-testing fault-tolerant real-time systems. In: Rolim, J. (eds) Parallel and Distributed Processing. IPPS 1998. Lecture Notes in Computer Science, vol 1388. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64359-1_738

DOI: https://doi.org/10.1007/3-540-64359-1_738
Published: 08 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64359-3
Online ISBN: 978-3-540-69756-5
eBook Packages: Springer Book Archive
