Abstract
Today’s many-core processors are manufactured in inherently unreliable technologies. The high defect rates of these technologies are a direct consequence of feature-size shrinkage in today’s CMOS (complementary metal-oxide-semiconductor) technology. Because of these reliability problems, fault tolerance has become one of the major challenges for many-core processors. Various fault-tolerance techniques can be applied to reduce the probability of processor failure. The most promising are those that are easy to implement and inexpensive while still providing a high level of processor fault tolerance. One such technique is mutual testing among processor cores, which detects faulty cores and is therefore the first step toward making a many-core processor fault tolerant. Mutual testing can be performed either randomly or according to a deterministic scheduling policy. This paper deals with the random execution of mutual tests. The effectiveness of such testing can be evaluated by modeling it. We show how Stochastic Petri Nets can be used for this purpose and present results that can be useful for developing and implementing testing procedures in many-core processors.
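To make the idea of randomly performed mutual tests concrete, the following is a minimal Monte Carlo sketch in Python. It is not the Stochastic Petri Net model developed in the paper. It rests on several illustrative assumptions: each test picks a (tester, tested) pair of distinct cores uniformly at random, the delay between tests is exponentially distributed (loosely mirroring a timed transition in a Stochastic Petri Net), and testing is considered complete once every core has been tested at least once by a fault-free core. The function name `simulate_random_mutual_testing` and the parameters `n_cores`, `faulty`, and `rate` are hypothetical.

```python
import random


def simulate_random_mutual_testing(n_cores, faulty, rate=1.0, seed=None):
    """Simulate randomly scheduled mutual tests among cores.

    Returns (elapsed_time, n_tests) at the moment every core has been
    tested at least once by a fault-free core. This stopping criterion
    is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    good = [c for c in range(n_cores) if c not in faulty]
    if len(good) < 2:
        # With fewer than two fault-free cores, some core can never be
        # tested by a fault-free tester and the loop would not terminate.
        raise ValueError("need at least two fault-free cores")

    checked = set()  # cores already tested by a fault-free tester
    elapsed, n_tests = 0.0, 0
    while len(checked) < n_cores:
        # Assumed exponential delay between consecutive tests.
        elapsed += rng.expovariate(rate)
        # Assumed uniform random choice of two distinct cores.
        tester, tested = rng.sample(range(n_cores), 2)
        n_tests += 1
        if tester not in faulty:
            checked.add(tested)
    return elapsed, n_tests


if __name__ == "__main__":
    runs = [simulate_random_mutual_testing(16, {3, 7}, seed=s) for s in range(200)]
    mean_tests = sum(n for _, n in runs) / len(runs)
    print(f"mean number of tests over {len(runs)} runs: {mean_tests:.1f}")
```

Averaging over many runs gives a rough estimate of how many random tests are needed before all cores have been covered. An analytical model such as a Stochastic Petri Net can provide comparable measures without simulation.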
Acknowledgments
The authors would like to thank the SHARPE developer, Prof. Kishor Trivedi, for his kind help and recommendations, which facilitated the preparation of this paper.
Additional information
Responsible Editor: D. Gizopoulos
Cite this article
Mashkov, V., Barilla, J. & Simr, P. Applying Petri Nets to Modeling of Many-Core Processor Self-Testing when Tests are Performed Randomly. J Electron Test 29, 25–34 (2013). https://doi.org/10.1007/s10836-012-5346-8