
Software rejuvenation in cluster computing systems with dependency between nodes

Computing

Abstract

Software rejuvenation is a preventive, proactive fault management technique that counteracts software aging by cleaning up the system's internal state to prevent future failures. The growing interest in combining software rejuvenation with cluster systems has given rise to prolific research activity in recent years. So far, however, there have been few reports on the dependency between nodes in cluster systems when software rejuvenation is applied. This paper investigates the software rejuvenation policy for cluster computing systems with dependency between nodes, and reconstructs a stochastic reward net model of software rejuvenation in such cluster systems. Simulation results reveal that the software rejuvenation strategy can decrease the failure rate and increase the availability of the cluster system. They also show that the dependency between nodes affects the software rejuvenation policy. Based on the theoretical analysis of the software rejuvenation model, a prototype is implemented on the Smart Platform cluster computing system. Performance measurements carried out on this prototype show that software rejuvenation can effectively prevent systems from entering disabled states, thereby improving the software fault tolerance and availability of cluster computing systems.
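The availability claim can be illustrated with a much simpler model than the one the paper describes. The sketch below is not the paper's stochastic reward net and ignores dependency between nodes; it assumes a single node modelled as a four-state continuous-time Markov chain (robust, failure-probable, failed, rejuvenating), a common textbook simplification. All function names, parameter names, and rate values are hypothetical and chosen only for illustration.

```python
def steady_state(aging_rate, failure_rate, repair_rate,
                 rejuv_trigger_rate, rejuv_rate):
    """Steady-state probabilities of an assumed four-state CTMC.

    Transitions (all rates per hour, all hypothetical):
      robust -> failure_probable   at aging_rate
      failure_probable -> failed   at failure_rate
      failure_probable -> rejuvenating at rejuv_trigger_rate
      failed -> robust             at repair_rate
      rejuvenating -> robust       at rejuv_rate
    """
    # Solve the balance equations relative to the robust state.
    robust = 1.0
    failure_probable = aging_rate / (failure_rate + rejuv_trigger_rate)
    failed = failure_probable * failure_rate / repair_rate
    rejuvenating = failure_probable * rejuv_trigger_rate / rejuv_rate

    # Normalise so the four probabilities sum to one.
    total = robust + failure_probable + failed + rejuvenating
    return {
        "robust": robust / total,
        "failure_probable": failure_probable / total,
        "failed": failed / total,
        "rejuvenating": rejuvenating / total,
    }


def availability(**rates):
    """The node counts as available in the robust and failure-probable states."""
    pi = steady_state(**rates)
    return pi["robust"] + pi["failure_probable"]


# Illustrative rates only: ageing after ~10 days, failure ~2 days later,
# 2 h repair, 10 min rejuvenation.
base = dict(aging_rate=1 / 240, failure_rate=1 / 48,
            repair_rate=0.5, rejuv_rate=6.0)

without_rejuvenation = availability(rejuv_trigger_rate=0.0, **base)
with_rejuvenation = availability(rejuv_trigger_rate=1.0, **base)

print(f"availability without rejuvenation: {without_rejuvenation:.6f}")
print(f"availability with rejuvenation:    {with_rejuvenation:.6f}")
```

In this simplified model, rejuvenation raises availability only when it is sufficiently faster than repair, specifically when rejuv_rate / repair_rate exceeds 1 + aging_rate / failure_rate; with other rate choices it can lower availability instead.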



Author information

Corresponding author

Correspondence to Menghui Yang.

Additional information

This work was supported in part by the National Natural Science Foundation of China under grants No. 60872044 and 71133006, the Fundamental Research Funds for the Central Universities, and the Research Funds of Renmin University of China.


About this article

Cite this article

Yang, M., Min, G., Yang, W. et al. Software rejuvenation in cluster computing systems with dependency between nodes. Computing 96, 503–526 (2014). https://doi.org/10.1007/s00607-014-0385-x


  • DOI: https://doi.org/10.1007/s00607-014-0385-x
