Maximize System Reliability for Long Lasting and Continuous Applications

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 353)

Abstract

In this paper, we use software rejuvenation as a preventive, proactive fault-tolerance technique to maximize the reliability of continuous, safety-critical systems. We consider both transient faults caused by software aging and network transmission faults, and we mathematically analyze the optimal software rejuvenation period that maximizes the system’s reliability. The theoretical result is verified through empirical studies.

Author information

Corresponding author

Correspondence to Chunhui Guo.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Guo, C., Wu, H., Hua, X., Ren, S., Nogiec, J.M. (2015). Maximize System Reliability for Long Lasting and Continuous Applications. In: Rocha, A., Correia, A., Costanzo, S., Reis, L. (eds) New Contributions in Information Systems and Technologies. Advances in Intelligent Systems and Computing, vol 353. Springer, Cham. https://doi.org/10.1007/978-3-319-16486-1_59

  • DOI: https://doi.org/10.1007/978-3-319-16486-1_59

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16485-4

  • Online ISBN: 978-3-319-16486-1

  • eBook Packages: Computer Science, Computer Science (R0)
