
Model Based Approach for Autonomic Availability Management

  • Conference paper
Service Availability (ISAS 2006)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4328))


Abstract

As increasingly complex computer systems have come to play a controlling role in all aspects of modern life, the availability of technical systems and the downtime associated with them have acquired critical importance. Losses due to system downtime have risen manifold and become wide-ranging. Even though the component-level availability of hardware and software has increased considerably, system-wide availability still needs improvement, because the heterogeneity of components and the complexity of their interconnections have increased considerably too. As systems become more interconnected and diverse, architects are less able to anticipate and design for every interaction among components, leaving such issues to be dealt with at runtime. In this paper, we therefore propose an approach for autonomic management of system availability that provides real-time evaluation, monitoring and management of the availability of systems in critical applications. The approach is hybrid: analytic models provide the behavioral abstraction of components and subsystems, their interconnections and their dependencies, while statistical inference is applied to data from real-time monitoring of those components and subsystems to parameterize the system availability model. The model is solved online (that is, in real time), so that at any instant both the point estimate and the interval estimate of overall system availability are obtained by propagating the point and interval estimates of each input parameter through the system model. The online monitoring and estimation of system availability can then lead to adaptive online control of system availability.
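The hybrid idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say which analytic model or inference procedure the paper uses. The sketch assumes a series/parallel block model as the analytic part, the steady-state formula MTTF / (MTTF + MTTR) for each component, and a nonparametric bootstrap as the way to propagate parameter uncertainty to a system-level interval. The component names, the system layout and the monitoring data are all hypothetical.

```python
import random
from statistics import mean


def availability(ttf, ttr):
    """Steady-state availability point estimate: MTTF / (MTTF + MTTR)."""
    mttf, mttr = mean(ttf), mean(ttr)
    return mttf / (mttf + mttr)


def series(avails):
    """Series block: every component must be up."""
    result = 1.0
    for a in avails:
        result *= a
    return result


def parallel(avails):
    """Parallel block: at least one component must be up."""
    all_down = 1.0
    for a in avails:
        all_down *= 1.0 - a
    return 1.0 - all_down


def system_model(a):
    """Hypothetical system: two redundant servers in series with one switch."""
    return series([parallel([a["server1"], a["server2"]]), a["switch"]])


def resample(xs, rng):
    """Draw a bootstrap sample (same size, with replacement)."""
    return [rng.choice(xs) for _ in xs]


def estimate(observations, model, n_boot=2000, level=0.90, seed=0):
    """Return (point estimate, (lower, upper)) of system availability.

    The interval is a bootstrap percentile interval (an assumption of this
    sketch): each replicate resamples every component's observed times to
    failure and times to repair, then re-solves the system model.
    """
    rng = random.Random(seed)
    point = model({name: availability(ttf, ttr)
                   for name, (ttf, ttr) in observations.items()})
    draws = sorted(
        model({name: availability(resample(ttf, rng), resample(ttr, rng))
               for name, (ttf, ttr) in observations.items()})
        for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return point, (draws[lo_idx], draws[hi_idx])


# Made-up monitoring data in hours: (times to failure, times to repair).
observations = {
    "server1": ([410.0, 530.0, 620.0, 480.0], [2.0, 3.5, 1.5, 2.5]),
    "server2": ([390.0, 450.0, 700.0, 510.0], [4.0, 2.0, 3.0, 2.5]),
    "switch": ([2100.0, 1800.0, 2600.0], [1.0, 0.5, 1.5]),
}

point, (low, high) = estimate(observations, system_model)
print(f"system availability: {point:.6f} (90% interval {low:.6f} to {high:.6f})")
```

In an online setting, `estimate` would be re-run whenever the monitoring layer records a new failure or repair, so that the point and interval estimates track the current data. A controller could then act when, for example, the lower bound falls below a target.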




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mishra, K., Trivedi, K.S. (2006). Model Based Approach for Autonomic Availability Management. In: Penkler, D., Reitenspiess, M., Tam, F. (eds) Service Availability. ISAS 2006. Lecture Notes in Computer Science, vol 4328. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11955498_1


  • DOI: https://doi.org/10.1007/11955498_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68724-5

  • Online ISBN: 978-3-540-68725-2

  • eBook Packages: Computer Science, Computer Science (R0)
