Abstract
Software aging accumulation leads to increased resource consumption, and memory leaks are one of the best-known problems related to software aging. A bursty workload can accelerate the activation of software aging bugs because it requires instantaneous resource allocation, and the resulting rapid allocation and deallocation of resources may lead to software aging through memory leaks. Moreover, a bursty workload may cause a resource exhaustion failure in a system already overloaded by software aging accumulation. Virtual Machine (VM) migration schedules can be used to mitigate software aging by moving services away from a compromised physical host. Despite the considerable progress made in this area, the state of the art still lacks a modeling framework for performability and dependability evaluation of VM migration as rejuvenation in a system under bursty workloads. This paper proposes a set of Stochastic Reward Nets (SRNs) to fill this research gap. We consider five scenarios covering different bursty workload conditions and present a specific model to cover the uncertainties related to bursty workloads. Our results identify, for each scenario, the rejuvenation schedule that maximizes system performability and dependability. The proposed modeling framework may be useful to support virtualized environment management decisions.













Notes
Bursty workload occurrence usually causes a system utilization peak.
Arcs terminating in a circle instead of an arrowhead.
Receiving and returning.
We consider a year with 365 days.
We consider a month with 30 days: \(30 \cdot 24 \cdot 60 = 43{,}200\) minutes.
Note that, in this case, the performance degrades due to software aging accumulation, so the computed metrics relate to system performability [37].
References
Akoush, S., Sohan, R., Rice, A., Moore, A.W., Hopper, A.: Predicting the performance of virtual machine migration. In: 2010 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pp. 37–46. IEEE (2010)
Araujo, J., Matos, R., Maciel, P., Matias, R., Beicker, I.: Experimental evaluation of software aging effects on the eucalyptus cloud computing infrastructure. In: Proceedings of the Middleware 2011 Industry Track Workshop, p. 4. ACM (2011)
Avizienis, A., Laprie, J.C., Randell, B., Landwehr, C.: Basic concepts and taxonomy of dependable and secure computing. IEEE Trans. Dependable Secure Comput. 1(1), 11–33 (2004)
Avritzer, A., Weyuker, E.J.: Monitoring smoothly degrading systems for increased dependability. Empir. Softw. Eng. 2(1), 59–77 (1997)
Bause, F.: Queueing petri nets-a formalism for the combined qualitative and quantitative analysis of systems. In: Proceedings of 5th International Workshop on Petri Nets and Performance Models, pp. 14–23. IEEE (1993)
Bobbio, A.: System modelling with petri nets. In: Systems Reliability Assessment, pp. 103–143. Springer (1990)
Clark, C., Fraser, K., Hand, S., Hansen, J.G., Jul, E., Limpach, C., Pratt, I., Warfield, A.: Live migration of virtual machines. In: Proceedings of the 2nd Conference on Symposium on Networked Systems Design & Implementation-Volume 2, pp. 273–286. USENIX Association (2005)
Cotroneo, D., Natella, R., Pietrantuono, R., Russo, S.: A survey of software aging and rejuvenation studies. ACM J. Emerg. Technol. Comput. Syst. 10(1), 8 (2014)
Dohi, T., Zheng, J., Okamura, H., Trivedi, K.S.: Optimal periodic software rejuvenation policies based on interval reliability criteria. Reliab. Eng. Syst. Saf. 180, 463–475 (2018)
Escheikh, M., Tayachi, Z., Barkaoui, K.: Performability evaluation of server virtualized systems under bursty workload. IFAC-PapersOnLine 51(7), 45–50 (2018)
Feuerlicht, G., Burkon, L., Sebesta, M.: Cloud computing adoption: what are the issues. Syst. Integr. 18(2), 187–192 (2011)
Garg, S., Van Moorsel, A., Vaidyanathan, K., Trivedi, K.S.: A methodology for detection and estimation of software aging. In: Proceedings Ninth International Symposium on Software Reliability Engineering (Cat. No. 98TB100257), pp. 283–292. IEEE (1998)
Grottke, M., Matias, R., Trivedi, K.S.: The fundamentals of software aging. In: 2008 IEEE International Conference on Software Reliability Engineering Workshops (ISSRE Wksp), pp. 1–6. IEEE (2008)
Gupta, A.K., Zeng, W.B., Wu, Y.: Probability and Statistical Models: Foundations for Problems in Reliability and Financial Mathematics. Springer, New York (2010)
Huang, Y., Kintala, C., Kolettis, N., Fulton, N.D.: Software rejuvenation: analysis, module and applications. In: Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers, pp. 381–390. IEEE (1995)
Jain, R.: The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling. Wiley, New York (1990)
Kleinrock, L.: Queueing Systems, vol. i: Theory (1975)
Kounev, S.: Performance modeling and evaluation of distributed component-based systems using queueing petri nets. IEEE Trans. Softw. Eng. 32(7), 486–502 (2006)
Kuchárik, M., Balogh, Z.: Modeling of uncertainty with petri nets. In: Asian Conference on Intelligent Information and Database Systems, pp. 499–509. Springer (2019)
Liu, H., Xu, C.Z., Jin, H., Gong, J., Liao, X.: Performance and energy modeling for live migration of virtual machines. In: Proceedings of the 20th International Symposium on High Performance Distributed Computing, pp. 171–182. ACM (2011)
Low, C., Chen, Y., Wu, M.: Understanding the determinants of cloud computing adoption. Ind. Manag. Data Syst. 111(7), 1006–1023 (2011)
Macêdo, A., Ferreira, T.B., Matias, R.: The mechanics of memory-related software aging. In: 2010 IEEE Second International Workshop on Software Aging and Rejuvenation, pp. 1–5. IEEE (2010)
Machida, F., Kim, D.S., Trivedi, K.S.: Modeling and analysis of software rejuvenation in a server virtualized system. In: 2010 IEEE Second International Workshop on Software Aging and Rejuvenation, pp. 1–6. IEEE (2010)
Machida, F., Kim, D.S., Trivedi, K.S.: Modeling and analysis of software rejuvenation in a server virtualized system with live vm migration. Perform. Eval. 70(3), 212–230 (2013)
Machida, F., Miyoshi, N.: An optimal stopping problem for software rejuvenation in a job processing system. In: 2015 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), pp. 139–143. IEEE (2015)
Machida, F., Miyoshi, N.: Analysis of an optimal stopping problem for software rejuvenation in a deteriorating job processing system. Reliab. Eng. Syst. Saf. 168, 128–135 (2017)
Machida, F., Nicola, V.F., Trivedi, K.S.: Job completion time on a virtualized server subject to software aging and rejuvenation. In: 2011 IEEE Third International Workshop on Software Aging and Rejuvenation, pp. 44–49. IEEE (2011)
Machida, F., Nicola, V.F., Trivedi, K.S.: Job completion time on a virtualized server with software rejuvenation. ACM J. Emerg. Technol. Comput. Syst. 10(1), 10 (2014)
Machida, F., Xiang, J., Tadano, K., Maeno, Y.: Aging-related bugs in cloud computing software. In: 2012 IEEE 23rd International Symposium on Software Reliability Engineering Workshops, pp. 287–292. IEEE (2012)
Machida, F., Xiang, J., Tadano, K., Maeno, Y.: Lifetime extension of software execution subject to aging. IEEE Trans. Reliab. 66(1), 123–134 (2016)
Maciel, P., Matos, R., Silva, B., Figueiredo, J., Oliveira, D., Fé, I., Maciel, R., Dantas, J.: Mercury: performance and dependability evaluation of systems with exponential, expolynomial, and general distributions. In: 2017 IEEE 22nd Pacific Rim International Symposium on Dependable Computing (PRDC), pp. 50–57. IEEE (2017)
Matos, R., Araujo, J., Alves, V., Maciel, P.: Characterization of software aging effects in elastic storage mechanisms for private clouds. In: 2012 IEEE 23rd International Symposium on Software Reliability Engineering Workshops, pp. 293–298. IEEE (2012)
Maziku, H., Shetty, S.: Towards a network aware vm migration: Evaluating the cost of vm migration in cloud data centers. In: 2014 IEEE 3rd International Conference on Cloud Networking (CloudNet), pp. 114–119. IEEE (2014)
Mell, P., Grance, T., et al.: The NIST definition of cloud computing (2011)
Melo, M., Araujo, J., Matos, R., Menezes, J., Maciel, P.: Comparative analysis of migration-based rejuvenation schedules on cloud availability. In: 2013 IEEE International Conference on Systems, Man, and Cybernetics, pp. 4110–4115. IEEE (2013)
Melo, M., Maciel, P., Araujo, J., Matos, R., Araujo, C.: Availability study on cloud computing environments: live migration as a rejuvenation mechanism. In: 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 1–6. IEEE (2013)
Meyer, J.F.: Performability: a retrospective and some pointers to the future. Perform. Eval. 14(3–4), 139–156 (1992)
Mijumbi, R., Serrat, J., Gorricho, J.L., Bouten, N., De Turck, F., Boutaba, R.: Network function virtualization: state-of-the-art and research challenges. IEEE Commun. Surv. Tutor. 18(1), 236–262 (2015)
Murata, T.: Petri nets: properties, analysis and applications. Proc. IEEE 77(4), 541–580 (1989). https://doi.org/10.1109/5.24143
Myint, M.T.H., Thein, T.: Availability improvement in virtualized multiple servers with software rejuvenation and virtualization. In: 2010 Fourth International Conference on Secure Software Integration and Reliability Improvement, pp. 156–162. IEEE (2010)
Nguyen, T.A., Min, D., Choi, E., Tran, T.D.: Reliability and availability evaluation for cloud data center networks using hierarchical models. IEEE Access 7, 9273–9313 (2019)
Oliveira, T., Thomas, M., Espadanal, M.: Assessing the determinants of cloud computing adoption: an analysis of the manufacturing and services sectors. Inf. Manag. 51(5), 497–510 (2014)
Patterson, D.A., et al.: A simple way to estimate the cost of downtime. LISA 2, 185–188 (2002)
Pietrantuono, R., Russo, S.: A survey on software aging and rejuvenation in the cloud. Softw. Qual. J. 1–32 (2019)
Salfner, F., Tröger, P., Polze, A.: Downtime analysis of virtual machine live migration. In: The Fourth International Conference on Dependability (DEPEND 2011). IARIA, pp. 100–105 (2011)
Schroeder, B., Gibson, G.A.: Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you? FAST 7, 1–16 (2007)
Siddiqui, S., Darbari, M., Yagyasen, D., et al.: Modelling and simulation of queuing models through the concept of petri nets (2020)
Soltesz, S., Pötzl, H., Fiuczynski, M.E., Bavier, A., Peterson, L.: Container-based operating system virtualization: a scalable, high-performance alternative to hypervisors. ACM SIGOPS Oper. Syst. Rev. 41, 275–287 (2007)
Strunk, A.: Costs of virtual machine live migration: a survey. In: 2012 IEEE Eighth World Congress on Services, pp. 323–329. IEEE (2012)
Thein, T., Park, J.S.: Availability analysis of application servers using software rejuvenation and virtualization. J. Comput. Sci. Technol. 24(2), 339–346 (2009)
Torquato, M., Araujo, J., Umesh, I., Maciel, P.: Sware: a methodology for software aging and rejuvenation experiments. J. Inf. Syst. Eng. Manag. 3(2), 15 (2018)
Torquato, M., Maciel, P., Araujo, J., Umesh, I.: An approach to investigate aging symptoms and rejuvenation effectiveness on software systems. In: 2017 12th Iberian Conference on Information Systems and Technologies (CISTI), pp. 1–6. IEEE (2017)
Torquato, M., Maciel, P., Vieira, M.: A model for availability and security risk evaluation for systems with vmm rejuvenation enabled by vm migration scheduling. IEEE Access 7, 138315–138326 (2019)
Torquato, M., Maciel, P., Vieira, M.: Availability and reliability modeling of vm migration as rejuvenation on a system under varying workload. Softw. Qual. J. 1–25 (2020)
Torquato, M., Torquato, L., Maciel, P., Vieira, M.: Iaas cloud availability planning using models and genetic algorithms. In: 2019 9th Latin-American Symposium on Dependable Computing (LADC), pp. 1–10. IEEE (2019)
Torquato, M., Umesh, I., Maciel, P.: Models for availability and power consumption evaluation of a private cloud with vmm rejuvenation enabled by vm live migration. J. Supercomput. 74(9), 4817–4841 (2018)
Torquato, M., Vieira, M.: Interacting srn models for availability evaluation of vm migration as rejuvenation on a system under varying workload. In: 2018 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), pp. 300–307. IEEE (2018)
Torquato, M., Vieira, M.: An experimental study of software aging and rejuvenation in dockerd. In: 2019 15th European Dependable Computing Conference (EDCC), pp. 1–6. IEEE (2019)
Trivedi, K.S., Vaidyanathan, K., Goseva-Popstojanova, K.: Modeling and analysis of software aging and rejuvenation. In: Proceedings 33rd Annual Simulation Symposium (SS 2000), pp. 270–279. IEEE (2000)
Vaidyanathan, K., Trivedi, K.S.: A comprehensive model for software rejuvenation. IEEE Trans. Dependable Secure Comput. 2(2), 124–137 (2005)
Valmari, A.: The state explosion problem. In: Advanced Course on Petri Nets, pp. 429–528. Springer (1996)
Voorsluys, W., Broberg, J., Venugopal, S., Buyya, R.: Cost of virtual machine live migration in clouds: a performance evaluation. In: IEEE International Conference on Cloud Computing, pp. 254–265. Springer (2009)
Wang, D., Xie, W., Trivedi, K.S.: Performability analysis of clustered systems with rejuvenation under varying workload. Perform. Eval. 64(3), 247–265 (2007)
Yeboah-Boateng, E.O., Essandoh, K.A.: Factors influencing the adoption of cloud computing by small and medium enterprises in developing economies. Int. J. Emerg. Sci. Eng. 2(4), 13–20 (2014)
Zheng, J., Okamura, H., Dohi, T.: A transient interval reliability analysis for software rejuvenation models with phase expansion. Softw. Qual. J. 1–22 (2019)
Zimmermann, A.: Modelling and performance evaluation with timenet 4.4. In: International Conference on Quantitative Evaluation of Systems, pp. 300–303. Springer (2017)
Acknowledgements
This work has been partially supported by the Portuguese Foundation for Science and Technology (FCT), through the PhD Grant SFRH/BD/146181/2019, within the scope of the project CISUC - UID/CEC/00326/2020. This work is also funded by the European Social Fund, through the Regional Operational Program Centro 2020. This work also received support from the AIDA (Adaptive, Intelligent and Distributed Assurance Platform) project, funded by the Operational Program for Competitiveness and Internationalization (COMPETE 2020) and FCT (under the CMU Portugal Program) through Grant POCI-01-0247-FEDER-045907, and from the TalkConnect project, funded by COMPETE 2020 through Grant POCI-01-0247-FEDER-039676.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: Stochastic Reward Nets
Stochastic Reward Nets (SRNs) are a sub-type of Petri Nets (PNs). A PN is a 5-tuple, \(PN = (P, T, F, W, M_0)\), where \(P = \{p_1, p_2, \ldots , p_m\}\) is a finite set of places, \(T = \{t_1, t_2, \ldots , t_n\}\) is a finite set of transitions, \(F \subseteq (P \times T) \cup (T \times P)\) is a set of arcs, \(W: F \rightarrow \{0, 1, 2, 3, \ldots \}\) is a weight function, and \(M_0: P \rightarrow \{0, 1, 2, 3, \ldots \}\) is the initial marking [39].
The graphical representation of a PN has four main components, as presented in Fig. 14. The places hold the tokens, the arcs indicate the relation between places and transitions, and the PN state is altered upon a transition firing, which moves tokens from one place to another. In SRNs, it is possible to assign time delays to the transitions.
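The firing rule can be illustrated with a minimal sketch of an untimed place/transition net. It is not the tooling used in this paper: the class name `PetriNet`, its fields, and the two-place example net are hypothetical, arc weights are assumed to be positive, and the time delays of SRN transitions are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class PetriNet:
    """Untimed place/transition net mirroring the 5-tuple (P, T, F, W, M0)."""
    places: set
    transitions: set
    weights: dict  # maps an arc (source, target) to its weight W; the keys form F
    marking: dict  # maps each place to its token count, initially M0

    def enabled(self, t):
        # t is enabled when every input place p holds at least W(p, t) tokens.
        return all(self.marking[src] >= w
                   for (src, dst), w in self.weights.items() if dst == t)

    def fire(self, t):
        if not self.enabled(t):
            raise ValueError(f"transition {t!r} is not enabled")
        for (src, dst), w in self.weights.items():
            if dst == t:        # input arc: consume w tokens from place src
                self.marking[src] -= w
            elif src == t:      # output arc: produce w tokens in place dst
                self.marking[dst] += w


# Hypothetical example net: t1 consumes two tokens from p1 and produces one in p2.
net = PetriNet(
    places={"p1", "p2"},
    transitions={"t1"},
    weights={("p1", "t1"): 2, ("t1", "p2"): 1},
    marking={"p1": 3, "p2": 0},
)
net.fire("t1")
print(net.marking)
```

After the firing, p1 no longer holds enough tokens for t1, so a second call to `net.fire("t1")` raises `ValueError`.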
Let us consider the flow of a simple SRN availability model in Fig. 15. In the initial state, the system is running, which is represented by the token in the UP place. The MTTF transition represents the system mean time to failure (MTTF), and its firing represents the occurrence of a system failure: it moves the token from the UP place to the DW place. The system repair is represented by the MTTR transition (mean time to repair, MTTR). Firing the MTTR transition brings the model back to its initial state.
We can compute the system availability using the reward measure \(Availability = P\{UP > 0\}\), which captures the probability that the UP place holds at least one token.
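For the two-place UP/DW model, this measure has a closed form in the steady state, \(MTTF / (MTTF + MTTR)\). The sketch below derives it from the balance equation of the underlying two-state chain, assuming exponentially distributed firing delays. The function names and the parameter values (1000 h and 2 h) are hypothetical and chosen only for illustration; the downtime conversion assumes a year with 365 days.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a year with 365 days


def steady_state_availability(mttf_hours, mttr_hours):
    """Steady-state P{UP > 0} for the two-place UP/DW model."""
    failure_rate = 1.0 / mttf_hours  # rate of the MTTF transition
    repair_rate = 1.0 / mttr_hours   # rate of the MTTR transition
    # Balance equation: pi_up * failure_rate = pi_dw * repair_rate,
    # together with pi_up + pi_dw = 1, gives pi_up below.
    return repair_rate / (failure_rate + repair_rate)


def annual_downtime_hours(availability):
    """Expected hours per year spent with no token in the UP place."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Hypothetical parameter values, not taken from the paper.
availability = steady_state_availability(mttf_hours=1000.0, mttr_hours=2.0)
downtime = annual_downtime_hours(availability)
print(f"availability = {availability:.6f}")
print(f"annual downtime = {downtime:.2f} h")
```

Larger models, such as those with rejuvenation schedules, generally have no such closed form and are solved numerically by an SRN tool.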
Cite this article
Torquato, M., Maciel, P. & Vieira, M. Model-Based Performability and Dependability Evaluation of a System with VM Migration as Rejuvenation in the Presence of Bursty Workloads. J Netw Syst Manage 30, 3 (2022). https://doi.org/10.1007/s10922-021-09619-3
DOI: https://doi.org/10.1007/s10922-021-09619-3