Abstract
Cloud computing models such as Infrastructure as a Service (IaaS) let vendors use virtualization technology to rent out the computing resources of physical machines so that users can run their applications. IaaS is the most common business model of cloud computing; however, its availability remains a concern among users. Several factors affect the availability of a cloud computing center, such as service interruptions caused by damaged hardware components. In this study, we focus on the thermal emergency of CPU overheating caused by chassis fan damage and propose a method to resolve the crisis before a crash occurs. We designed a thermal-aware VM migration manager (TAVMM) that determines the health of a physical machine from its temperature and resource-use information. By leveraging VM migration, the TAVMM removes the risk to an overheating physical machine by transferring its load to a healthy machine, which reduces the CPU temperature. We propose heat transfer and migration time as the criteria for the VM selection policy, and a load-balancing algorithm that accounts for thermal tolerance as the VM allocation policy. The simulation results show that a TAVMM with the proposed VM selection and allocation policies can improve system availability and reduce the number of VM failures.
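The abstract names the two policies but gives no formulas, so the following Python sketch is only an illustration of how such a selection and allocation step might fit together. Every name and number in it is hypothetical: CPU load stands in for the heat a VM contributes, memory size divided by an assumed link bandwidth stands in for migration time, and "thermal tolerance" is modeled as the gap between a host's temperature and an assumed limit. None of these proxies comes from the paper.

```python
from dataclasses import dataclass, field

# Assumed network bandwidth used to estimate migration time (hypothetical value).
BANDWIDTH_MB_PER_S = 100.0


@dataclass
class VM:
    name: str
    cpu_load: float   # share of a host CPU in [0, 1]; proxy for heat contribution
    memory_mb: float  # memory footprint; proxy for migration cost


@dataclass
class Host:
    name: str
    temperature_c: float
    max_temperature_c: float  # assumed thermal limit of the host
    cpu_capacity: float = 1.0
    vms: list = field(default_factory=list)

    def cpu_used(self) -> float:
        return sum(vm.cpu_load for vm in self.vms)

    def thermal_headroom(self) -> float:
        return self.max_temperature_c - self.temperature_c


def migration_time(vm: VM) -> float:
    """Rough migration time in seconds: memory size over link bandwidth."""
    return vm.memory_mb / BANDWIDTH_MB_PER_S


def select_vm(host: Host) -> VM:
    """Pick the VM that removes the most heat per second of migration."""
    return max(host.vms, key=lambda vm: vm.cpu_load / migration_time(vm))


def allocate(vm: VM, candidates: list):
    """Pick the feasible host with the most thermal headroom, then the least load."""
    feasible = [
        h for h in candidates
        if h.thermal_headroom() > 0
        and h.cpu_used() + vm.cpu_load <= h.cpu_capacity
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda h: (h.thermal_headroom(), -h.cpu_used()))


def relieve(hot: Host, others: list):
    """Move one VM off an overheating host; return (vm name, target name) or None."""
    if not hot.vms:
        return None
    vm = select_vm(hot)
    target = allocate(vm, others)
    if target is None:
        return None
    hot.vms.remove(vm)
    target.vms.append(vm)
    return vm.name, target.name
```

In this sketch, calling `relieve` repeatedly until the hot host's temperature reading drops below its limit would approximate the migrate-before-crash behavior the abstract describes. A real manager would also need a thermal model to predict how much each migration cools the source and heats the destination.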
Cite this article
Chen, YJ., Horng, GJ., Li, JH. et al. Using Thermal-Aware VM Migration Mechanism for High-Availability Cloud Computing. Wireless Pers Commun 97, 1475–1502 (2017). https://doi.org/10.1007/s11277-017-4582-8
DOI: https://doi.org/10.1007/s11277-017-4582-8