Improving Cost-Efficiency through Failure-Aware Server Management and Scheduling in Cloud

Zhao, Laiping; Sakurai, Kouichi

doi:10.1007/978-3-319-04519-1_2

Part of the book series: Communications in Computer and Information Science (CCIS, volume 367)

Included in the following conference series:

International Conference on Cloud Computing and Services Science

Abstract

We examine the problem of managing a server farm cost-efficiently under an Infrastructure-as-a-Service (IaaS) cloud model, with the aim of reducing the cost caused by server failures. Failures in cloud systems are frequent enough to severely disrupt the normal processing of job requests and to incur high penalty costs. Net revenue can be increased by using failure predictors to reduce both the energy cost and the penalty. First, we incorporate the malfunction and recovery states into the server management process and improve the cost-efficiency of each server through failure-predictor-based proactive recovery. Second, we present a revenue-driven cloud scheduling algorithm that further increases net revenue in collaboration with the server management algorithm. Formal and experimental analyses demonstrate the expected improvement in net revenue.

This work is based on "On Revenue Driven Server Management in Cloud", by L. Zhao and K. Sakurai, which appeared in the Proceedings of the 2nd International Conference on Cloud Computing and Services Science, Portugal, April 2012.
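
The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the class `ServerPeriod`, the function `expected_net_revenue`, the `predictor_recall` parameter, and all numeric values are hypothetical, chosen only to show how a failure predictor might shift cost from penalties to cheaper proactive recovery. False alarms from the predictor are ignored here for simplicity.

```python
from dataclasses import dataclass


@dataclass
class ServerPeriod:
    """Hypothetical per-server figures for one accounting period."""
    income: float         # revenue from job requests served in the period
    energy_cost: float    # energy cost of keeping the server powered on
    failure_prob: float   # probability that the server fails in the period
    penalty: float        # penalty paid when an unpredicted failure hits a job
    recovery_cost: float  # assumed cost of one proactive recovery


def expected_net_revenue(p: ServerPeriod, predictor_recall: float = 0.0) -> float:
    """Expected net revenue = income - energy cost - expected failure cost.

    A fraction `predictor_recall` of failures is assumed to be predicted in
    time and handled by proactive recovery; the rest incur the full penalty.
    """
    caught = p.failure_prob * predictor_recall
    missed = p.failure_prob * (1.0 - predictor_recall)
    return p.income - p.energy_cost - missed * p.penalty - caught * p.recovery_cost


period = ServerPeriod(income=10.0, energy_cost=2.0, failure_prob=0.1,
                      penalty=20.0, recovery_cost=5.0)
baseline = expected_net_revenue(period)                        # no predictor
with_predictor = expected_net_revenue(period, predictor_recall=0.5)
```

Under these assumed numbers, the predictor helps only because a proactive recovery is cheaper than the penalty; if `recovery_cost` exceeded `penalty`, the same formula would show a loss.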

Author information

Authors and Affiliations

School of Computer Software, Tianjin University, China
Laiping Zhao
Department of Informatics, Kyushu University, Japan
Kouichi Sakurai

Editor information

Editors and Affiliations

Empire State College, Long Island Center, State University of New York, 11788, NY, U.S.A.
Ivan I. Ivanov
University of Twente, Enschede, The Netherlands
Marten van Sinderen
Institute of Architecture of Application Systems, University of Stuttgart, Universitätsstraße 38, 70569, Stuttgart, Germany
Frank Leymann
CTS, Charlotte, NC, USA
Tony Shan

Cite this paper

Zhao, L., Sakurai, K. (2013). Improving Cost-Efficiency through Failure-Aware Server Management and Scheduling in Cloud. In: Ivanov, I.I., van Sinderen, M., Leymann, F., Shan, T. (eds) Cloud Computing and Services Science. CLOSER 2012. Communications in Computer and Information Science, vol 367. Springer, Cham. https://doi.org/10.1007/978-3-319-04519-1_2

DOI: https://doi.org/10.1007/978-3-319-04519-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04518-4
Online ISBN: 978-3-319-04519-1
eBook Packages: Computer Science, Computer Science (R0)
