A fault-tolerant strategy for virtualized HPC clusters

Walters, John Paul; Chaudhary, Vipin

doi:10.1007/s11227-008-0259-0

Published: 13 December 2008

Volume 50, pages 209–239 (2009)

The Journal of Supercomputing

Abstract

Virtualization is a common strategy for improving the utilization of existing computing resources, particularly within data centers. However, its use for high-performance computing (HPC) applications is currently limited, despite its potential both to improve resource utilization and to provide resource guarantees to its users. In this article, we systematically evaluate three major virtual machine implementations for computationally intensive HPC applications using various standard benchmarks. Using VMWare Server, Xen, and OpenVZ, we examine the suitability of full virtualization (VMWare), paravirtualization (Xen), and operating system-level virtualization (OpenVZ) in terms of network utilization, SMP performance, file system performance, and MPI scalability. We show that the operating system-level virtualization provided by OpenVZ delivers the best overall performance, particularly for MPI scalability. With the knowledge gained from our VM evaluation, we extend OpenVZ to include support for checkpointing and fault tolerance for MPI-based virtual server distributed computing.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University at Buffalo, The State University of New York, Buffalo, USA
John Paul Walters & Vipin Chaudhary

Corresponding author

Correspondence to John Paul Walters.

About this article

Cite this article

Walters, J.P., Chaudhary, V. A fault-tolerant strategy for virtualized HPC clusters. J Supercomput 50, 209–239 (2009). https://doi.org/10.1007/s11227-008-0259-0

Received: 26 February 2008
Accepted: 26 November 2008
Published: 13 December 2008
Issue Date: December 2009
DOI: https://doi.org/10.1007/s11227-008-0259-0
