Fast VMM-based overlay networking for bridging the cloud and high performance computing

Cluster Computing

Abstract

A collection of virtual machines (VMs) interconnected by an overlay network that presents a layer 2 abstraction has proven to be a powerful, unifying model for adaptive distributed and parallel computing in loosely-coupled environments. It is now feasible to allow VMs hosting high performance computing (HPC) applications to seamlessly bridge distributed cloud resources and tightly-coupled supercomputing and cluster resources. However, to achieve the application performance that the tightly-coupled resources are capable of, the overlay network must not introduce significant overhead relative to the native hardware. Current user-level tools, including our own existing VNET/U system, do not meet this requirement. In response, we describe the design, implementation, and evaluation of a virtual networking system that has negligible latency and bandwidth overheads in 1–10 Gbps networks. Our system, VNET/P, is directly embedded into our publicly available Palacios virtual machine monitor (VMM). VNET/P achieves native performance on 1 Gbps Ethernet networks and very high performance on 10 Gbps Ethernet networks. The NAS benchmarks generally achieve over 95% of their native performance on both 1 and 10 Gbps networks. We have further demonstrated that VNET/P can operate successfully over more specialized tightly-coupled networks, such as InfiniBand and Cray Gemini. Our results suggest it is feasible to extend a software-based overlay network designed for computing at wide-area scales into tightly-coupled environments.
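As a rough illustration of what a layer 2 overlay abstraction involves, the sketch below wraps a complete Ethernet frame in a small header so that it could be carried as the payload of a UDP datagram between hosts, then unwraps it on the receiving side. It is a minimal sketch under stated assumptions: the header layout, the `OVL2` magic value, the `link_id` field, and all function names are hypothetical and are not taken from VNET/P or VNET/U.

```python
import struct
from typing import Tuple

# Hypothetical 8-byte overlay header: magic (4 bytes), link id (2), frame length (2).
# This layout is an assumption for illustration, not VNET/P's wire format.
OVERLAY_HEADER = struct.Struct("!4sHH")
MAGIC = b"OVL2"


def build_ethernet_frame(dst_mac: bytes, src_mac: bytes,
                         ethertype: int, payload: bytes) -> bytes:
    """Assemble a raw Ethernet II frame (without FCS) as a guest NIC might emit it."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    return dst_mac + src_mac + struct.pack("!H", ethertype) + payload


def encapsulate(frame: bytes, link_id: int) -> bytes:
    """Prefix the whole layer 2 frame with the overlay header."""
    return OVERLAY_HEADER.pack(MAGIC, link_id, len(frame)) + frame


def decapsulate(datagram: bytes) -> Tuple[int, bytes]:
    """Validate the overlay header and return (link_id, original frame)."""
    if len(datagram) < OVERLAY_HEADER.size:
        raise ValueError("datagram shorter than overlay header")
    magic, link_id, length = OVERLAY_HEADER.unpack_from(datagram)
    if magic != MAGIC:
        raise ValueError("not an overlay datagram")
    frame = datagram[OVERLAY_HEADER.size:]
    if len(frame) != length:
        raise ValueError("length field does not match payload")
    return link_id, frame


def destination_mac(frame: bytes) -> bytes:
    """The first 6 bytes of an Ethernet frame are the destination MAC."""
    return frame[:6]
```

Because the guest's frame is carried intact, the receiving side can make forwarding decisions on the destination MAC address exactly as a physical switch would. Every frame also pays for the extra header bytes and the wrap and unwrap steps, which is one reason the per-packet cost of an overlay matters on fast networks.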

Notes

  1. This may be expanded in the future. Currently, it has been sized to support the largest possible IPv4 packet size.

Acknowledgements

We would like to thank Kevin Pedretti and Kyle Hale for their efforts in bringing up Palacios and VNET/P under CNL on the Cray XK6. This project is made possible by support from the United States National Science Foundation (NSF) via grants CNS-0709168 and CNS-0707365, and by the Department of Energy (DOE) via grant DE-SC0005343. Yuan Tang’s visiting scholar position at Northwestern was supported by the China Scholarship Council.

Author information

Corresponding author

Correspondence to Lei Xia.

About this article

Cite this article

Xia, L., Cui, Z., Lange, J. et al. Fast VMM-based overlay networking for bridging the cloud and high performance computing. Cluster Comput 17, 39–59 (2014). https://doi.org/10.1007/s10586-013-0274-7

  • DOI: https://doi.org/10.1007/s10586-013-0274-7
