Large-scale virtual machines provisioning in clouds: challenges and approaches

  • Review Article
Frontiers of Computer Science

Abstract

The global data center market has grown explosively in recent years. As the market grows, so does the demand for fast provisioning of virtual resources to support elastic, manageable, and economical computing in the cloud. Fast provisioning of virtual machines (VMs) at large scale, in particular, is critical to guaranteeing quality of service (QoS). In this paper, we systematically review existing VM provisioning schemes and classify them into three main categories. We discuss the features and research status of each category and introduce two recent solutions, VMThunder and VMThunder+, both of which can provision hundreds of VMs in seconds.

Author information

Corresponding author

Correspondence to Dongsheng Li.

Additional information

Zhaoning Zhang received his MS in Computer Science at the National University of Defense Technology (NUDT), China in 2009. He was a visiting scholar in the Computer Science Department, University of Victoria, Canada and is currently a PhD student in the Computer School of NUDT. His research focuses on virtualization and block-level distributed storage.

Dongsheng Li received his PhD (with honors) from the National University of Defense Technology (NUDT), China in 2005. He is currently a professor in the Computer School of NUDT. His research interests include cloud computing, computer networks, and data management.

Kui Wu received his BS and MS in Computer Science from Wuhan University, China in 1990 and 1993, respectively, and his PhD in Computing Science from the University of Alberta, Canada in 2002. He joined the Department of Computer Science at the University of Victoria, Canada in 2002 and is currently a Professor there. His research interests include mobile and wireless networks, network performance evaluation, and cloud computing.

About this article

Cite this article

Zhang, Z., Li, D. & Wu, K. Large-scale virtual machines provisioning in clouds: challenges and approaches. Front. Comput. Sci. 10, 2–18 (2016). https://doi.org/10.1007/s11704-015-4420-7

  • DOI: https://doi.org/10.1007/s11704-015-4420-7
