research-article

Memory-harvesting VMs in cloud platforms

Authors: Alexander Fuerst, Stanko Novaković, Gohar Irfan Chaudhry, Prateek Sharma, Ricardo Bianchini

ASPLOS '22: Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems

Pages 583 - 594

https://doi.org/10.1145/3503222.3507725

Published: 22 February 2022

Abstract

Cloud platforms monetize their spare capacity by renting “Spot” virtual machines (VMs) that can be evicted in favor of higher-priority VMs. Recent work has shown that resource-harvesting VMs are more effective at exploiting spare capacity than Spot VMs, while also reducing the number of evictions. However, the prior work focused on harvesting CPU cores while keeping memory size fixed. This wastes a substantial monetization opportunity and may even limit the ability of harvesting VMs to leverage spare cores. Thus, in this paper, we explore memory harvesting and its challenges in real cloud platforms, namely its impact on VM creation time, NUMA spanning, and page fragmentation. We start by characterizing the amount and dynamics of the spare memory in Azure. We then design and implement memory-harvesting VMs (MHVMs), introducing new techniques for memory buffering, batching, and pre-reclamation. To demonstrate the use of MHVMs, we also extend a popular cluster scheduling framework (Hadoop) and a FaaS platform to adapt to them. Our main results show that (1) there is plenty of scope for memory harvesting in real platforms; (2) MHVMs are effective at mitigating the negative impacts of harvesting; and (3) our extensions of Hadoop and FaaS successfully hide the MHVMs’ varying memory size from the users’ data-processing jobs and functions. We conclude that memory harvesting has great potential for practical deployment and users can save up to 93% of their costs when running workloads on MHVMs.

Index Terms

Memory-harvesting VMs in cloud platforms
1. Computer systems organization
  1. Architectures
    1. Distributed architectures
      1. Cloud computing


Published In

ASPLOS '22: Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems

February 2022

1164 pages

ISBN:9781450392051

DOI:10.1145/3503222

General Chairs: Babak Falsafi (EPFL, Switzerland), Michael Ferdman (Stony Brook University, USA)

Program Chairs: Shan Lu (University of Chicago, USA), Tom Wenisch (University of Michigan, USA / Google, USA)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

SIGBED: ACM Special Interest Group on Embedded Systems

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems

February 28 - March 4, 2022

Lausanne, Switzerland

Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%
