DOI: 10.1145/3503222.3507725
research-article

Memory-harvesting VMs in cloud platforms

Published: 22 February 2022

Abstract

Cloud platforms monetize their spare capacity by renting “Spot” virtual machines (VMs) that can be evicted in favor of higher-priority VMs. Recent work has shown that resource-harvesting VMs are more effective at exploiting spare capacity than Spot VMs, while also reducing the number of evictions. However, the prior work focused on harvesting CPU cores while keeping memory size fixed. This wastes a substantial monetization opportunity and may even limit the ability of harvesting VMs to leverage spare cores. Thus, in this paper, we explore memory harvesting and its challenges in real cloud platforms, namely its impact on VM creation time, NUMA spanning, and page fragmentation. We start by characterizing the amount and dynamics of the spare memory in Azure. We then design and implement memory-harvesting VMs (MHVMs), introducing new techniques for memory buffering, batching, and pre-reclamation. To demonstrate the use of MHVMs, we also extend a popular cluster scheduling framework (Hadoop) and a FaaS platform to adapt to them. Our main results show that (1) there is plenty of scope for memory harvesting in real platforms; (2) MHVMs are effective at mitigating the negative impacts of harvesting; and (3) our extensions of Hadoop and FaaS successfully hide the MHVMs’ varying memory size from the users’ data-processing jobs and functions. We conclude that memory harvesting has great potential for practical deployment and users can save up to 93% of their costs when running workloads on MHVMs.

Published In

ASPLOS '22: Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
February 2022
1164 pages
ISBN: 9781450392051
DOI: 10.1145/3503222
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Cloud computing
  2. memory management
  3. resource harvesting

Qualifiers

  • Research-article

Conference

ASPLOS '22

Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 212
  • Downloads (Last 6 weeks): 39
Reflects downloads up to 05 Mar 2025

Cited By

  • (2025) Coach: Exploiting Temporal Patterns for All-Resource Oversubscription in Cloud Platforms. Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1. DOI: 10.1145/3669940.3707226 (164-181). Online publication date: 30-Mar-2025
  • (2024) CyberStar. Proceedings of the 2024 USENIX Conference on Usenix Annual Technical Conference. DOI: 10.5555/3691992.3692006 (227-246). Online publication date: 10-Jul-2024
  • (2024) A tale of two paths. Proceedings of the 18th USENIX Conference on Operating Systems Design and Implementation. DOI: 10.5555/3691938.3691943 (77-95). Online publication date: 10-Jul-2024
  • (2024) Harvesting idle memory for application-managed soft state with midas. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation. DOI: 10.5555/3691825.3691894 (1247-1265). Online publication date: 16-Apr-2024
  • (2024) Making kernel bypass practical for the cloud with junction. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation. DOI: 10.5555/3691825.3691829 (55-73). Online publication date: 16-Apr-2024
  • (2024) Dynamic Idle Resource Leasing To Safely Oversubscribe Capacity At Meta. Proceedings of the 2024 ACM Symposium on Cloud Computing. DOI: 10.1145/3698038.3698537 (792-810). Online publication date: 20-Nov-2024
  • (2024) Faascale: Scaling MicroVM Vertically for Serverless Computing with Memory Elasticity. Proceedings of the 2024 ACM Symposium on Cloud Computing. DOI: 10.1145/3698038.3698512 (196-212). Online publication date: 20-Nov-2024
  • (2024) TrEnv: Transparently Share Serverless Execution Environments Across Different Functions and Nodes. Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles. DOI: 10.1145/3694715.3695967 (421-437). Online publication date: 4-Nov-2024
  • (2024) A Memory-Disaggregated Radix Tree. ACM Transactions on Storage 20:3. DOI: 10.1145/3664289 (1-41). Online publication date: 6-Jun-2024
  • (2024) Characterization and Reclamation of Frozen Garbage in Managed FaaS Workloads. Proceedings of the Nineteenth European Conference on Computer Systems. DOI: 10.1145/3627703.3629579 (281-297). Online publication date: 22-Apr-2024
