research-article

Ingens: Huge Page Support for the OS and Hypervisor

Authors:

Christopher J. Rossbach,

Emmett Witchel

ACM SIGOPS Operating Systems Review, Volume 51, Issue 1

Pages 83 - 93

https://doi.org/10.1145/3139645.3139659

Published: 11 September 2017

Abstract

Memory capacity and demand have grown hand in hand in recent years. However, overheads for memory virtualization, in particular for address translation, grow with memory capacity as well, motivating hardware manufacturers to provide TLBs with thousands of entries for larger pages, or huge pages. Current OSes and hypervisors support huge pages with a hodge-podge of best-effort algorithms and spot fixes that make less and less sense as architectural support for huge pages matures. The time has come for a more fundamental redesign.
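The translation-overhead argument above can be made concrete with a back-of-the-envelope TLB reach calculation (reach = number of entries × page size). The sketch below is illustrative only: the entry count of 1536 and the 4 KiB and 2 MiB page sizes are assumptions chosen for the example, not figures taken from this abstract.

```python
# Rough TLB reach arithmetic: how much memory a fully populated TLB can map.
# ENTRIES is an assumed, illustrative value; real TLB sizes vary by CPU.

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def tlb_reach_bytes(entries: int, page_size_bytes: int) -> int:
    """Return the number of bytes covered when every TLB entry maps one page."""
    return entries * page_size_bytes


ENTRIES = 1536  # assumption for illustration

base_reach = tlb_reach_bytes(ENTRIES, 4 * KIB)   # base (4 KiB) pages
huge_reach = tlb_reach_bytes(ENTRIES, 2 * MIB)   # huge (2 MiB) pages

print(f"4 KiB pages: {base_reach // MIB} MiB of reach")
print(f"2 MiB pages: {huge_reach // GIB} GiB of reach")
```

Under these assumptions the same number of entries covers 512 times more memory with huge pages, which is why translation overhead for large working sets depends so heavily on whether huge pages are in use.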

Ingens is a framework for providing transparent huge page support in a coordinated way. Ingens manages contiguity as a first-class resource, and tracks utilization and access frequency of memory pages, enabling it to eliminate pathologies that plague current systems. Experiments with a Linux/KVM-based prototype show improved fairness and performance, and reduced tail latency and memory bloat for important applications such as Web services and Redis. We report early experiences with our in-progress port of Ingens to the ESX Hypervisor.
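As a minimal sketch of what tracking utilization and access frequency could look like, the following code keeps per-region bookkeeping for one huge-page-sized, huge-page-aligned region. It is not the Ingens implementation: the class `HugeRegion`, the 0.9 promotion threshold, the smoothing factor `alpha`, and the use of an exponential moving average are all hypothetical choices made for illustration, and the 512 base pages per region assume 4 KiB base pages and 2 MiB huge pages.

```python
from dataclasses import dataclass, field

# Assumption: 2 MiB huge pages built from 4 KiB base pages.
BASE_PAGES_PER_HUGE_PAGE = 512


@dataclass
class HugeRegion:
    """Hypothetical bookkeeping for one huge-page-sized virtual region."""

    touched: set = field(default_factory=set)  # indices of base pages in use
    access_ema: float = 0.0                    # smoothed access frequency

    def record_fault(self, index: int) -> None:
        """Mark one base page inside the region as populated."""
        if not 0 <= index < BASE_PAGES_PER_HUGE_PAGE:
            raise IndexError("base page index outside the huge region")
        self.touched.add(index)

    def utilization(self) -> float:
        """Fraction of the region's base pages that are populated."""
        return len(self.touched) / BASE_PAGES_PER_HUGE_PAGE

    def sample_access(self, accessed: bool, alpha: float = 0.5) -> None:
        """Fold one periodic access-bit sample into a moving average."""
        sample = 1.0 if accessed else 0.0
        self.access_ema = alpha * sample + (1.0 - alpha) * self.access_ema

    def should_promote(self, threshold: float = 0.9) -> bool:
        """Promote only once enough of the region is actually used."""
        return self.utilization() >= threshold


region = HugeRegion()
for i in range(460):
    region.record_fault(i)
print(region.should_promote())  # 460/512 is below the assumed 0.9 threshold
region.record_fault(460)
print(region.should_promote())  # 461/512 crosses it
```

Gating promotion on utilization rather than on the first fault is one way to limit memory bloat from sparsely used regions, while the smoothed access frequency gives a policy a signal for deciding which regions deserve scarce contiguous memory first.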

Published In

ACM SIGOPS Operating Systems Review Volume 51, Issue 1

Special Topics

August 2017

123 pages

ISSN:0163-5980

DOI:10.1145/3139645

Editors:
Mark Silberstein (Technion, Haifa, Israel),
Christopher J. Rossbach (Stop D9500, Austin, TX)

Copyright © 2017 Authors.

Publisher

Association for Computing Machinery

New York, NY, United States
