Ingens: Huge Page Support for the OS and Hypervisor

Published: 11 September 2017

Abstract

Memory capacity and demand have grown hand in hand in recent years. However, overheads for memory virtualization, in particular for address translation, grow with memory capacity as well, motivating hardware manufacturers to provide TLBs with thousands of entries for larger pages, or huge pages. Current OSes and hypervisors support huge pages with a hodge-podge of best-effort algorithms and spot fixes that make less and less sense as architectural support for huge pages matures. The time has come for a more fundamental redesign.
Ingens is a framework for providing transparent huge page support in a coordinated way. Ingens manages contiguity as a first-class resource, and tracks utilization and access frequency of memory pages, enabling it to eliminate pathologies that plague current systems. Experiments with a Linux/KVM-based prototype show improved fairness and performance, and reduced tail latency and memory bloat for important applications such as Web services and Redis. We report early experiences with our in-progress port of Ingens to the ESX Hypervisor.

Published In

ACM SIGOPS Operating Systems Review  Volume 51, Issue 1
Special Topics
August 2017
123 pages
ISSN:0163-5980
DOI:10.1145/3139645

Publisher

Association for Computing Machinery

New York, NY, United States
