DOI: 10.1145/3343737.3343745

Thinking about A New Mechanism for Huge Page Management

Published: 19 August 2019

Abstract

The huge page mechanism was proposed to reduce TLB misses and improve overall system performance. On systems with large memory capacity, using huge pages is an ideal choice for alleviating virtual-to-physical address translation overhead. However, huge pages can incur expensive memory compaction operations when memory is fragmented, and they can lead to memory bloat because many huge pages are underutilized in practice.
To address these problems, we propose SysMon-H, a sampling module in the OS kernel that obtains huge page utilization with low overhead for both cloud and desktop applications. We further propose H-Policy, a huge page management policy that, based on the information provided by SysMon-H, either splits underutilized huge pages to mitigate memory bloat or promotes base 4KB pages to huge pages to reduce TLB misses. In our prototype, SysMon-H and H-Policy work cooperatively in the OS kernel.
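
The split-or-promote decision described above can be illustrated with a minimal user-space sketch. It is not the authors' kernel implementation: the abstract does not give thresholds, data structures, or the sampling format, so the `RegionSample` record, the `decide` function, the two threshold constants, and the 2 MB huge page size (the common x86-64 default) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum

# Assumption: 2 MB huge pages over 4 KB base pages, i.e. 512 base pages each.
BASE_PAGES_PER_HUGE_PAGE = 512

# Hypothetical thresholds; the abstract does not state the actual values.
SPLIT_BELOW = 0.25
PROMOTE_ABOVE = 0.75


class Action(Enum):
    SPLIT = "split"      # break an underutilized huge page into base pages
    PROMOTE = "promote"  # merge well-used base pages into a huge page
    KEEP = "keep"        # leave the region as it is


@dataclass
class RegionSample:
    """Sampled state of one huge-page-sized region (hypothetical format)."""
    is_huge: bool
    touched_base_pages: int  # base pages observed as accessed while sampling

    @property
    def utilization(self) -> float:
        # Fraction of the region's base pages that were actually touched.
        return self.touched_base_pages / BASE_PAGES_PER_HUGE_PAGE


def decide(sample: RegionSample) -> Action:
    """Choose an action for one region from its sampled utilization."""
    if sample.is_huge and sample.utilization < SPLIT_BELOW:
        return Action.SPLIT
    if not sample.is_huge and sample.utilization > PROMOTE_ABOVE:
        return Action.PROMOTE
    return Action.KEEP
```

Under these assumed thresholds, a huge page with only 32 of its 512 base pages touched would be split, while a region of base pages with 500 of 512 touched would be promoted. Leaving a gap between the two thresholds is one simple way to avoid repeatedly splitting and re-promoting the same region.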

Published In

APSys '19: Proceedings of the 10th ACM SIGOPS Asia-Pacific Workshop on Systems
August 2019
115 pages
ISBN: 9781450368933
DOI: 10.1145/3343737

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

APSys '19
Acceptance Rates

APSys '19 Paper Acceptance Rate 15 of 36 submissions, 42%;
Overall Acceptance Rate 169 of 430 submissions, 39%
