Parallel virtualized memory translation with nested elastic cuckoo page tables

Authors:
Jovan Stojkovic

University of Illinois at Urbana-Champaign, USA

University of Illinois at Urbana-Champaign, USA
View Profile

,
Dimitrios Skarlatos

Carnegie Mellon University, USA

Carnegie Mellon University, USA
View Profile

,
Apostolos Kokolis

University of Illinois at Urbana-Champaign, USA

University of Illinois at Urbana-Champaign, USA
View Profile

,
Tianyin Xu

University of Illinois at Urbana-Champaign, USA

University of Illinois at Urbana-Champaign, USA

0000-0003-4443-8170
View Profile

,
Josep Torrellas

University of Illinois at Urbana-Champaign, USA

University of Illinois at Urbana-Champaign, USA
View Profile

ASPLOS '22: Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating SystemsFebruary 2022Pages 84–97https://doi.org/10.1145/3503222.3507720

Published:22 February 2022Publication History

ASPLOS '22: Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems

Pages 84–97

ABSTRACT

A major reason why nested or virtualized address translations are slow is because current systems organize page tables in a multi-level tree that is accessed in a sequential manner. A nested translation may potentially require up to twenty-four sequential memory accesses. To address this problem, this paper presents the first page table design that supports parallel nested address translation. The design is based on using hashed page tables (HPTs) for both guest and host. However, directly extending a native HPT design to a nested environment leads to minor gains. Instead, our design solves a new set of challenges that appear in nested environments. Our scheme eliminates all but three of the potentially twenty-four sequential steps of a nested translation—while judiciously limiting the number of parallel memory accesses issued to avoid over-consuming cache bandwidth. As a result, compared to conventional nested radix tables, our design speeds-up the execution of a set of applications by an average of 1.19x (for 4KB pages) and 1.24x (when huge pages are used). In addition, we also show a migration path from current nested radix page tables to our design.

References

Keith Adams and Ole Agesen. 2006. A Comparison of Software and Hardware Techniques for x86 Virtualization. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XII).Google ScholarDigital Library
Alexandru Agache, Marc Brooker, Alexandra Iordache, Anthony Liguori, Rolf Neugebauer, Phil Piwonka, and Diana-Maria Popa. 2020. Firecracker: Lightweight Virtualization for Serverless Applications. In Proceedings of the 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI’20).Google Scholar
Jeongseob Ahn, Seongwook Jin, and Jaehyuk Huh. 2012. Revisiting Hardware-assisted Page Walks for Virtualized Systems. In Proceedings of the 39th Annual International Symposium on Computer Architecture (ISCA’12).Google ScholarCross Ref
Hanna Alam, Tianhao Zhang, Mattan Erez, and Yoav Etsion. 2017. Do-It-Yourself Virtual Memory Translation. In Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA’17).Google ScholarDigital Library
Kursad Albayraktaroglu, Aamer Jaleel, Xue Wu, Manoj Franklin, Bruce Jacob, Chau-Wen Tseng, and Donald Yeung. 2005. BioBench: A Benchmark Suite of Bioinformatics Applications. In IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS’05).Google Scholar
Chloe Alverti, Stratos Psomadakis, Vasileios Karakostas, Jayneel Gandhi, Konstantinos Nikas, Georgios Goumas, and Nectarios Koziris. 2020. Enhancing and Exploiting Contiguity for Fast Memory Virtualization. In Proceedings of the 47th International Symposium on Computer Architecture (ISCA’20).Google ScholarDigital Library
Amazon Web Services. 2021. Elastic Compute Cloud (EC2). https://aws.amazon.com/ec2Google Scholar
AMD. 2005. AMD64 Virtualization Codenamed “Pacifica” Technology: Secure Virtual Machine Architecture Reference Manual.Google Scholar
AMD. 2008. AMD-V^TM Nested Paging. http://developer.amd.com/wordpress/ media/2012/10/NPT-WP-11-final-TM.pdfGoogle Scholar
A. Awad, S. D. Hammond, G. R. Voskuilen, and R. J. Hoekstra. 2017. Samba: A Detailed Memory Management Unit (MMU) for the SST Simulation Framework. Sandia National Laboratories.Google Scholar
Kavita Bala, M. Frans Kaashoek, and William E. Weihl. 1994. Software Prefetching and Caching for Translation Lookaside Buffers. In Proceedings of the 1st USENIX Conference on Operating Systems Design and Implementation (OSDI’94).Google Scholar
Rajeev Balasubramonian, Andrew B. Kahng, Naveen Muralimanohar, Ali Shafiee, and Vaishnav Srinivas. 2017. CACTI 7: New Tools for Interconnect Exploration in Innovative Off-Chip Memories. ACM Transactions on Architecture and Code Optimization (TACO), 14, 2 (2017), June.Google Scholar
Thomas W. Barr, Alan L. Cox, and Scott Rixner. 2010. Translation Caching: Skip, Don’t Walk (the Page Table). In Proceedings of the 2010 International Conference on Computer Architecture (ISCA’10).Google ScholarDigital Library
Thomas W. Barr, Alan L. Cox, and Scott Rixner. 2011. SpecTLB: A Mechanism for Speculative Address Translation. In Proceedings of the 38th Annual International Symposium on Computer Architecture (ISCA’11).Google ScholarDigital Library
Arkaprava Basu, Jayneel Gandhi, Jichuan Chang, Mark D. Hill, and Michael M. Swift. 2013. Efficient Virtual Memory for Big Memory Servers. In Proceedings of the 40th Annual International Symposium on Computer Architecture (ISCA’13).Google Scholar
Ravi Bhargava, Benjamin Serebrin, Francesco Spadini, and Srilatha Manne. 2008. Accelerating Two-dimensional Page Walks for Virtualized Systems. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XIII).Google ScholarDigital Library
Abhishek Bhattacharjee. 2013. Large-reach Memory Management Unit Caches. In Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-46).Google ScholarDigital Library
Abhishek Bhattacharjee. 2017. Translation-Triggered Prefetching. In Proceedings of the 22nd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’17).Google Scholar
Abhishek Bhattacharjee, Daniel Lustig, and Margaret Martonosi. 2011. Shared Last-level TLBs for Chip Multiprocessors. In Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture (HPCA’11).Google ScholarDigital Library
Abhishek Bhattacharjee and Margaret Martonosi. 2010. Inter-Core Cooperative TLB Prefetchers for Chip Multiprocessors. In Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XV).Google Scholar
Jeffrey Buell, Daniel Hecht, Jin Heo, Kalyan Saladi, and H. Reza Taheri. 2013. Methodology for Performance Analysis of VMware vSphere under Tier-1 Applications. VMWare Technical Journal.Google Scholar
Nadav Chachmon, Daniel Richins, Robert Cohn, Magnus Christensson, Wenzhi Cui, and Vijay Janapa Reddi. 2016. Simulation and Analysis Engine for Scale-Out Workloads. In Proceedings of the 2016 International Conference on Supercomputing (ICS’16).Google ScholarDigital Library
Xiaotao Chang, Hubertus Franke, Yi Ge, Tao Liu, Kun Wang, Jimi Xenidis, Fei Chen, and Yu Zhang. 2013. Improving Virtualization in the Presence of Software Managed Translation Lookaside Buffers. In Proceedings of the 2013 International Conference on Computer Architecture (ISCA’13).Google ScholarDigital Library
Guilherme Cox and Abhishek Bhattacharjee. 2017. Efficient Address Translation for Architectures with Multiple Page Sizes. In Proceedings of the 22nd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’17).Google ScholarDigital Library
Christoffer Dall and Jason Nieh. 2014. KVM/ARM: The Design and Implementation of the Linux ARM Hypervisor. In Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’14).Google ScholarDigital Library
Cort Dougan, Paul Mackerras, and Victor Yodaiken. 1999. Optimizing the Idle Task and Other MMU Tricks. In Proceedings of the Third Symposium on Operating Systems Design and Implementation (OSDI’99).Google ScholarDigital Library
Stephane Eranian and David Mosberger. 2000. The Linux/ia64 Project: Kernel Design and Status Update. HP Labs.Google Scholar
Dimitris Fotakis, Rasmus Pagh, Peter Sanders, and Paul Spirakis. 2005. Space Efficient Hash Tables with Worst Case Constant Access Time. Theory of Computing Systems, 38, 2 (2005), Feb., 229–248.Google ScholarCross Ref
Jayneel Gandhi, Arkaprava Basu, Mark D. Hill, and Michael M. Swift. 2014. Efficient Memory Virtualization: Reducing Dimensionality of Nested Page Walks. In Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-47).Google Scholar
Jayneel Gandhi, Mark D. Hill, and Michael M. Swift. 2016. Agile Paging: Exceeding the Best of Nested and Shadow Paging. In Proceedings of the 43rd International Symposium on Computer Architecture (ISCA’16).Google Scholar
Fabien Gaud, Baptiste Lepers, Jeremie Decouchant, Justin Funston, Alexandra Fedorova, and Vivien Quéma. 2014. Large Pages May Be Harmful on NUMA Systems. In Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference (USENIX ATC’14).Google ScholarDigital Library
Google. 2021. Cloud Compute Engine. https://cloud.google.com/computeGoogle Scholar
Google. 2021. gVisor: Container Runtime Sandbox. https://gvisor.dev/docs/Google Scholar
Mel Gorman and Patrick Healy. 2010. Performance Characteristics of Explicit Superpage Support. In Proceedings of the 2010 International Conference on Computer Architecture (ISCA’10).Google ScholarDigital Library
Charles Gray, Matthew Chapman, Peter Chubb, David Mosberger-Tang, and Gernot Heiser. 2005. Itanium — A System Implementor’s Tale. In Proceedings of the 2005 USENIX Annual Technical Conference (USENIX ATC’05).Google Scholar
Fei Guo, Seongbeom Kim, Yury Baskakov, and Ishan Banerjee. 2015. Proactively Breaking Large Pages to Improve Memory Overcommitment Performance in VMware ESXi. In Proceedings of the 11th ACM International Conference on Virtual Execution Environments (VEE’15).Google ScholarDigital Library
F. Guvenilir and Y. N. Patt. 2020. Tailored Page Sizes. In Proceedings of the 2020 47th Annual International Symposium on Computer Architecture (ISCA’20).Google Scholar
Nastaran Hajinazar, Pratyush Patel, Minesh Patel, Konstantinos Kanellopoulos, Saugata Ghose, Rachata Ausavarungnirun, Geraldo F. Oliveira, Jonathan Appavoo, Vivek Seshadri, and Onur Mutlu. 2020. The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework. In Proceedings of the 47th Annual International Symposium on Computer Architecture (ISCA’20).Google ScholarDigital Library
Swapnil Haria, Mark D. Hill, and Michael M. Swift. 2018. Devirtualizing Memory in Heterogeneous Systems. In Proceedings of the 23rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’18).Google Scholar
Jerry Huck and Jim Hays. 1993. Architectural Support for Translation Table Management in Large Address Space Machines. In Proceedings of the 20th Annual International Symposium on Computer Architecture (ISCA’93).Google ScholarDigital Library
IBM. 2005. PowerPC^® Microprocessor Family: The Programming Environments Manual for 32 and 64-bit Microprocessors. https://wiki.alcf.anl.gov/images/f/fb/PowerPC_-_Assembly_-_IBM_Programming_Environment_2.3.pdfGoogle Scholar
Intel^®. 2005. Intel^® Virtualization Technology Specification for the IA-32 Intel^® Architecture.Google Scholar
Intel^®. 2010. Itanium^® Architecture Software Developer’s Manual (Volume 2). https://www.intel.com/content/www/us/en/products/docs/processors/itanium/itanium-architecture-vol-1-2-3-4-reference-set-manual.htmlGoogle Scholar
Intel. 2015. 5-Level Paging and 5-Level EPT (White Paper). https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdfGoogle Scholar
Intel. 2018. Sunny Cove Microarchitecture. https://en.wikichip.org/wiki/intel/microarchitectures/sunny_coveGoogle Scholar
Intel^®. 2019. 64 and IA-32 Architectures Software Developer’s Manual.Google Scholar
Intel. 2021. Intel® Optane™ Persistent Memory. https://www.intel.com/content/www/us/en/architecture-and-technology/optane-dc-persistent-memory.htmlGoogle Scholar
Joseph Izraelevitz, Jian Yang, Lu Zhang, Juno Kim, Xiao Liu, Amirsaman Memaripour, Yun Joon Soh, Zixuan Wang, Yi Xu, Subramanya R. Dulloor, Jishen Zhao, and Steven Swanson. 2019. Basic Performance Measurements of the Intel Optane DC Persistent Memory Module. arXiv:1903.05714.Google Scholar
Bruce L. Jacob and Trevor N. Mudge. 1998. A Look at Several Memory Management Units, TLB-refill Mechanisms, and Page Table Organizations. In Proceedings of the 8th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VIII).Google Scholar
Joefon Jann, Paul Mackerras, John Ludden, Michael Gschwind, Wade Ouren, Stuart Jacobs, Brian F. Veale, and David Edelsohn. 2018. IBM POWER9 system software. IBM Journal of Research and Development, 62, 4/5 (2018), June.Google ScholarDigital Library
Gokul B. Kandiraju and Anand Sivasubramaniam. 2002. Going the Distance for TLB Prefetching: An Application-driven Study. In Proceedings of the 29th International Symposium on Computer Architecture (ISCA’02).Google Scholar
Vasileios Karakostas, Jayneel Gandhi, Furkan Ayar, Adrián Cristal, Mark D. Hill, Kathryn S. McKinley, Mario Nemirovsky, Michael M. Swift, and Osman Ünsal. 2015. Redundant Memory Mappings for Fast Access to Large Memories. In Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA’15).Google ScholarDigital Library
V. Karakostas, J. Gandhi, A. Cristal, M. D. Hill, K. S. McKinley, M. Nemirovsky, M. M. Swift, and O. S. Unsal. 2016. Energy-Efficient Address Translation. In Proceedings of 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA’16).Google Scholar
Samuel T. King, George W. Dunlap, and Peter M. Chen. 2003. Operating System Support for Virtual Machines. In Proceedings of the 2003 USENIX Annual Technical Conference (USENIX ATC’03).Google Scholar
KVM. 2021. Page Table Allocation in KVM. https://git.kernel.org/pub/scm/virt/kvm/kvm.git/tree/arch/x86/mm/init_64.c##n224Google Scholar
Youngjin Kwon, Hangchen Yu, Simon Peter, Christopher J. Rossbach, and Emmett Witchel. 2016. Coordinated and Efficient Huge Page Management with Ingens. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI’16).Google ScholarDigital Library
Linux Kernel. 2021. Page Table Header File. https://github.com/torvalds/linux/blob/master/include/linux/pgtable.hGoogle Scholar
Piotr R. Luszczek, David H. Bailey, Jack J. Dongarra, Jeremy Kepner, Robert F. Lucas, Rolf Rabenseifner, and Daisuke Takahashi. 2006. The HPC Challenge (HPCC) Benchmark Suite. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing (SC’06).Google ScholarDigital Library
Peter S. Magnusson, Magnus Christensson, Jesper Eskilson, Daniel Forsgren, Gustav Hållberg, Johan Högberg, Fredrik Larsson, Andreas Moestedt, and Bengt Werner. 2002. Simics: A Full System Simulation Platform. IEEE Computer.Google Scholar
Yashwant Marathe, Nagendra Gulur, Jee Ho Ryoo, Shuang Song, and Lizy K. John. 2017. CSALT: Context Switch Aware Large TLB. In Proceedings of the 50th IEEE/ACM International Symposium on Microarchitecture (MICRO-50).Google Scholar
Artemiy Margaritov, Dmitrii Ustiugov, Edouard Bugnion, and Boris Grot. 2019. Prefetched Address Translation. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-52).Google ScholarDigital Library
Microsoft Azure. 2021. Cloud Computing Services. https://azure.microsoft.comGoogle Scholar
S. Mirbagher-Ajorpaz, E. Garza, G. Pokam, and D. A. Jimenez. 2020. CHiRP: Control-Flow History Reuse Prediction. In Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-53).Google Scholar
Lifeng Nai, Yinglong Xia, Ilie G. Tanase, Hyesoon Kim, and Ching-Yung Lin. 2015. GraphBIG: Understanding Graph Computing in the Context of Industrial Solutions. In International Conference for High Performance Computing, Networking, Storage and Analysis (SC’15).Google ScholarDigital Library
Rasmus Pagh and Flemming Friche Rodler. 2004. Cuckoo Hashing. Journal of Algorithms, 51, 2 (2004), May, 122–144.Google ScholarDigital Library
Ashish Panwar, Sorav Bansal, and K. Gopinath. 2019. HawkEye: Efficient Fine-grained OS Support for Huge Pages. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’19).Google Scholar
Ashish Panwar, Aravinda Prasad, and K. Gopinath. 2018. Making Huge Pages Actually Useful. In Proceedings of the 23rd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’18).Google Scholar
Chang Hyun Park, Sanghoon Cha, Bokyeong Kim, Youngjin Kwon, David Black-Schaffer, and Jaehyuk Huh. 2020. Perforated Page: Supporting Fragmented Memory Allocation for Large Pages. In Proceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA’20).Google ScholarDigital Library
Chang Hyun Park, Taekyung Heo, Jungi Jeong, and Jaehyuk Huh. 2017. Hybrid TLB Coalescing: Improving TLB Translation Coverage under Diverse Fragmented Memory Allocations. In Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA’17).Google ScholarDigital Library
Binh Pham, Abhishek Bhattacharjee, Yasuko Eckert, and Gabriel H. Loh. 2014. Increasing TLB Reach by Exploiting Clustering in Page Translations. In Proceedings of the 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA’14).Google Scholar
Binh Pham, Viswanathan Vaidyanathan, Aamer Jaleel, and Abhishek Bhattacharjee. 2012. CoLT: Coalesced Large-Reach TLBs. In Proceedings of the 45th IEEE/ACM International Symposium on Microarchitecture (MICRO-45).Google ScholarDigital Library
Binh Pham, Ján Veselŷ, Gabriel H. Loh, and Abhishek Bhattacharjee. 2015. Large Pages and Lightweight Memory Management in Virtualized Environments: Can You Have it Both Ways? In 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-48).Google ScholarDigital Library
Moinuddin K. Qureshi, Vijayalakshmi Srinivasan, and Jude A. Rivers. 2009. Scalable High Performance Main Memory System Using Phase-Change Memory Technology. In Proceedings of the 36th Annual International Symposium on Computer Architecture (ISCA ’09).Google Scholar
Arun F. Rodrigues, Jeanine Cook, Elliott Cooper-Balis, K. Scott Hemmert, Chad Kersey, Rolf Riesen, Paul Rosenfeld, Ron Oldfield, and Marlow Weston. 2006. The Structural Simulation Toolkit. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing (SC’10).Google ScholarDigital Library
Paul Rosenfeld, Elliott Cooper-Balis, and Bruce Jacob. 2011. DRAMSim2: A Cycle Accurate Memory System Simulator. IEEE Computer Architecture Letters.Google Scholar
Jee Ho Ryoo, Nagendra Gulur, Shuang Song, and Lizy K. John. 2017. Rethinking TLB Designs in Virtualized Environments: A Very Large Part-of-Memory TLB. In Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA’17).Google Scholar
Ashley Saulsbury, Fredrik Dahlgren, and Per Stenström. 2000. Recency-Based TLB Preloading. In Proceedings of the 27th Annual International Symposium on Computer Architecture (ISCA’00).Google ScholarDigital Library
D. Skarlatos, U. Darbaz, B. Gopireddy, N. S. Kim, and J. Torrellas. 2020. BabelFish: Fusing Address Translations for Containers. In Proceedings of the 47th Annual International Symposium on Computer Architecture (ISCA’20).Google Scholar
Dimitrios Skarlatos, Apostolos Kokolis, Tianyin Xu, and Josep Torrellas. 2020. Elastic Cuckoo Page Tables: Rethinking Virtual Memory Translation for Parallelism. In Proceedings of the 25th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’20).Google ScholarDigital Library
J. E. Smith and Ravi Nair. 2005. The Architecture of Virtual Machines. IEEE Computer, 38, 5 (2005), 32–38. https://doi.org/10.1109/MC.2005.173 Google ScholarDigital Library
Shekhar Srikantaiah and Mahmut Kandemir. 2010. Synergistic TLBs for High Performance Address Translation in Chip Multiprocessors. In Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-43).Google ScholarDigital Library
SysBench. 2019. A modular, cross-platform and multi-threaded benchmark tool.. http://manpages.ubuntu.com/manpages/trusty/man1/sysbench.1.htmlGoogle Scholar
Madhusudhan Talluri and Mark D. Hill. 1994. Surpassing the TLB Performance of Superpages with Less Operating System Support. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI).Google Scholar
The Linux Kernel Archives. 2019. Transparent Hugepage Support. https://www.kernel.org/doc/Documentation/vm/transhuge.txtGoogle Scholar
Carl A. Waldspurger. 2002. Memory Resource Management in VMware ESX Server. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI’02).Google ScholarDigital Library
Z. Wang, X. Liu, J. Yang, T. Michailidis, S. Swanson, and J. Zhao. 2020. Characterizing and Modeling Non-Volatile Memory Systems. In 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). 496–508. https://doi.org/10.1109/MICRO50266.2020.00049 Google ScholarCross Ref
www.7-cpu.com. 2021. Intel Skylake Timing. https://www.7-cpu.com/cpu/Skylake.htmlGoogle Scholar
Zi Yan, Daniel Lustig, David Nellans, and Abhishek Bhattacharjee. 2019. Translation Ranger: Operating System Support for Contiguity-Aware TLBs. In Proceedings of the 46th International Symposium on Computer Architecture (ISCA’19).Google ScholarDigital Library
Idan Yaniv and Dan Tsafrir. 2016. Hash, Don’t Cache (the Page Table). In Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Science (SIGMETRICS’16).Google ScholarDigital Library

Index Terms

Parallel virtualized memory translation with nested elastic cuckoo page tables
1. Computer systems organization
  1. Architectures
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Memory management
        Virtual memory

Recommendations

Direct Memory Translation for Virtualized Clouds
ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2

Virtual memory translation has become a key performance bottleneck of memory-intensive workloads in virtualized cloud environments. On the x86 architecture, a nested translation needs to sequentially fetch up to 24 page table entries (PTEs). This paper ...
Read More
Reducing The Virtual Memory Overhead in Nested Virtualization
SYSTOR '23: Proceedings of the 16th ACM International Conference on Systems and Storage

Virtualization has become a critical aspect of modern computing, and with the advent of virtualization-based containers, fast nested virtualization has become increasingly important. Nested virtualization is implemented by emulating virtualization ...
Read More
Performance Implications of Extended Page Tables on Virtualized x86 Processors
VEE '16: Proceedings of the12th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments

Managing virtual memory is an expensive operation, and becomes even more expensive on virtualized servers. Process- ing TLB misses on a virtualized x86 server requires a two-dimensional page walk that can have 6x more page table lookups, hence 6x more ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
ASPLOS '22: Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
February 2022
1164 pages
ISBN:9781450392051
DOI:10.1145/3503222
General Chairs:
Babak Falsafi
EPFL, Switzerland
,
Michael Ferdman
Stony Brook University, USA
,
Program Chairs:
Shan Lu
University of Chicago, USA
,
Tom Wenisch
University of Michigan, USA / Google, USA
Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 22 February 2022
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Page Tables
Virtual Memory
Virtualization
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate535of2,713submissions,20%
Upcoming Conference
ASPLOS '24

Sponsor:

sigarch

sigarch

sigarch

29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems

April 27 - May 1, 2024

La Jolla , CA , USA
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 5
  Total Citations
  View Citations
- 1,133
  Total Downloads
- Downloads (Last 12 months)390
- Downloads (Last 6 weeks)54
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Parallel virtualized memory translation with nested elastic cuckoo page tables

ASPLOS '22: Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems

ABSTRACT

References

Cited By

Index Terms

Recommendations

Direct Memory Translation for Virtualized Clouds

Reducing The Virtual Memory Overhead in Nested Virtualization

Performance Implications of Extended Page Tables on Virtualized x86 Processors