skip to main content
10.1145/3079856.3080209acmconferencesArticle/Chapter ViewAbstractPublication PagesiscaConference Proceedingsconference-collections
research-article

Do-It-Yourself Virtual Memory Translation

Published: 24 June 2017 Publication History

Abstract

In this paper, we introduce the Do-It-Yourself virtual memory translation (DVMT) architecture as a flexible complement for current hardware-fixed translation flows. DVMT decouples the virtual-to-physical mapping process from the access permissions, giving applications freedom in choosing mapping schemes, while maintaining security within the operating system. Furthermore, DVMT is designed to support virtualized environments, as a means to collapse the costly, hardware-assisted two-dimensional translations. We describe the architecture in detail and demonstrate its effectiveness by evaluating several different DVMT schemes on a range of virtualized applications with a model based on measurements from a commercial system. We show that different DVMT configurations preserve the native performance, while achieving speedups of 1.2x to 2.0x in virtualized environments.

References

[1]
2016. Intel© 64 and IA-32 Architectures Software Developer's Manual.
[2]
2016. perf: Linux profiling with performance counters. https://perf.wiki.kernel.org/index.php/Main_Page. (2016).
[3]
Keith Adams and Ole Agesen. 2006. A comparison of software and hardware techniques for x86 virtualization. In Intl. Conf. on Arch. Support for Programming Languages & Operating Systems (ASPLOS).
[4]
Advanced Micro Devices 2015. AMD64 Architecture Programmer's Manual (Volume 2). Advanced Micro Devices.
[5]
Jeongseob Ahn, Seongwook Jin, and Jaehyuk Huh. 2012. Revisiting hardware-assisted Page Walks for virtualized systems. In Intl. Symp. on Computer Architecture (ISCA).
[6]
ARM 2016. ARMv8 Architecture Reference Manual. ARM.
[7]
Thomas W. Barr, Alan L. Cox, and Scott Rixner. 2010. Translation caching: skip, don't walk (the page table). In Intl. Symp. on Computer Architecture (ISCA).
[8]
Thomas W. Barr, Alan L. Cox, and Scott Rixner. 2011. SpecTLB: a mechanism for speculative address translation. In Intl. Symp. on Computer Architecture (ISCA).
[9]
Paul S. Barth, Rishiyur S. Nikhil, and Arvind. 1991. M-Structures: extending a parallel, non-strict, functional language with state. In ACM Conf. on Functional Programming Languages and Computer Architecture.
[10]
Arkaprava Basu, Jayneel Gandhi, Jichuan Chang, Mark D. Hill, and Michael M. Swift. 2013. Efficient virtual memory for big memory servers. In Intl. Symp. on Computer Architecture (ISCA).
[11]
Ravi Bhargava, Benjamin Serebrin, Francesco Spadini, and Srilatha Manne. 2008. Accelerating two-dimensional page walks for virtualized systems. In Intl. Conf. on Arch. Support for Programming Languages & Operating Systems (ASPLOS).
[12]
Abhishek Bhattacharjee. 2013. Large-reach memory management unit caches. In Intl. Symp. on Microarchitecture (MICRO).
[13]
Abhishek Bhattacharjee, Daniel Lustig, and Margaret Martonosi. 2011. Shared last-level TLBs for chip multiprocessors. In Symp. on High-Performance Computer Architecture (HPCA).
[14]
Abhishek Bhattacharjee and Margaret Martonosi. 2009. Characterizing the TLB behavior of emerging parallel workloads on chip multiprocessors. In Intl. Conf. on Parallel Arch. and Compilation Techniques (PACT).
[15]
Jeffrey Buell, Daniel Hecht, Jin Heo, Kalyan Saladi, and RH Taheri. 2013. Methodology for performance analysis of VMware vSphere under Tier-1 applications. VMware Technical Journal 2, 1 (2013).
[16]
Xiaotao Chang, Hubertus Franke, Yi Ge, Tao Liu, Kun Wang, Jimi Xenidis, Fei Chen, and Yu Zhang. 2013. Improving virtualization in the presence of software managed translation lookaside buffers. In Intl. Symp. on Computer Architecture (ISCA).
[17]
Robert S. Chappell, Jared Stark, Sangwook P. Kim, Steven K. Reinhardt, and Yale N. Patt. 1999. Simultaneous subordinate microthreading (SSMT). In Intl. Symp. on Computer Architecture (ISCA).
[18]
Dawson Engler, Frans Kaashoek, and James O'Toole, Jr. 1995. Exokernel: an operating system architecture for application-level resource management. In ACM Symp. on Operating Systems Principles (SOSP).
[19]
Dawson R. Engler, Sandeep K. Gupta, and Frans M. Kaashoek. 1995. AVM: Application-level virtual memory. In Hot Topics in Operating Systems (HotOS).
[20]
Zhen Fang, Lixin Zhang, John B Carter, Wilson C Hsieh, and Sally A McKee. 2001. Reevaluating online superpage promotion with hardware support. In Symp. on High-Performance Computer Architecture (HPCA).
[21]
Narayanan Ganapathy and Curt Schimmel. 1998. General purpose operating system support for multiple page sizes. In USENIX Ann. Tech. Symp. (ATC). http://dl.acm.org/citation.cfm?id=1268256.1268264
[22]
Jayneel Gandhi, Arkaprava Basu, Mark D. Hill, and Michael M. Swift. 2014. BadgerTrap: A Tool to Instrument x86-64 TLB Misses. Computer Architecture News 42, 2 (Sept. 2014), 20--23.
[23]
Jayneel Gandhi, Arkaprava Basu, Mark D. Hill, and Michael M. Swift. 2014. Efficient memory virtualization: reducing dimensionality of nested page walks. In Intl. Symp. on Microarchitecture (MICRO).
[24]
Jayneel Gandhi, Mark D. Hill, and Michael M. Swift. 2016. Agile paging: exceeding the best of nested and shadow paging. In Intl. Symp. on Computer Architecture (ISCA).
[25]
Fei Guo, Seongbeom Kim, Yury Baskakov, and Ishan Banerjee. 2015. Proactively breaking large pages to improve memory overcommitment performance in VMware ESXi. In Intl. Conf. on Virtual Execution Environments (VEE).
[26]
Bruce Jacob and Trevor Mudge. 1997. Software-managed address translation. In Symp. on High-Performance Computer Architecture (HPCA).
[27]
Gokul B. Kandiraju and Anand Sivasubramaniam. 2002. Going the distance for TLB prefetching: an application-driven study. In Intl. Symp. on Computer Architecture (ISCA).
[28]
Vasileios Karakostas, Jayneel Gandhi, Adrián Cristal, Mark D. Hill, Kathryn S. McKinley, Mario Nemirovsky, Michael M. Swift, and Osman S. Unsal. 2016. Energy-efficient address translation. In Symp. on High-Performance Computer Architecture (HPCA).
[29]
Henry M Levy. 1984. Capability-based computer systems. Digital Press.
[30]
Daniel Lustig, Abhishek Bhattacharjee, and Margaret Martonosi. 2013. TLB improvements for chip multiprocessors: Inter-core cooperative prefetchers and shared last-level TLBs. ACM Trans. on Arch. & Code Optim. 10, 1 (2013).
[31]
MIPS Technologies 2011. MIPS Architecture For Programmers Volume I-A: Introduction to the MIPS32 Architecture. MIPS Technologies. Revision 3.02.
[32]
David Nagle, Richard Uhlig, Tim Stanley, Stuart Sechrest, Trevor Mudge, and Richard Brown. 1993. Design tradeoffs for software-managed TLBs. In Intl. Symp. on Computer Architecture (ISCA).
[33]
Binh Pham, Arup Bhattacharjee, Yasuko Eckert, and Gabriel H. Loh. 2014. Increasing TLB reach by exploiting clustering in page translations. In Symp. on High-Performance Computer Architecture (HPCA).
[34]
Binh Pham, Ján Veselý, Gabriel H. Loh, and Abhishek Bhattacharjee. 2015. Large pages and lightweight memory management in virtualized environments: can you have it both ways?. In Intl. Symp. on Microarchitecture (MICRO).
[35]
Sagi Shahar, Shai Bergman, and Mark Silberstein. 2016. ActivePointers: A case for software address translation on GPUs. In Intl. Symp. on Computer Architecture (ISCA).
[36]
Madhusudhan Talluri and Mark D. Hill. 1994. Surpassing the TLB performance of superpages with less operating system support. In Intl. Conf. on Arch. Support for Programming Languages & Operating Systems (ASPLOS).
[37]
Madhusudhan Talluri, Shing Kong, Mark D. Hill, and David A. Patterson. 1992. Tradeoffs in supporting two page sizes. In Intl. Symp. on Computer Architecture (ISCA).
[38]
Xiaolin Wang, Jiarui Zang, Zhenlin Wang, Yingwei Luo, and Xiaoming Li. 2011. Selective hardware/software memory virtualization. In Intl. Conf. on Virtual Execution Environments (VEE).
[39]
David L. Weaver and Tom Germond (Eds.). 1994. The SPARC Architecture Manual (Version 9). Prentice Hall. SPARC International, Inc.
[40]
Timothy Wood, Gabriel Tarasuk-levin, Prashant Shenoy, Peter Desnoyers, Emmanuel Cecchet, and Mark D. Corner. 2009. Memory Buddies: exploiting page sharing for smart colocation. In Intl. Conf. on Virtual Execution Environments (VEE).
[41]
Idan Yaniv and Dan Tsafrir. 2016. Hash, don't cache (the page table). In Intl. Conf. on Measurement & Modeling of Computer Systems (SIGMETRICS).
[42]
Lixin Zhang, Evan Speight, Ram Rajamony, and Jiang Lin. 2010. Enigma: architectural and operating system support for reducing the impact of address translation. In ACM Intl. Conf. on Supercomputing.

Cited By

View all
  • (2024)Direct Memory Translation for Virtualized CloudsProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 210.1145/3620665.3640358(287-304)Online publication date: 27-Apr-2024
  • (2024)STAR: Sub-Entry Sharing-Aware TLB for Multi-Instance GPU2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO)10.1109/MICRO61859.2024.00031(309-323)Online publication date: 2-Nov-2024
  • (2023)FlexPointer: Fast Address Translation Based on Range TLB and Tagged PointersACM Transactions on Architecture and Code Optimization10.1145/357985420:2(1-24)Online publication date: 1-Mar-2023
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
ISCA '17: Proceedings of the 44th Annual International Symposium on Computer Architecture
June 2017
736 pages
ISBN:9781450348928
DOI:10.1145/3079856
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 June 2017

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. TLB
  2. address translation
  3. virtual machines
  4. virtual memory

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ISCA '17
Sponsor:

Acceptance Rates

ISCA '17 Paper Acceptance Rate 54 of 322 submissions, 17%;
Overall Acceptance Rate 543 of 3,203 submissions, 17%

Upcoming Conference

ISCA '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)73
  • Downloads (Last 6 weeks)8
Reflects downloads up to 03 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Direct Memory Translation for Virtualized CloudsProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 210.1145/3620665.3640358(287-304)Online publication date: 27-Apr-2024
  • (2024)STAR: Sub-Entry Sharing-Aware TLB for Multi-Instance GPU2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO)10.1109/MICRO61859.2024.00031(309-323)Online publication date: 2-Nov-2024
  • (2023)FlexPointer: Fast Address Translation Based on Range TLB and Tagged PointersACM Transactions on Architecture and Code Optimization10.1145/357985420:2(1-24)Online publication date: 1-Mar-2023
  • (2023)Contiguitas: The Pursuit of Physical Memory Contiguity in DatacentersProceedings of the 50th Annual International Symposium on Computer Architecture10.1145/3579371.3589079(1-15)Online publication date: 17-Jun-2023
  • (2023)SnakeByte: A TLB Design with Adaptive and Recursive Page Merging in GPUs2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA)10.1109/HPCA56546.2023.10071063(1195-1207)Online publication date: Feb-2023
  • (2023)Memory-Efficient Hashed Page Tables2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA)10.1109/HPCA56546.2023.10071061(1221-1235)Online publication date: Feb-2023
  • (2022)Reducing Minor Page Fault Overheads through Enhanced Page WalkerACM Transactions on Architecture and Code Optimization10.1145/354714219:4(1-26)Online publication date: 16-Sep-2022
  • (2022)CARAT CAKE: replacing paging via compiler/kernel cooperationProceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems10.1145/3503222.3507771(98-114)Online publication date: 28-Feb-2022
  • (2022)Parallel virtualized memory translation with nested elastic cuckoo page tablesProceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems10.1145/3503222.3507720(84-97)Online publication date: 28-Feb-2022
  • (2021)Trident: Harnessing Architectural Resources for All Page Sizes in x86 ProcessorsMICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture10.1145/3466752.3480062(1106-1120)Online publication date: 18-Oct-2021
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media