Do-It-Yourself Virtual Memory Translation

Authors:

Yoav Etsion

ISCA '17: Proceedings of the 44th Annual International Symposium on Computer Architecture

Pages 457 - 468

https://doi.org/10.1145/3079856.3080209

Published: 24 June 2017

Abstract

In this paper, we introduce the Do-It-Yourself virtual memory translation (DVMT) architecture as a flexible complement to current hardware-fixed translation flows. DVMT decouples the virtual-to-physical mapping process from the access permissions, giving applications freedom in choosing mapping schemes while maintaining security within the operating system. Furthermore, DVMT is designed to support virtualized environments, where it serves as a means to collapse the costly, hardware-assisted two-dimensional translations. We describe the architecture in detail and demonstrate its effectiveness by evaluating several DVMT schemes on a range of virtualized applications, using a model based on measurements from a commercial system. We show that different DVMT configurations preserve native performance while achieving speedups of 1.2x to 2.0x in virtualized environments.
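
To see why the two-dimensional translations mentioned in the abstract are costly, the following minimal sketch counts the memory references needed to resolve a single TLB miss. It is not taken from the paper: it assumes conventional radix page tables with no MMU caches or nested TLBs, and it uses 4-level tables for both guest and host, as in current x86-64. The function names are illustrative.

```python
def native_walk_refs(levels: int) -> int:
    """Memory references for one TLB miss on a native radix page table.

    One page-table entry is read per level.
    """
    return levels


def nested_walk_refs(guest_levels: int, host_levels: int) -> int:
    """Memory references for one TLB miss under nested (2D) paging.

    Assumption: no MMU caches or nested TLBs shorten the walk.
    Each guest page-table entry is located by a guest-physical address,
    which first needs a full host walk (host_levels references) before
    the entry itself can be read (1 reference). The final guest-physical
    address of the data then needs one more host walk.
    """
    return guest_levels * (host_levels + 1) + host_levels


if __name__ == "__main__":
    # 4-level guest and host tables (assumed, as in x86-64).
    print("native:", native_walk_refs(4))
    print("nested:", nested_walk_refs(4, 4))
```

Under these assumptions, a native walk costs 4 references and a nested walk costs 24, which is the gap that a scheme collapsing the two dimensions aims to close.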

Index Terms

Do-It-Yourself Virtual Memory Translation
1. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Memory management
        Virtual memory
      2. Software infrastructure
        Virtual machines

Published In

ISCA '17: Proceedings of the 44th Annual International Symposium on Computer Architecture

June 2017

736 pages

ISBN:9781450348928

DOI:10.1145/3079856

ACM SIGARCH Computer Architecture News Volume 45, Issue 2
ISCA'17
May 2017
715 pages
ISSN:0163-5964
DOI:10.1145/3140659
Editor:
Babak Falsafi
Interim

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

IEEE: IEEE Computer Society Technical Committee on Design Automation
SIGARCH: ACM Special Interest Group on Computer Architecture

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ISCA '17

Sponsor:

IEEE
SIGARCH

ISCA '17: The 44th Annual International Symposium on Computer Architecture

June 24 - 28, 2017

Toronto, ON, Canada

Acceptance Rates

ISCA '17 Paper Acceptance Rate 54 of 322 submissions, 17%;

Overall Acceptance Rate 543 of 3,203 submissions, 17%
