Performance Implications of Extended Page Tables on Virtualized x86 Processors

Authors:

Timothy Merrifield,

H. Reza Taheri

VEE '16: Proceedings of the 12th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments

Pages 25 - 35

https://doi.org/10.1145/2892242.2892258

Published: 25 March 2016

Abstract

Managing virtual memory is an expensive operation, and becomes even more expensive on virtualized servers. Processing TLB misses on a virtualized x86 server requires a two-dimensional page walk that can have 6x more page table lookups, hence 6x more memory references, than a native page table walk. Thus, much of the recent research on the subject starts from the assumption that TLB miss processing in virtual environments is significantly more expensive than on native servers. However, we will show that with the latest software stack on modern x86 processors, most of these page-table lookups are satisfied by internal paging structure caches and the L1/L2 data caches, and the actual virtualization overhead of TLB miss processing is a modest fraction of the overall time spent processing TLB misses.
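The 6x figure follows from counting worst-case memory references in a nested walk. The sketch below is illustrative only and is not taken from the paper: it assumes four-level page tables in both guest and host, and no hits in the TLB, the paging-structure caches, or the data caches. The function names are hypothetical.

```python
def native_walk_refs(levels: int = 4) -> int:
    """Worst-case memory references for a native radix page walk.

    One page-table entry is read per level.
    """
    return levels


def nested_walk_refs(guest_levels: int = 4, host_levels: int = 4) -> int:
    """Worst-case memory references for a two-dimensional (nested) page walk.

    Assumption: no translation or cache hits at all. Each guest page-table
    entry lives at a guest-physical address, so reading it costs a full
    host walk (host_levels references) plus the read of the guest entry
    itself. The guest-physical address of the data page then needs one
    final host walk.
    """
    per_guest_level = host_levels + 1   # host walk + guest entry read
    final_translation = host_levels     # translate the data page address
    return guest_levels * per_guest_level + final_translation


if __name__ == "__main__":
    native = native_walk_refs()
    nested = nested_walk_refs()
    print(f"native walk: {native} references")
    print(f"nested walk: {nested} references")
    print(f"ratio:       {nested / native:.0f}x")
```

Under these assumptions the nested walk needs 24 references against 4 for the native walk. This is an upper bound; the abstract's point is that caching keeps the measured cost well below it.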

In this paper, we present a detailed accounting of the TLB miss processing costs on virtualized x86 servers for an exhaustive set of workloads, including two very demanding industry-standard workloads. We show that an implementation of the TPC-C workload that actively uses 475 GB of memory on a 72-CPU Haswell-EP server spends 20% of its time processing TLB misses when the application runs in a VM. Although this is a non-trivial amount, it is only 4.2% higher than the TLB miss processing costs on bare metal. The multi-VM VMmark benchmark sees 12.3% in TLB miss processing, but only 4.3% of that can be attributed to virtualization overheads. We show that even for the heaviest workloads, a well-tuned application that uses large pages on a recent OS release with a modern hypervisor running on the latest x86 processors sees only minimal degradation from the additional overhead of the two-dimensional page walks in a virtualized server.

Index Terms

Performance Implications of Extended Page Tables on Virtualized x86 Processors
1. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Memory management
        Virtual memory
      2. Software infrastructure
        Virtual machines

Published In

VEE '16: Proceedings of the 12th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments

March 2016

186 pages

ISBN:9781450339476

DOI:10.1145/2892242

General Chair:
Vishakha Gupta-Cledat
Intel Labs
,
Program Chairs:
Donald E. Porter
Stony Brook University
,
Vivek Sarkar
Rice University

ACM SIGPLAN Notices Volume 51, Issue 7
VEE '16
July 2016
167 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/3007611
Editor:
Matthew Fluet

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

VEE '16

VEE '16: 12th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments

April 2 - 3, 2016

Atlanta, Georgia, USA

Acceptance Rates

VEE '16 Paper Acceptance Rate 10 of 29 submissions, 34%;

Overall Acceptance Rate 80 of 235 submissions, 34%
