research-article

Performance Implications of Extended Page Tables on Virtualized x86 Processors

Published: 11 September 2017

Abstract

Managing virtual memory is an expensive operation, and it becomes even more expensive on virtualized servers. Processing TLB misses on a virtualized x86 server requires a two-dimensional page walk that can have 6x more page table lookups, and hence 6x more memory references, than a native page table walk. Thus, much of the recent research on the subject starts from the assumption that TLB miss processing in virtual environments is significantly more expensive than on native servers. However, we show that with the latest software stack on modern x86 processors, most of these page table lookups are satisfied by internal paging structure caches and the L1/L2 data caches, and the actual virtualization overhead of TLB miss processing is a modest fraction of the overall time spent processing TLB misses.
We show that even for the heaviest workloads, a well-tuned application that uses large pages on a recent OS release with a modern hypervisor running on the latest x86 processors sees only minimal degradation from the additional overhead of the two-dimensional page walks in a virtualized server.
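
The 6x figure quoted above can be reproduced with a short worst-case count. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and it assumes that no TLB, paging structure cache, or data cache hit shortens the walk. It also assumes x86-64 4-level paging, where a 4 KB mapping needs 4 table lookups and a 2 MB mapping needs 3.

```python
def native_walk_refs(levels: int) -> int:
    """Memory references for a native walk: one lookup per page table level."""
    return levels


def nested_walk_refs(guest_levels: int, host_levels: int) -> int:
    """Worst-case memory references for a two-dimensional (nested) page walk.

    The guest root pointer and the pointer produced by each guest level are
    guest-physical addresses, so each one needs a full host walk before it
    can be used. That is (guest_levels + 1) host walks, plus the guest table
    reads themselves.
    """
    host_refs = (guest_levels + 1) * host_levels
    guest_refs = guest_levels
    return host_refs + guest_refs


if __name__ == "__main__":
    # Assumed depths: 4 levels for 4 KB pages, 3 levels for 2 MB pages.
    for label, levels in (("4 KB pages", 4), ("2 MB pages", 3)):
        native = native_walk_refs(levels)
        nested = nested_walk_refs(levels, levels)
        print(f"{label}: native={native}, nested={nested}, "
              f"ratio={nested / native:.1f}x")
```

With 4 KB pages in both guest and host, the count is 24 references against 4 for a native walk, which is the 6x ratio. The count is an upper bound on lookups, not a measure of latency: the abstract's claim is that most of these lookups hit in the paging structure caches or the L1/L2 data caches.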

Published In

ACM SIGOPS Operating Systems Review, Volume 51, Issue 1
Special Topics
August 2017
123 pages
ISSN: 0163-5980
DOI: 10.1145/3139645

Publisher

Association for Computing Machinery
New York, NY, United States