Mosaic Pages: Big TLB Reach with Small Pages

Authors:
Krishnan Gosakan (Rutgers University, USA)
Jaehyun Han (University of North Carolina at Chapel Hill, USA)
William Kuszmaul (Massachusetts Institute of Technology, USA)
Ibrahim N. Mubarek (Carnegie Mellon University, USA)
Nirjhar Mukherjee (Carnegie Mellon University, USA)
Karthik Sriram (Yale University, USA)
Guido Tagliavini (Rutgers University, USA)
Evan West (Stony Brook University, USA)
Michael A. Bender (Stony Brook University, USA)
Abhishek Bhattacharjee (Yale University, USA)
Alex Conway (VMware Research, USA)
Martin Farach-Colton (Rutgers University, USA)
Jayneel Gandhi (Meta, USA)
Rob Johnson (VMware Research, USA)
Sudarsun Kannan (Rutgers University, USA)
Donald E. Porter (University of North Carolina at Chapel Hill, USA)

ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, March 2023, Pages 433–448, https://doi.org/10.1145/3582016.3582021

Published: 25 March 2023

Related Artifact: Mosaic Pages: Big TLB Reach with Small Pages, March 2023, software, https://doi.org/10.5281/zenodo.7709303

ABSTRACT

The TLB is increasingly a bottleneck for big data applications. In most designs, the number of TLB entries is highly constrained by latency requirements and is growing much more slowly than the working sets of applications. Many solutions to this problem, such as huge pages, perforated pages, or TLB coalescing, rely on physical contiguity for performance gains, yet the cost of defragmenting memory can easily nullify these gains. This paper introduces mosaic pages, which increase TLB reach by compressing multiple, discrete translations into one TLB entry. Mosaic leverages virtual contiguity for locality, but does not use physical contiguity. Mosaic relies on recent advances in hashing theory to constrain memory mappings, in order to realize this physical address compression without reducing memory utilization or increasing swapping. This paper presents a full-system prototype of Mosaic, in gem5 and modified Linux. In simulation and with comparable hardware to a traditional design, Mosaic reduces TLB misses in several workloads by 6-81%. Our results show that Mosaic’s constraints on memory mappings do not harm performance: we never see conflicts before memory is 98% full in our experiments, at which point a traditional design would also likely swap. Once memory is over-committed, Mosaic swaps fewer pages than Linux in most cases. Finally, we present timing and area analysis for a Verilog implementation of the hashing function required on the critical path for the TLB, and show that, on a commercial 28nm CMOS process, the circuit runs at a maximum frequency of 4 GHz, indicating that a mosaic TLB is unlikely to affect clock frequency.
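
The idea of compressing several translations into one TLB entry by constraining where a page may live can be illustrated with a toy model. The sketch below is not the paper's design: the hash function (BLAKE2b), the constants, and the names candidate_frames, allocate, and translate are all assumptions chosen for illustration. It only shows the general principle that, if hashing limits each virtual page to a few candidate frames, an entry can store a small candidate index per page instead of a full frame number, and the pages sharing an entry need not be physically contiguous.

```python
import hashlib

# All constants are illustrative assumptions, not values from the paper.
NUM_FRAMES = 1024       # physical frames in the toy machine
CANDIDATES = 8          # candidate frames per virtual page (index fits in 3 bits)
PAGES_PER_ENTRY = 4     # virtually contiguous pages sharing one TLB entry


def candidate_frames(asid, vpn):
    """Return the frames that virtual page (asid, vpn) is allowed to occupy."""
    frames = []
    for i in range(CANDIDATES):
        key = f"{asid}:{vpn}:{i}".encode()
        digest = hashlib.blake2b(key, digest_size=8).digest()
        frames.append(int.from_bytes(digest, "little") % NUM_FRAMES)
    return frames


def allocate(asid, vpn, used):
    """Claim the first free candidate frame and return its small index.

    Returns None when every candidate is taken (a conflict in this toy model).
    """
    for idx, frame in enumerate(candidate_frames(asid, vpn)):
        if frame not in used:
            used.add(frame)
            return idx
    return None


def translate(asid, vpn, entry):
    """Recover the physical frame from a compressed entry.

    An entry is (base_vpn, indexes), with one candidate index per page.
    """
    base_vpn, indexes = entry
    return candidate_frames(asid, vpn)[indexes[vpn - base_vpn]]


if __name__ == "__main__":
    used = set()
    base = 100
    entry = (base, [allocate(1, base + i, used) for i in range(PAGES_PER_ENTRY)])
    for i in range(PAGES_PER_ENTRY):
        print(base + i, "->", translate(1, base + i, entry))
```

In this toy configuration a full frame number needs 10 bits (1024 frames), while each stored index needs only 3 bits, so one entry covering four pages stores 12 bits of placement information rather than 40. The frames printed for the four pages are whatever the hash selects, which is why no physical contiguity is required.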

Index Terms

Mosaic Pages: Big TLB Reach with Small Pages
1. Computer systems organization
  1. Architectures
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Memory management
        Main memory
        Virtual memory

Published in
ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3
March 2023
820 pages
ISBN:9781450399180
DOI:10.1145/3582016
General Chair: Tor M. Aamodt (University of British Columbia, Canada)
Program Chairs: Natalie Enright Jerger (University of Toronto, Canada), Michael Swift (University of Wisconsin-Madison, USA)
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
TLB
address translation
hashing
paging
virtual memory
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 535 of 2,713 submissions, 20%