DOI: 10.1145/2465351.2465373

RadixVM: scalable address spaces for multithreaded applications

Published: 15 April 2013

Abstract

RadixVM is a new virtual memory system design that enables fully concurrent operations on shared address spaces for multithreaded processes on cache-coherent multicore computers. Today, most operating systems serialize operations such as mmap and munmap, which forces application developers to split their multithreaded applications into multiprocess applications, hoard memory to avoid the overhead of returning it, and so on. RadixVM removes this burden from application developers by ensuring that address space operations on non-overlapping memory regions scale perfectly. It does so by combining three techniques: 1) it organizes metadata in a radix tree instead of a balanced tree to avoid unnecessary cache line movement; 2) it uses a novel memory-efficient distributed reference counting scheme; and 3) it uses a new scheme to target remote TLB shootdowns and to often avoid them altogether. Experiments on an 80-core machine show that RadixVM achieves perfect scalability for non-overlapping regions: if several threads mmap or munmap pages in parallel, they can run completely independently and induce no cache coherence traffic.
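
To make the first technique concrete, the following is a minimal single-threaded sketch of a radix tree keyed by virtual page number. It is not RadixVM's implementation: the fan-out (`BITS`), depth (`LEVELS`), the `RadixTree` class, and its method names are all assumptions chosen for illustration. It only shows the structural property the abstract relies on: mappings in different leaf-sized regions live in different leaf nodes, and there is no rebalancing step that could touch nodes belonging to an unrelated region.

```python
BITS = 9                  # assumed: 512 slots per node
LEVELS = 4                # assumed: 4 levels, i.e. 36-bit page numbers
MASK = (1 << BITS) - 1


class RadixTree:
    """Hypothetical page-metadata index; nodes are plain dicts."""

    def __init__(self):
        self.root = {}

    def _leaf(self, vpn, create):
        # Walk the interior levels; the path depends only on the bits of vpn.
        node = self.root
        for level in range(LEVELS - 1, 0, -1):
            idx = (vpn >> (level * BITS)) & MASK
            child = node.get(idx)
            if child is None:
                if not create:
                    return None
                child = node[idx] = {}
            node = child
        return node

    def map_range(self, start, npages, meta):
        # Store metadata per page; return the identities of the leaves written.
        touched = set()
        for vpn in range(start, start + npages):
            leaf = self._leaf(vpn, create=True)
            leaf[vpn & MASK] = meta
            touched.add(id(leaf))
        return touched

    def unmap_range(self, start, npages):
        # Clear metadata; interior nodes are left in place in this sketch.
        for vpn in range(start, start + npages):
            leaf = self._leaf(vpn, create=False)
            if leaf is not None:
                leaf.pop(vpn & MASK, None)

    def lookup(self, vpn):
        leaf = self._leaf(vpn, create=False)
        return None if leaf is None else leaf.get(vpn & MASK)


if __name__ == "__main__":
    tree = RadixTree()
    a = tree.map_range(0, 512, "region A")
    b = tree.map_range(512, 512, "region B")
    print("leaf nodes shared by A and B:", len(a & b))
```

In this sketch two ranges that fall within the same 512-page leaf would still share that leaf node, and the interior nodes near the root are shared by construction. The sketch also omits locking, reference counting, and TLB shootdowns, which the abstract lists as separate techniques.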


Published In

EuroSys '13: Proceedings of the 8th ACM European Conference on Computer Systems
April 2013
401 pages
ISBN: 9781450319942
DOI: 10.1145/2465351
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

EuroSys '13
EuroSys '13: Eighth Eurosys Conference 2013
April 15 - 17, 2013
Prague, Czech Republic

Acceptance Rates

EuroSys '13 Paper Acceptance Rate 28 of 143 submissions, 20%;
Overall Acceptance Rate 241 of 1,308 submissions, 18%

Cited By

  • (2023) Virtual-Memory Assisted Buffer Management. Proceedings of the ACM on Management of Data 1:1 (1-25). DOI: 10.1145/3588687. Online publication date: 30-May-2023
  • (2023) Evaluation and Refinement of an Explicit Virtual-Memory Primitive. IEEE Access 11 (136855-136868). DOI: 10.1109/ACCESS.2023.3338149. Online publication date: 2023
  • (2023) CredsCache: Making OverlayFS scalable for containerized services. Future Generation Computer Systems 147 (44-58). DOI: 10.1016/j.future.2023.04.027. Online publication date: Oct-2023
  • (2022) DaxVM: Stressing the Limits of Memory as a File Interface. Proceedings of the 55th Annual IEEE/ACM International Symposium on Microarchitecture (369-387). DOI: 10.1109/MICRO56248.2022.00037. Online publication date: 1-Oct-2022
  • (2021) On-demand-fork. Proceedings of the Sixteenth European Conference on Computer Systems (540-555). DOI: 10.1145/3447786.3456258. Online publication date: 21-Apr-2021
  • (2021) Memory-mapped I/O on steroids. Proceedings of the Sixteenth European Conference on Computer Systems (277-293). DOI: 10.1145/3447786.3456242. Online publication date: 21-Apr-2021
  • (2021) Fast local page-tables for virtualized NUMA servers with vMitosis. Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (194-210). DOI: 10.1145/3445814.3446709. Online publication date: 19-Apr-2021
  • (2021) Zweilous: A Decoupled and Flexible Memory Management Framework. IEEE Transactions on Computers 70:9 (1350-1362). DOI: 10.1109/TC.2020.3009124. Online publication date: 1-Sep-2021
  • (2021) A survey of operating system support for persistent memory. Frontiers of Computer Science: Selected Publications from Chinese Universities 15:4. DOI: 10.1007/s11704-020-9395-3. Online publication date: 1-Aug-2021
  • (2021) montage: NVM-based scalable synchronization framework for crash-consistent file systems. Cluster Computing. DOI: 10.1007/s10586-021-03329-w. Online publication date: 30-Jun-2021
