DOI: 10.1145/3229710.3229737
research-article

DVM: Scaling Out Virtual Memory in Userspace

Published: 13 August 2018

ABSTRACT

One of the most challenging problems in modern distributed big data systems is memory management: these systems preallocate a fixed amount of memory before applications start. In the best case, where more memory can be acquired, users must reconfigure the deployment and recompute many intermediate results. If no more memory is available, users are forced to manually partition the job into smaller tasks, incurring both development and performance overhead. This paper presents Distributed Virtual Memory (DVM), a user-level utility for scaling memory in a distributed setup. DVM efficiently swaps data between memory and disk across arbitrary nodes, without user intervention or application awareness.


  • Published in

    ICPP Workshops '18: Workshop Proceedings of the 47th International Conference on Parallel Processing
    August 2018
    409 pages
    ISBN:9781450365239
    DOI:10.1145/3229710

    Copyright © 2018 ACM

    © 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 91 of 313 submissions, 29%