ABSTRACT
One of the most challenging problems in modern distributed big data systems is memory management: these systems preallocate a fixed amount of memory before an application starts. In the best case, when more memory can be acquired, users must reconfigure the deployment and recompute many intermediate results. If no additional memory is available, users are forced to manually partition the job into smaller tasks, incurring both development and performance overhead. This paper presents a user-level utility for scaling memory in a distributed setup---the Distributed Virtual Memory (DVM). DVM efficiently swaps data between memory and disk across arbitrary nodes without user intervention or application awareness.
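The transparent spill-and-reload behavior described above can be illustrated with a minimal sketch. The class below (a hypothetical toy, not DVM's actual design, which operates at the virtual-memory level and across nodes) keeps values in memory up to a fixed budget and silently spills older entries to disk, reloading them on access so callers never see the difference:

```python
import os
import pickle
import tempfile

class SpillingStore:
    """Toy key-value store that transparently spills values to disk once an
    in-memory byte budget is exceeded. Illustrative sketch only; the real
    DVM swaps pages between memory and disk across cluster nodes."""

    def __init__(self, budget_bytes):
        self.budget = budget_bytes
        self.mem = {}    # key -> resident value
        self.disk = {}   # key -> path of spilled value
        self.used = 0
        self.dir = tempfile.mkdtemp(prefix="dvm-sketch-")

    def put(self, key, value):
        blob = pickle.dumps(value)
        # Evict resident entries until the new value fits the budget.
        while self.mem and self.used + len(blob) > self.budget:
            self._spill_one()
        self.mem[key] = value
        self.used += len(blob)

    def _spill_one(self):
        # Spill the oldest resident entry (FIFO order) to a file on disk.
        victim, value = next(iter(self.mem.items()))
        path = os.path.join(self.dir, f"{victim}.pkl")
        with open(path, "wb") as f:
            pickle.dump(value, f)
        self.disk[victim] = path
        self.used -= len(pickle.dumps(value))
        del self.mem[victim]

    def get(self, key):
        if key in self.mem:
            return self.mem[key]
        # "Page fault": transparently reload the spilled value from disk.
        with open(self.disk[key], "rb") as f:
            return pickle.load(f)
```

A caller simply issues `put` and `get`; whether a value is resident or spilled is invisible, mirroring the paper's goal of swapping without the application's awareness.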