ABSTRACT
We present several key elements towards elastic memory management in modern big data systems. The goal of our approach is to avoid out-of-memory failures without over-provisioning memory, and to avoid garbage-collection overheads when possible.
Toward elastic memory management for cloud data analytics