ABSTRACT
In this work, we investigate techniques to improve the performance of big data analytics in virtualized clusters by effectively increasing the utilization of cached data and efficiently using scarce memory resources.
Index Terms
Exploring memory locality for big data analytics in virtualized clusters