DOI: 10.1145/3267809.3267827
research-article
Best Paper

Netco: Cache and I/O Management for Analytics over Disaggregated Stores

Published: 11 October 2018

ABSTRACT

We consider a common setting where storage is disaggregated from the compute in data-parallel systems. Colocating caching tiers with the compute machines can reduce load on the interconnect, but doing so leads to new resource management challenges. We design a system, Netco, which prefetches data into the cache (based on workload predictability) and appropriately divides the cache space and network bandwidth between the prefetches and serving ongoing jobs. Netco makes various decisions (what content to cache, when to cache, and how to apportion bandwidth) to support end-to-end optimization goals such as maximizing the number of jobs that meet their service-level objectives (e.g., deadlines). Our implementation of these ideas is available within the open-source Apache HDFS project. Experiments on a public cloud, with production-trace inspired workloads, show that Netco uses up to 5x less remote I/O compared to existing techniques and increases the number of jobs that meet their deadlines by up to 80%.


Published in

SoCC '18: Proceedings of the ACM Symposium on Cloud Computing
October 2018
546 pages
ISBN: 9781450360111
DOI: 10.1145/3267809

Copyright © 2018 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

Acceptance Rates

Overall Acceptance Rate: 169 of 722 submissions, 23%
