Autoscaling tiered cloud storage in Anna

Wu, Chenggang; Sreekanti, Vikram; Hellerstein, Joseph M.

doi:10.1007/s00778-020-00632-7

Special Issue Paper
Published: 09 September 2020

Volume 30, pages 25–43 (2021)

The VLDB Journal

Abstract

In this paper, we describe how we extended a distributed key-value store called Anna into an autoscaling, multi-tier service for the cloud. In its extended form, Anna is designed to overcome the narrow cost–performance limitations typical of current cloud storage systems. We describe three key aspects of Anna’s new design: multi-master selective replication of hot keys, a vertical tiering of storage layers with different cost–performance trade-offs, and horizontal elasticity of each tier to add and remove nodes in response to load dynamics. Anna’s policy engine uses these mechanisms to balance service-level objectives around cost, latency, and fault tolerance. Experimental results explore the behavior of Anna’s mechanisms and policy, exhibiting orders of magnitude efficiency improvements over both commodity cloud KVS services and research systems.

Notes

Note that repartitioning overhead is not as high as in Sect. 7.4 because here we are using more machines and only add one new node, as opposed to four in that experiment.

Author information

Authors and Affiliations

University of California, Berkeley, 465 Soda Hall, Berkeley, 94720, CA, USA
Chenggang Wu, Vikram Sreekanti & Joseph M. Hellerstein

Corresponding author

Correspondence to Chenggang Wu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

We include pseudocode for the algorithms described in Sect. 5 here. Note that some algorithms included here rely on a latency objective, which may or may not be specified. When no latency objective is specified, Anna aspires to its unsaturated request latency (2.5 ms) to provide the best possible performance but caps spending at the specified budget.
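The defaulting rule in the paragraph above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's pseudocode: the names `PolicyGoal`, `resolve_goal`, and `within_budget` are hypothetical, and the budget is treated as a plain number because the text does not state its unit. Only the 2.5 ms unsaturated latency and the budget cap come from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Unsaturated request latency quoted in the text (milliseconds).
UNSATURATED_LATENCY_MS = 2.5


@dataclass(frozen=True)
class PolicyGoal:
    """Hypothetical container for the goals a policy engine would work toward."""
    latency_objective_ms: float
    budget_cap: float              # unit is an assumption; the text does not give one
    latency_was_specified: bool


def resolve_goal(budget_cap: float,
                 latency_objective_ms: Optional[float] = None) -> PolicyGoal:
    """Fall back to the unsaturated latency when no objective is given."""
    if budget_cap <= 0:
        raise ValueError("budget_cap must be positive")
    if latency_objective_ms is None:
        # No objective specified: aim for the best possible latency,
        # while the budget still limits spending.
        return PolicyGoal(UNSATURATED_LATENCY_MS, budget_cap, False)
    if latency_objective_ms <= 0:
        raise ValueError("latency_objective_ms must be positive")
    return PolicyGoal(latency_objective_ms, budget_cap, True)


def within_budget(goal: PolicyGoal, current_cost: float, extra_cost: float) -> bool:
    """Return True if spending extra_cost keeps the total at or under the cap."""
    return current_cost + extra_cost <= goal.budget_cap
```

For example, `resolve_goal(10.0)` yields a goal with a 2.5 ms latency target, and `within_budget` then rejects any scaling action that would push total cost above 10.0.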

About this article

Cite this article

Wu, C., Sreekanti, V. & Hellerstein, J.M. Autoscaling tiered cloud storage in Anna. The VLDB Journal 30, 25–43 (2021). https://doi.org/10.1007/s00778-020-00632-7

Received: 01 February 2020
Revised: 17 August 2020
Accepted: 26 August 2020
Published: 09 September 2020
Issue Date: January 2021
DOI: https://doi.org/10.1007/s00778-020-00632-7
