Abstract
Cloud services are becoming centralized at several geo-replicated datacentres. These services replicate data within a single datacentre to tolerate isolated failures. Unfortunately, this does not protect against the effects of a disaster, because existing approaches migrate a copy of the data to backup datacentres only after the data have been stored at a primary datacentre. When a disaster strikes, all data not yet migrated can be lost.
In this paper, we propose and implement SDN-KVS, a disaster-tolerant key-value store, which provides strong disaster resilience by replicating data before storing it. To this end, SDN-KVS features a novel communication primitive, SDN-cast, that leverages Software Defined Networking (SDN) in two ways: it offers an SDN-multicast primitive to replicate critical update request flows and an SDN-anycast primitive to redirect request flows to the closest available datacentre. Our performance evaluation indicates that SDN-KVS ensures no data loss and that traffic gets redirected across long-distance key-value store replicas within 30 s after a datacentre outage.
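The two roles of SDN-cast described above can be illustrated with a toy, in-process model. This is only a sketch under stated assumptions and not the SDN-KVS implementation: the real primitives operate on request flows in the network, whereas here the "network" is a pair of plain functions. The `Datacentre` class, the function names, the site names, and the latency figures are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Datacentre:
    # Hypothetical model of one replica site; not part of SDN-KVS itself.
    name: str
    latency_ms: float          # assumed client-to-site latency
    available: bool = True
    store: dict = field(default_factory=dict)


def sdn_multicast_put(datacentres, key, value):
    """Toy multicast: copy the update to every available site before it counts as stored."""
    targets = [dc for dc in datacentres if dc.available]
    if not targets:
        raise RuntimeError("no datacentre available")
    for dc in targets:
        dc.store[key] = value
    return [dc.name for dc in targets]


def sdn_anycast_get(datacentres, key):
    """Toy anycast: send the read to the closest site that is still available."""
    targets = [dc for dc in datacentres if dc.available]
    if not targets:
        raise RuntimeError("no datacentre available")
    closest = min(targets, key=lambda dc: dc.latency_ms)
    return closest.name, closest.store.get(key)


# Two hypothetical sites; the first is closer to the client.
sydney = Datacentre("sydney", latency_ms=5.0)
melbourne = Datacentre("melbourne", latency_ms=20.0)
sites = [sydney, melbourne]

replicated_to = sdn_multicast_put(sites, "k", "v1")
before = sdn_anycast_get(sites, "k")   # served by the closest site
sydney.available = False               # simulate a datacentre outage
after = sdn_anycast_get(sites, "k")    # redirected to the remaining site
```

Because the update reaches both sites before the outage, the read after the outage still finds the value at the backup site. The model ignores everything that makes the real problem hard, such as detection delay, ordering of concurrent updates, and TCP connection state.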
NICTA is funded by the Australian Government through the Department of Communications and the Australian Research Council through the ICT Centre of Excellence Program.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Gramoli, V., Jourjon, G., Mehani, O. (2015). Disaster-Tolerant Storage with SDN. In: Bouajjani, A., Fauconnier, H. (eds) Networked Systems. NETYS 2015. Lecture Notes in Computer Science, vol 9466. Springer, Cham. https://doi.org/10.1007/978-3-319-26850-7_20
DOI: https://doi.org/10.1007/978-3-319-26850-7_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26849-1
Online ISBN: 978-3-319-26850-7
eBook Packages: Computer Science, Computer Science (R0)