Abstract
In most cloud storage systems, fault tolerance is designed at the scale of a single datacenter. However, an entire datacenter may crash or become unreachable because of severe problems such as broken network links, power supply interruptions, and natural disasters. An effective cross-datacenter fault-tolerant storage system is therefore important for protecting data in the cloud. Building such a system is challenging because of the high latency, low throughput, and high bandwidth costs between datacenters. In this paper, we propose a practical cross-datacenter fault-tolerant (CDFT) algorithm for cloud storage systems. The design considers the difficult tradeoffs among fault tolerance, latency, throughput, and network and storage costs. We propose Domain Fault Codes (DFC) and topology-aware scheduling techniques, which together tolerate the breakdown of a whole datacenter. We implemented the DFC-CDFT algorithm in a prototype cloud storage system. The experimental results show that the proposed DFC-CDFT algorithm can effectively recover data blocks after a single-datacenter failure while achieving low storage and bandwidth costs.
Cite this article
Cheng, Y., Yu, X., Chen, W. et al. A practical cross-datacenter fault-tolerance algorithm in the cloud storage system. Cluster Comput 20, 1801–1813 (2017). https://doi.org/10.1007/s10586-017-0840-5
DOI: https://doi.org/10.1007/s10586-017-0840-5