Abstract
In cloud storage, replication is essential to fault tolerance and high availability of data. Achieving high availability through replication, however, increases the number of active servers in the storage system, and the extra active servers raise both power consumption and capital expenditure. Furthermore, because data are not classified, the replication scheme is fixed from the very beginning. This paper proposes an elastic and efficient file storage system called E2FS for big data applications. E2FS can dynamically scale the storage system in and out according to the real-time demands of big data applications. We adopt a novel replication scheme based on data blocks, which provides fine-grained maintenance of the data in the storage system. E2FS analyzes features of the data and makes dynamic replication decisions to balance the cost and performance of cloud storage. To evaluate the proposed work, we implement a prototype of E2FS and compare it with HDFS. Our experiments show that E2FS can outperform HDFS in elasticity while achieving guaranteed performance for big data applications.
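The abstract does not spell out how the per-block replication decision is made. As a purely illustrative sketch, and not the authors' algorithm, the idea of choosing a replica count per data block from an observed feature could look like the following. The feature (reads per hour), the names `BlockStats`, `choose_replicas`, and `plan_replication`, and every threshold value are assumptions introduced here for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class BlockStats:
    """Hypothetical per-block feature record (not from the paper)."""
    block_id: str
    reads_per_hour: float


def choose_replicas(stats, min_replicas=1, max_replicas=5,
                    reads_per_replica=100.0):
    """Pick a replica count for one block.

    Assumption: one replica can serve `reads_per_replica` reads per hour,
    so hotter blocks get more replicas, clamped to [min_replicas, max_replicas].
    """
    needed = math.ceil(stats.reads_per_hour / reads_per_replica)
    return max(min_replicas, min(max_replicas, needed))


def plan_replication(blocks):
    """Map each block id to its chosen replica count."""
    return {b.block_id: choose_replicas(b) for b in blocks}


if __name__ == "__main__":
    blocks = [
        BlockStats("cold", 0.0),
        BlockStats("warm", 250.0),
        BlockStats("hot", 10_000.0),
    ]
    print(plan_replication(blocks))
```

Under these assumed thresholds, rarely read blocks keep a single replica, so the servers holding only surplus copies could be powered down, while frequently read blocks gain replicas up to the cap. A fixed replication factor applied uniformly to all files corresponds to setting `min_replicas` equal to `max_replicas`.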






Acknowledgments
This work is supported by NSF CNS-1457506 and NSF CNS-1359557.
Cite this article
Chen, L., Qiu, M., Song, J. et al. E2FS: an elastic storage system for cloud computing. J Supercomput 74, 1045–1060 (2018). https://doi.org/10.1007/s11227-016-1827-3
DOI: https://doi.org/10.1007/s11227-016-1827-3