Abstract
A striped RAID (Redundant Array of Independent Disks) consists of multiple disks and spreads data across them in parallel. With such an array, distributed file systems (DFSs) can easily improve the performance of a single read stream (i.e., a series of sequential reads by one process). However, most existing DFSs suffer performance degradation under concurrent read streams (i.e., multiple series of sequential reads issued by concurrent processes). Furthermore, the performance of concurrent read streams on striped RAIDs in DFSs has rarely been studied. In this paper, we define the problems that degrade this performance under different striped-RAID configurations and resolve them with two methods: (1) fair allocation of network bandwidth among concurrent read streams and (2) strip-aware prefetching for each individual read stream. We show that our proposal outperforms all the existing DFSs by at least a factor of two for all kinds and configurations of striped RAIDs. Moreover, the performance gap between our proposal and the existing DFSs widens as the number of striped disks increases.
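To make the notion of a strip concrete, the following minimal sketch shows how a logical byte offset maps onto disks in a plain RAID-0 style layout, and how a read range can be widened to whole strips. It is an illustration only, not the prefetching method proposed in this paper: the function names `locate` and `strip_aligned_window`, the parameter names, and the 64 KiB strip size in the usage note are all assumptions introduced here.

```python
from typing import NamedTuple, Tuple


class StripLocation(NamedTuple):
    disk: int         # index of the disk that holds the byte
    disk_offset: int  # byte offset within that disk


def locate(offset: int, strip_size: int, num_disks: int) -> StripLocation:
    """Map a logical byte offset to (disk, offset on disk) for a RAID-0 layout.

    Hypothetical helper: assumes strips are placed round-robin across disks.
    """
    strip_index = offset // strip_size       # which strip the byte falls in
    disk = strip_index % num_disks           # round-robin placement
    stripe_row = strip_index // num_disks    # how many full stripes precede it
    return StripLocation(disk, stripe_row * strip_size + offset % strip_size)


def strip_aligned_window(offset: int, length: int, strip_size: int) -> Tuple[int, int]:
    """Widen [offset, offset + length) to strip boundaries; returns (start, end)."""
    start = (offset // strip_size) * strip_size
    end = -(-(offset + length) // strip_size) * strip_size  # ceiling division
    return start, end
```

With an assumed strip size of 65536 bytes and four disks, `locate(65536, 65536, 4)` places the byte at the start of disk 1, and `strip_aligned_window(70000, 1000, 65536)` widens the request to the single strip that contains it. A read that stays within strip boundaries touches as few disks as possible, which is the general motivation for making prefetching aware of strips.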
Acknowledgements
This work was supported by Institute for Information & communications Technology Promotion (IITP) Grant funded by the Korea Government (MSIP) (No. R0126-15-1082, Management of Developing ICBMS (IoT, Cloud, Bigdata, Mobile, Security) Core Technologies and Development of Exascale Cloud Storage Technology).
Cite this article
Lee, S., Hyun, S.J., Kim, HY. et al. Fair bandwidth allocating and strip-aware prefetching for concurrent read streams and striped RAIDs in distributed file systems. J Supercomput 74, 3904–3932 (2018). https://doi.org/10.1007/s11227-018-2396-4
DOI: https://doi.org/10.1007/s11227-018-2396-4