Abstract
Existing local file systems are designed around a single-file access mode and can therefore perform poorly when accessing a batch of files, especially small files. This single-file mode essentially serializes accesses to batched files one by one, resulting in a large number of non-sequential, random, and often dependent I/Os between file data and metadata at the storage end. This access mode further degrades the efficiency and performance of applications that access massive numbers of files, such as data migration. We first experimentally analyze the root cause of this inefficiency in batch-file accesses. We then propose a novel batch-file access approach, referred to as BFO for its set of optimized Batch-File Operations, which provides BFOr and BFOw operations for the fundamental read and write processes, respectively, each using a two-phase access that handles metadata and data jointly. BFO offers dedicated interfaces for batch-file accesses, and its additional processes are integrated into existing file systems without modifying their structures and procedures. In addition, based on BFOr and BFOw, we propose a batch-file migration operation, BFOm, to accelerate data migration for massive small files. We implement a BFO prototype on ext4, one of the most popular file systems. Our evaluation results show that the batch-file read and write performance of BFO is consistently higher than that of traditional approaches regardless of access patterns, data layouts, and storage media, under both synthetic and real-world file sets. BFO improves read performance by up to 22.4× and 1.8× on HDD and SSD, respectively, and boosts write performance by up to 111.4× and 2.9× on HDD and SSD, respectively. BFO also demonstrates consistent performance advantages for data migration in both local and remote situations.
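To make the two-phase idea concrete, the following is a minimal user-space sketch in Python, not the BFO implementation itself (the abstract describes BFO as integrated into ext4). The function name `batch_read` is hypothetical, and ordering files by inode number is an assumed heuristic standing in for layout-aware scheduling; inode order is not guaranteed to match on-disk data placement.

```python
import os


def batch_read(paths):
    """Hypothetical helper: read a batch of files in two phases.

    Phase 1 touches only metadata; phase 2 reads file data in an
    order derived from that metadata. This only approximates the
    idea from user space and is not the BFOr operation.
    """
    # Phase 1: collect metadata for the whole batch before reading any data.
    entries = []
    for path in paths:
        st = os.stat(path)
        entries.append((st.st_dev, st.st_ino, path))

    # Assumption: sorting by (device, inode number) roughly groups
    # files that were allocated near each other.
    entries.sort()

    # Phase 2: read the data of every file in the sorted order.
    contents = {}
    for _dev, _ino, path in entries:
        with open(path, "rb") as f:
            contents[path] = f.read()
    return contents
```

Calling `batch_read(list_of_paths)` returns a dictionary mapping each path to its bytes. Separating the metadata pass from the data pass is what avoids interleaving metadata and data I/Os per file, which the abstract identifies as the source of inefficiency in the single-file mode.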
Index Terms
- Batch-file Operations to Optimize Massive Files Accessing: Analysis, Design, and Application