Abstract:
HDFS is designed for storing large files, but it suffers a performance penalty when storing a large number of small files: the space occupied by their metadata causes high memory consumption on the NameNode, and file reading becomes inefficient. Many approaches have been proposed to solve the small file problem. In this paper, we place additional hardware, named SFS (Small File Server), between users and HDFS to solve the small file problem. The proposed approach includes a file merging algorithm based on temporal continuity, an index structure for retrieving small files, and a prefetching mechanism to improve the performance of file reading and writing. The experimental results show that the proposed approach efficiently optimizes small-file storage in HDFS, reducing the load on the NameNode and improving file access performance.
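The abstract does not give the details of the merging algorithm or the index structure, so the following is only a minimal sketch of one way the idea could work, not the authors' implementation. It assumes that "temporal continuity" means small files written close together in time are appended to the same merged file, and that the index maps each file name to a (merged file id, offset, length) triple. The class name `SmallFileMerger` and the parameters `window_seconds` and `max_bytes` are hypothetical, and merged files are held in memory rather than written to HDFS.

```python
class SmallFileMerger:
    """Illustrative only: groups small files by arrival time and indexes them."""

    def __init__(self, window_seconds=5.0, max_bytes=64 * 1024 * 1024):
        # Assumed parameters: maximum gap between consecutive writes that
        # still counts as "continuous", and a size cap per merged file.
        self.window_seconds = window_seconds
        self.max_bytes = max_bytes
        self.merged = []   # each entry stands in for one merged file
        self.index = {}    # name -> (merged_id, offset, length)
        self._last_time = None

    def write(self, name, payload, timestamp):
        # Start a new merged file if there is none yet, if the time gap is
        # too large, or if the current merged file would exceed the cap.
        start_new = (
            not self.merged
            or timestamp - self._last_time > self.window_seconds
            or len(self.merged[-1]) + len(payload) > self.max_bytes
        )
        if start_new:
            self.merged.append(bytearray())
        blob = self.merged[-1]
        # Record where the small file lives inside the merged file.
        self.index[name] = (len(self.merged) - 1, len(blob), len(payload))
        blob.extend(payload)
        self._last_time = timestamp

    def read(self, name):
        # Retrieve a small file through the index instead of a per-file
        # metadata entry.
        merged_id, offset, length = self.index[name]
        return bytes(self.merged[merged_id][offset:offset + length])
```

Under these assumptions, the NameNode would track one entry per merged file rather than one per small file, and files written together sit next to each other, which is the kind of layout a prefetching mechanism could exploit.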
Date of Conference: 05-07 October 2016
Date Added to IEEE Xplore: 10 November 2016