Abstract
Data deduplication systems can be implemented in several modes, including the source-based approach (SBA), the in-line approach (ILA), and the post-process approach (PPA). Most commercial systems currently use ILA or PPA, while some researchers have focused on SBA. As data deduplication systems become widely used, choosing a mode appropriate to the operating environment is increasingly important. Because the overhead and resource usage of each mode have not been fully studied, a poorly matched deduplication mode can lead to inefficiency and poor performance in some operating environments. In this study, we propose a data deduplication system that supports multiple modes. The proposed system operates in whichever mode the user specifies during system operation, so it can be adjusted dynamically to suit system characteristics. We operate the proposed system in SBA, ILA, and PPA modes in turn and present measurement results with a comparative analysis of mode-specific performance and overhead.
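To make the distinction between the three modes concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not describe the chunking method, the fingerprint function, or any interface, so everything here is an assumption made for illustration: fixed-size chunks, SHA-1 fingerprints, and the hypothetical names `Mode`, `DedupServer`, `set_mode`, and `backup`. The sketch only counts chunk payload bytes sent to the server; fingerprint traffic is ignored.

```python
import hashlib
from enum import Enum


class Mode(Enum):
    SBA = "source-based"   # client fingerprints chunks and sends only new ones
    ILA = "in-line"        # server deduplicates while data is being written
    PPA = "post-process"   # server stores raw data first, deduplicates later


def chunks(data: bytes, size: int = 4):
    """Split data into fixed-size chunks (assumed; the size is arbitrary)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    """SHA-1 digest used as the chunk identifier (an assumption)."""
    return hashlib.sha1(chunk).hexdigest()


class DedupServer:
    """Hypothetical server whose mode can be changed while it is running."""

    def __init__(self, mode: Mode):
        self.mode = mode
        self.store = {}          # fingerprint -> unique chunk
        self.staging = []        # raw data awaiting post-processing (PPA)
        self.bytes_received = 0  # chunk payload bytes sent over the "network"

    def set_mode(self, mode: Mode):
        self.mode = mode

    def missing(self, fps):
        """Return the fingerprints the server does not hold yet (used by SBA)."""
        return [fp for fp in fps if fp not in self.store]

    def put_chunk(self, chunk: bytes):
        self.bytes_received += len(chunk)
        self.store.setdefault(fingerprint(chunk), chunk)

    def put_raw(self, data: bytes):
        self.bytes_received += len(data)
        if self.mode is Mode.PPA:
            self.staging.append(data)      # defer deduplication
        else:
            for c in chunks(data):         # deduplicate on the write path
                self.store.setdefault(fingerprint(c), c)

    def post_process(self):
        """Deduplicate everything that was staged while in PPA mode."""
        for data in self.staging:
            for c in chunks(data):
                self.store.setdefault(fingerprint(c), c)
        self.staging.clear()


def backup(server: DedupServer, data: bytes):
    """Client side: behaviour depends on the server's current mode."""
    if server.mode is Mode.SBA:
        by_fp = {fingerprint(c): c for c in chunks(data)}
        for fp in server.missing(list(by_fp)):
            server.put_chunk(by_fp[fp])
    else:
        server.put_raw(data)
```

In this toy model, SBA reduces the bytes transferred at the cost of fingerprinting on the client, ILA spends server time on the write path, and PPA accepts writes quickly but needs staging space until `post_process` runs. These are the kinds of mode-specific overheads the paper sets out to measure; the sketch itself produces no measurements.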
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jung, H.M., Park, W.V., Lee, W.Y., Lee, J.G., Ko, Y.W. (2011). Data Deduplication System for Supporting Multi-mode. In: Nguyen, N.T., Kim, CG., Janiak, A. (eds) Intelligent Information and Database Systems. ACIIDS 2011. Lecture Notes in Computer Science, vol 6591. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20039-7_8
DOI: https://doi.org/10.1007/978-3-642-20039-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20038-0
Online ISBN: 978-3-642-20039-7
eBook Packages: Computer Science (R0)