ABSTRACT
Data deduplication has been an effective way to eliminate redundant data, mainly in backup storage systems. Because primary storage systems in cloud services are also expected to contain redundant data, deduplication can bring significant cost savings to primary storage as well. However, primary storage requires throughput of several GB/s, whereas most conventional deduplication techniques target only 200-300 MB/s.
In an attempt to achieve a high-performance deduplication system for primary storage, we thoroughly analyze the performance bottlenecks of previous deduplication systems. Because of recent improvements in flash devices such as SSDs, the bottleneck of deduplication in primary storage lies not only in key-value store lookup but also in the computation for data segmentation and fingerprinting. To overcome these bottlenecks, we propose a new deduplication system that utilizes GPGPU. The proposed system, termed GHOST, includes the following techniques to offload deduplication processing to the GPGPU and optimize it there: (1) an In-Host Data Cache, (2) Destage-aware Data Offloading to GPGPU, and (3) an In-GPGPU Table Cache of the key-value store. These techniques improve the offloaded deduplication processing by about 10-20% over the naive approach on a reasonable primary storage workload. The proposed system achieves up to 1.5 GB/s, about 5 times the throughput of CPU-only deduplication systems.
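The three stages named above (data segmentation, fingerprinting, and key-value store lookup) can be illustrated with a minimal CPU-only sketch. It is not GHOST's implementation and involves no GPGPU offloading: the fixed-size chunking, the SHA-1 fingerprint, the in-memory dict standing in for the key-value store, and every name in the code (`CHUNK_SIZE`, `segment`, `fingerprint`, `deduplicate`, `restore`) are assumptions made for illustration only.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes (assumption)


def segment(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Data segmentation: split a buffer into fixed-size chunks.

    The last chunk may be shorter than chunk_size.
    """
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def fingerprint(chunk: bytes) -> bytes:
    """Fingerprinting: SHA-1 digest of a chunk (assumed hash choice)."""
    return hashlib.sha1(chunk).digest()


def deduplicate(data: bytes, index: dict, chunk_size: int = CHUNK_SIZE):
    """Store only chunks whose fingerprint is not yet in the index.

    `index` is a plain dict standing in for the key-value store
    (fingerprint -> chunk). Returns the list of fingerprints needed to
    rebuild `data` and the number of bytes newly stored.
    """
    recipe = []
    new_bytes = 0
    for chunk in segment(data, chunk_size):
        fp = fingerprint(chunk)
        if fp not in index:       # key-value store lookup
            index[fp] = chunk     # unseen chunk: store it once
            new_bytes += len(chunk)
        recipe.append(fp)
    return recipe, new_bytes


def restore(recipe: list, index: dict) -> bytes:
    """Rebuild the original buffer from its fingerprint list."""
    return b"".join(index[fp] for fp in recipe)


if __name__ == "__main__":
    # Three 4 KiB chunks, of which the first and third are identical.
    data = b"A" * 4096 + b"B" * 4096 + b"A" * 4096
    index = {}
    recipe, new_bytes = deduplicate(data, index)
    print(len(recipe), "chunks written,", new_bytes, "bytes stored")
    assert restore(recipe, index) == data
```

In this sketch each chunk costs one hash computation and one index lookup, which are the two kinds of work the abstract identifies as bottlenecks once the storage device itself is fast.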
Index Terms
- GHOST: GPGPU-offloaded high performance storage I/O deduplication for primary storage system