research-article

GHOST: GPGPU-offloaded high performance storage I/O deduplication for primary storage system

Authors:
Chulmin Kim (KAIST, Daejeon, South Korea)
Ki-Woong Park (KAIST, Daejeon, South Korea)
Kyu Ho Park (KAIST, Daejeon, South Korea)

PMAM '12: Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores, February 2012, Pages 17–26
https://doi.org/10.1145/2141702.2141705

Published: 26 February 2012

ABSTRACT

Data deduplication has been an effective way to eliminate redundant data, mainly in backup storage systems. Because primary storage systems in recent cloud services are also expected to contain redundant data, deduplication can bring significant cost savings to primary storage as well. However, primary storage demands high throughput, on the order of several GB/s, whereas most conventional deduplication techniques have targeted a throughput of 200-300 MB/s.

To build a high-performance deduplication system for primary storage, we thoroughly analyze the performance bottlenecks of previous deduplication systems. Owing to recent improvements in flash devices such as SSDs, the bottleneck of deduplication in primary storage lies not only in key-value store lookup but also in the computation for data segmentation and fingerprinting. To overcome these bottlenecks, we propose a new deduplication system that utilizes GPGPU. The proposed system, termed GHOST, includes the following techniques to offload deduplication processing to the GPGPU and optimize it there: (1) an In-Host Data Cache, (2) Destage-aware Data Offloading to GPGPU, and (3) an In-GPGPU Table Cache of the key-value store. These techniques improve the offloaded deduplication processing by about 10-20% over the naive approach on a reasonable primary storage workload. The proposed system achieves a maximum throughput of 1.5 GB/s, about 5 times that of CPU-only deduplication systems.
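The three stages the abstract names as bottlenecks (data segmentation, fingerprinting, and key-value store lookup) can be illustrated with a minimal CPU-only sketch. This is not the GHOST implementation: the fixed 4 KiB chunk size, the use of SHA-1 as the fingerprint, the in-memory `dict` standing in for the key-value store, and all function names are assumptions made for illustration only.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size; the abstract does not specify one


def split_fixed(data, chunk_size=CHUNK_SIZE):
    """Segmentation: cut the byte stream into fixed-size chunks (assumed scheme)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def fingerprint(chunk):
    """Fingerprinting: SHA-1 digest of the chunk (assumed hash function)."""
    return hashlib.sha1(chunk).digest()


def deduplicate(data, index, store):
    """Store only chunks whose fingerprint is not yet in the index.

    index: dict mapping fingerprint -> position in store (stand-in key-value store)
    store: list holding the unique chunks
    Returns the list of fingerprints describing the data and the number of new chunks.
    """
    recipe = []
    new_chunks = 0
    for chunk in split_fixed(data):
        fp = fingerprint(chunk)
        if fp not in index:          # key-value store lookup
            index[fp] = len(store)   # first occurrence: remember where it lives
            store.append(chunk)
            new_chunks += 1
        recipe.append(fp)
    return recipe, new_chunks


def restore(recipe, index, store):
    """Rebuild the original data from its fingerprint list."""
    return b"".join(store[index[fp]] for fp in recipe)


if __name__ == "__main__":
    index, store = {}, []
    data = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE + b"A" * CHUNK_SIZE
    recipe, new_chunks = deduplicate(data, index, store)
    print(len(recipe), "chunks written,", new_chunks, "stored")
    assert restore(recipe, index, store) == data
```

In this sketch the hashing in `fingerprint` and the membership test on `index` correspond to the fingerprinting and key-value lookup steps, which the abstract says GHOST offloads to the GPGPU and caches there.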

Index Terms

1. Information systems
  1. Information retrieval
  2. Information storage systems

Published in
PMAM '12: Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores
February 2012
180 pages
ISBN: 9781450312110
DOI: 10.1145/2141702
Conference Chairs: Minyi Guo (Shanghai Jiao Tong University, China), Zhiyi Huang (University of Otago, New Zealand)
Copyright © 2012 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
GPGPU
deduplication
primary storage
Acceptance Rates
Overall Acceptance Rate: 53 of 97 submissions, 55%