DOI: 10.1145/2141702.2141705
research-article

GHOST: GPGPU-offloaded high performance storage I/O deduplication for primary storage system

Published: 26 February 2012

ABSTRACT

Data deduplication has been an effective way to eliminate redundant data, mainly in backup storage systems. Because primary storage systems in recent cloud services are also expected to contain redundant data, deduplication can bring significant cost savings to primary storage as well. However, primary storage requires high throughput, on the order of several GB/s, whereas most conventional deduplication techniques target a throughput of 200-300 MB/s.

In an attempt to achieve high-performance deduplication for primary storage, we thoroughly analyze the performance bottlenecks of previous deduplication systems so that they can be enhanced to meet the requirements of primary storage. Because of recent improvements in flash devices such as SSDs, the performance bottleneck of deduplication in primary storage lies not only in key-value store lookup but also in the computation for data segmentation and fingerprinting. To overcome these bottlenecks, we propose a new deduplication system that utilizes a GPGPU. The proposed system, termed GHOST, includes the following techniques to offload and optimize deduplication processing on the GPGPU: (1) an In-Host Data Cache, (2) Destage-aware Data Offloading to the GPGPU, and (3) an In-GPGPU Table Cache of the key-value store. These techniques improve offloaded deduplication processing by about 10-20% on a reasonable primary-storage workload compared to the naive approach. The proposed system achieves a maximum throughput of 1.5 GB/s, about 5 times that of CPU-only deduplication systems.
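The three stages named in the abstract (data segmentation, fingerprinting, and key-value store lookup) can be illustrated with a minimal CPU-only sketch. This is not the GHOST implementation: the fixed 4 KB segment size, the use of SHA-1 as the fingerprint, the in-memory dictionary standing in for the key-value store, and the names `CHUNK_SIZE`, `deduplicate`, and `restore` are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4096  # assumed fixed segment size in bytes (hypothetical)


def deduplicate(data: bytes, index: Dict[bytes, int], store: List[bytes]) -> List[int]:
    """Split data into fixed-size chunks and store only chunks not seen before.

    Returns a recipe: one position in `store` per chunk, in original order.
    """
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]     # data segmentation
        fingerprint = hashlib.sha1(chunk).digest()   # fingerprinting
        position = index.get(fingerprint)            # key-value store lookup
        if position is None:
            # New content: keep the chunk and remember its fingerprint.
            position = len(store)
            store.append(chunk)
            index[fingerprint] = position
        recipe.append(position)
    return recipe


def restore(recipe: List[int], store: List[bytes]) -> bytes:
    """Rebuild the original data from the recipe and the chunk store."""
    return b"".join(store[position] for position in recipe)


if __name__ == "__main__":
    index: Dict[bytes, int] = {}
    store: List[bytes] = []
    data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
    recipe = deduplicate(data, index, store)
    print(recipe, len(store))  # four chunks in, two unique chunks stored
```

In this sketch the hashing loop and the index lookup both run on the CPU for every chunk, which are the two costs the abstract identifies as bottlenecks once the storage device itself is fast.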


        Published in

        PMAM '12: Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores
        February 2012
        180 pages
        ISBN: 9781450312110
        DOI: 10.1145/2141702
        Conference Chairs: Minyi Guo, Zhiyi Huang

        Copyright © 2012 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 53 of 97 submissions, 55%
