On GPU’s viability as a middleware accelerator

Al-Kiswany, Samer; Gharaibeh, Abdullah; Santos-Neto, Elizeu; Ripeanu, Matei

doi:10.1007/s10586-009-0076-0

Published: 17 January 2009

Volume 12, pages 123–140, (2009)

Cluster Computing

Abstract

Today, graphics processing units (GPUs) are a largely underexploited resource on existing desktops and a possible cost-effective enhancement to high-performance systems. To date, most applications that exploit GPUs are specialized scientific applications. Little attention has been paid to harnessing these highly parallel devices to support more generic functionality at the operating system or middleware level. This study starts from the hypothesis that generic middleware-level techniques that improve distributed system reliability or performance (such as content addressing, erasure coding, or data similarity detection) can be significantly accelerated using GPU support.

We take a first step towards validating this hypothesis by designing StoreGPU, a library that accelerates a number of hashing-based middleware primitives popular in distributed storage system implementations. Our evaluation shows that StoreGPU enables up to twenty-five-fold performance gains on synthetic benchmarks as well as on a high-level application: the online similarity detection between large data files.
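
To make the idea of a hashing-based primitive concrete, the following is a minimal CPU-side sketch in Python of block hashing and block-level similarity detection. It is not StoreGPU's API or algorithm: the function names `block_hashes` and `similarity`, the fixed 4096-byte block size, and the choice of SHA-1 are all assumptions made for illustration. The point it shows is that each block is hashed independently of the others, which is the kind of data-parallel work that could plausibly be offloaded to a GPU.

```python
import hashlib


def block_hashes(data: bytes, block_size: int = 4096) -> list[str]:
    """Split data into fixed-size blocks and hash each block independently.

    Hypothetical helper: block size and SHA-1 are illustrative assumptions.
    """
    return [
        hashlib.sha1(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def similarity(a: bytes, b: bytes, block_size: int = 4096) -> float:
    """Return the fraction of distinct block hashes of `a` that also occur in `b`."""
    hashes_a = set(block_hashes(a, block_size))
    hashes_b = set(block_hashes(b, block_size))
    if not hashes_a:
        return 0.0  # empty input has no blocks to compare
    return len(hashes_a & hashes_b) / len(hashes_a)
```

In this sketch, two files that share one of two distinct blocks would score 0.5. Because no block's hash depends on any other block, the list comprehension in `block_hashes` could in principle be replaced by a parallel map, whether across CPU threads or GPU threads.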

Author information

Authors and Affiliations

Electrical and Computer Engineering Department, The University of British Columbia, Vancouver, BC, Canada, V6T 1Z4
Samer Al-Kiswany, Abdullah Gharaibeh, Elizeu Santos-Neto & Matei Ripeanu

Corresponding author

Correspondence to Samer Al-Kiswany.

About this article

Cite this article

Al-Kiswany, S., Gharaibeh, A., Santos-Neto, E. et al. On GPU’s viability as a middleware accelerator. Cluster Comput 12, 123–140 (2009). https://doi.org/10.1007/s10586-009-0076-0

Received: 01 January 2009
Accepted: 05 January 2009
Published: 17 January 2009
Issue Date: June 2009
DOI: https://doi.org/10.1007/s10586-009-0076-0