research-article

Collaborative processing of data-intensive algorithms with CPU, intelligent SSD, and GPU

Authors:
Yong-Yeon Jo (Hanyang University, Korea),
SungWoo Cho (Hanyang University, Korea),
Sang-Wook Kim (Hanyang University, Korea),
Hyunok Oh (Hanyang University, Korea)

SAC '16: Proceedings of the 31st Annual ACM Symposium on Applied Computing, April 2016, Pages 1865–1870
https://doi.org/10.1145/2851613.2851741

Published: 4 April 2016

ABSTRACT

The graphics processing unit (GPU) is a computing resource for processing graphics-related applications. The intelligent SSD (iSSD) is a solid-state device (SSD) equipped with data processing power. These days, a CPU, a GPU, and an SSD are installed together in most processing environments. If the SSD is later replaced with an iSSD, we obtain a new processing environment in which three computing resources collaborate with one another to process a huge volume of data (so-called big data) effectively. In this paper, we address how to exploit all these computing resources for efficient processing of data-intensive algorithms. Through extensive experiments, we verify the effectiveness and potential of the proposed collaborative processing environment by processing data concurrently with multiple computing resources. The results reveal that processing in our environment outperforms processing in the traditional one by up to 3.5 times.
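The idea of processing data concurrently on several computing resources can be illustrated with a minimal sketch. This is not the scheduling method of the paper, which the abstract does not describe. It assumes a simple static split in which each resource receives a share of the input proportional to a relative throughput figure. The names `split_by_throughput`, `process_chunk`, and `collaborative_sum_of_squares`, the resource labels, and the throughput values are all hypothetical, and ordinary threads stand in for the CPU, iSSD, and GPU.

```python
from concurrent.futures import ThreadPoolExecutor


def split_by_throughput(n_items, throughputs):
    """Split n_items into contiguous (start, end) ranges, one per resource,
    sized proportionally to each resource's relative throughput (assumed)."""
    total = sum(throughputs.values())
    names = list(throughputs)
    ranges = {}
    start = 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            end = n_items  # last resource takes the remainder
        else:
            end = start + int(n_items * throughputs[name] / total)
        ranges[name] = (start, end)
        start = end
    return ranges


def process_chunk(data, start, end):
    """Stand-in for a data-intensive kernel: sum of squares over a range."""
    return sum(x * x for x in data[start:end])


def collaborative_sum_of_squares(data, throughputs):
    """Process each range concurrently and combine the partial results."""
    ranges = split_by_throughput(len(data), throughputs)
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        futures = [pool.submit(process_chunk, data, s, e)
                   for s, e in ranges.values()]
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    # Hypothetical relative throughputs; real values would be measured.
    speeds = {"cpu": 1, "issd": 1, "gpu": 2}
    print(split_by_throughput(100, speeds))
    print(collaborative_sum_of_squares(list(range(10)), speeds))
```

In this sketch the split is fixed before execution starts. A real system would also have to account for data transfer costs between the resources, which the sketch ignores.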

Index Terms

1. Computing methodologies
  1. Modeling and simulation
    1. Simulation evaluation

Published in
SAC '16: Proceedings of the 31st Annual ACM Symposium on Applied Computing
April 2016
2360 pages
ISBN:9781450337397
DOI:10.1145/2851613
Conference Chair:
Sascha Ossowski
University Rey Juan Carlos, Spain
Copyright © 2016 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
GPU
SSD
collaborative processing
heterogeneous
scheduling
Qualifiers
- research-article
Acceptance Rates
SAC '16 paper acceptance rate: 252 of 1,047 submissions, 24%
Overall acceptance rate: 1,650 of 6,669 submissions, 25%