cuSCNN : an Efficient CUDA Implementation of Sparse CNNs
Abstract
References
Index Terms
- cuSCNN : an Efficient CUDA Implementation of Sparse CNNs
Recommendations
Accelerating incompressible flow computations with a Pthreads-CUDA implementation on small-footprint multi-GPU platforms
Graphics processor units (GPU) that are originally designed for graphics rendering have emerged as massively-parallel "co-processors" to the central processing unit (CPU). Small-footprint multi-GPU workstations with hundreds of processing elements can ...
CUDA-BLASTP: Accelerating BLASTP on CUDA-Enabled Graphics Hardware
Scanning protein sequence database is an often repeated task in computational biology and bioinformatics. However, scanning large protein databases, such as GenBank, with popular tools such as BLASTP requires long runtimes on sequential architectures. ...
CUDA optimization strategies for compute- and memory-bound neuroimaging algorithms
As neuroimaging algorithms and technology continue to grow faster than CPU performance in complexity and image resolution, data-parallel computing methods will be increasingly important. The high performance, data-parallel architecture of modern ...
Comments
Information & Contributors
Information
Published In
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
Check for updates
Author Tags
Qualifiers
- Research-article
- Research
- Refereed limited
Conference
Acceptance Rates
Contributors
Other Metrics
Bibliometrics & Citations
Bibliometrics
Article Metrics
- 0Total Citations
- 124Total Downloads
- Downloads (Last 12 months)54
- Downloads (Last 6 weeks)1
Other Metrics
Citations
View Options
Login options
Check if you have access through your login credentials or your institution to get full access on this article.
Sign inFull Access
View options
View or Download as a PDF file.
PDFeReader
View online with eReader.
eReaderHTML Format
View this article in HTML Format.
HTML Format