Proceeding Downloads
High performance predictable histogramming on GPUs: exploring and evaluating algorithm trade-offs
Graphics Processing Units (GPUs) are well suited to highly data-parallel algorithms such as image processing because of their massive parallel processing power. Many image processing applications use the histogramming algorithm, which fills a set of bins ...
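For reference, a minimal CUDA sketch of the common per-block shared-memory approach to histogramming is given below; the 256-bin layout, kernel name, and launch configuration are illustrative choices, not details taken from the paper.

#include <cstdio>
#include <cuda_runtime.h>

#define NUM_BINS 256   // illustrative: 8-bit image data

// Each block accumulates a private histogram in shared memory, then merges
// it into the global histogram with one atomicAdd per bin.
__global__ void histogram256(const unsigned char *data, int n, unsigned int *bins)
{
    __shared__ unsigned int local[NUM_BINS];
    for (int i = threadIdx.x; i < NUM_BINS; i += blockDim.x)
        local[i] = 0;
    __syncthreads();

    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x)
        atomicAdd(&local[data[i]], 1u);
    __syncthreads();

    for (int i = threadIdx.x; i < NUM_BINS; i += blockDim.x)
        atomicAdd(&bins[i], local[i]);
}

int main()
{
    const int n = 1 << 20;
    unsigned char *d_data;  unsigned int *d_bins;
    cudaMalloc(&d_data, n);
    cudaMalloc(&d_bins, NUM_BINS * sizeof(unsigned int));
    cudaMemset(d_data, 7, n);                        // dummy input: every byte = 7
    cudaMemset(d_bins, 0, NUM_BINS * sizeof(unsigned int));

    histogram256<<<128, 256>>>(d_data, n, d_bins);

    unsigned int bins[NUM_BINS];
    cudaMemcpy(bins, d_bins, sizeof(bins), cudaMemcpyDeviceToHost);
    printf("bin[7] = %u (expected %d)\n", bins[7], n);
    cudaFree(d_data); cudaFree(d_bins);
    return 0;
}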
A new method for GPU based irregular reductions and its application to k-means clustering
A frequently used clustering technique is k-means. The k-means algorithm consists of two steps: a map step, which is simple to execute on a GPU, and a reduce step, which is more problematic. Previous researchers have used a ...
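A minimal CUDA sketch of the two steps follows, assuming Euclidean distance and a plain atomic accumulation for the reduce step; the paper proposes a different irregular-reduction method, which is not reproduced here, and all kernel and parameter names are illustrative.

#include <cuda_runtime.h>
#include <float.h>

// Map step: assign each point to its nearest centroid (easy to parallelize).
__global__ void assign_points(const float *points, const float *centroids,
                              int *labels, int n, int k, int dim)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float best = FLT_MAX; int best_c = 0;
    for (int c = 0; c < k; ++c) {
        float d = 0.f;
        for (int j = 0; j < dim; ++j) {
            float diff = points[i * dim + j] - centroids[c * dim + j];
            d += diff * diff;
        }
        if (d < best) { best = d; best_c = c; }
    }
    labels[i] = best_c;
}

// Reduce step: accumulate per-cluster sums and counts. A plain atomic
// version is shown; new centroids are sums[c]/counts[c] on the host.
__global__ void accumulate(const float *points, const int *labels,
                           float *sums, int *counts, int n, int dim)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int c = labels[i];
    for (int j = 0; j < dim; ++j)
        atomicAdd(&sums[c * dim + j], points[i * dim + j]);
    atomicAdd(&counts[c], 1);
}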
Reducing branch divergence in GPU programs
Branch divergence has a significant impact on the performance of GPU programs. We propose two novel software-based optimizations, called iteration delaying and branch distribution, that aim to reduce branch divergence. Iteration delaying targets a ...
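The general idea behind branch distribution can be illustrated with a toy CUDA fragment: common work is factored out of the divergent paths so that only the small differing part remains divergent. The functions below are an illustration of that idea, not the paper's transformation.

// Before: the multiply-add is duplicated inside a divergent branch,
// so both paths execute serially when a warp's threads disagree on flag.
__device__ float before(float x, bool flag, float a, float b)
{
    float r;
    if (flag) r = a * x + b;
    else      r = a * x - b;
    return r;
}

// After branch distribution: the common sub-expression a * x is hoisted out,
// so only the cheap +/- selection remains divergent.
__device__ float after(float x, bool flag, float a, float b)
{
    float t = a * x;           // executed uniformly by all threads
    return flag ? t + b : t - b;
}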
Register packing for cyclic reduction: a case study
We generalize a method for avoiding GPU shared-memory communication when dealing with a downsweep pattern. We apply this generalization to Cyclic Reduction, a tridiagonal solver with this pattern. Previously, Cyclic Reduction suffered poor performance when ...
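The register-packing idea, keeping several elements per thread in registers so that the downsweep back to those elements never goes through shared memory, can be sketched on a much simpler primitive than cyclic reduction. The block-level scan below is such a sketch under that assumption; it is not the paper's tridiagonal solver, and the kernel name and two-elements-per-thread choice are illustrative.

// Launch with dynamic shared memory of blockDim.x * sizeof(float).
__global__ void scan2_per_thread(const float *in, float *out, int n)
{
    extern __shared__ float partial[];          // one partial sum per thread
    int t = threadIdx.x;
    int base = 2 * (blockIdx.x * blockDim.x + t);

    // Two elements packed into registers.
    float x0 = (base     < n) ? in[base]     : 0.f;
    float x1 = (base + 1 < n) ? in[base + 1] : 0.f;

    // Register-local combine, then a simple shared-memory inclusive scan
    // of the per-thread sums (Hillis-Steele for brevity).
    partial[t] = x0 + x1;
    __syncthreads();
    for (int offset = 1; offset < blockDim.x; offset *= 2) {
        float v = (t >= offset) ? partial[t - offset] : 0.f;
        __syncthreads();
        partial[t] += v;
        __syncthreads();
    }
    float prefix = partial[t] - (x0 + x1);      // exclusive prefix for this thread

    // Downsweep back into the packed registers: no shared memory needed here.
    if (base     < n) out[base]     = prefix + x0;
    if (base + 1 < n) out[base + 1] = prefix + x0 + x1;
}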
Caracal: dynamic translation of runtime environments for GPUs
Graphics Processing Units (GPUs) have become the platform of choice for accelerating a large range of data-parallel and task-parallel applications. Both AMD and NVIDIA have developed GPU implementations targeted at the high-performance computing market. ...
Fast Mersenne prime testing on the GPU
The Lucas-Lehmer test for Mersenne primality can be efficiently parallelized for GPU-based computation. The gpuLucas project implements an irrational-base discrete weighted transform (IBDWT) approach using balanced integers, non-power-of-two transforms, ...
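For reference, the underlying Lucas-Lehmer recurrence is simple: with s_0 = 4 and s_{i+1} = s_i^2 - 2 (mod M_p), the Mersenne number M_p = 2^p - 1 is prime iff s_{p-2} = 0. The host-only sketch below checks small exponents directly with 128-bit arithmetic; it is not a substitute for the IBDWT-based GPU implementation described in the paper.

#include <cstdio>

// Host-only sketch relying on the compiler's 128-bit integer extension,
// so it handles p <= 63. gpuLucas instead performs the modular squaring
// on the GPU with a big-number transform, which is not shown here.
static bool lucas_lehmer(unsigned p)
{
    const unsigned __int128 m = ((unsigned __int128)1 << p) - 1;
    unsigned __int128 s = 4;
    for (unsigned i = 0; i < p - 2; ++i) {
        s = (s * s) % m;
        s = (s >= 2) ? s - 2 : s + m - 2;   // subtract 2 modulo m without underflow
    }
    return s == 0;
}

int main()
{
    const unsigned exps[] = {3, 5, 7, 11, 13, 17, 19, 23, 31, 61};
    for (unsigned p : exps)
        printf("M_%u is %s\n", p, lucas_lehmer(p) ? "prime" : "composite");
    return 0;
}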
Floating-point data compression at 75 Gb/s on a GPU
Numeric simulations often generate large amounts of data that need to be stored or sent to other compute nodes. This paper investigates whether GPUs are powerful enough to make real-time data compression and decompression possible in such environments, ...
Real-time rendering and dynamic updating of 3-d volumetric data
A dense 3-d terrain model obtained using reconstruction methods from aerial images is represented in a probabilistic volumetric framework. A probabilistic representation is chosen to capture the inherent ambiguity in reconstructing surfaces from ...
A framework for dynamically instrumenting GPU compute applications within GPU Ocelot
In this paper we present the design and implementation of a dynamic instrumentation infrastructure for PTX programs that procedurally transforms kernels and manages related data structures. We show how performing instrumentation within the GPU Ocelot ...
Analyzing program flow within a many-kernel OpenCL application
Many developers have begun to realize that heterogeneous multi-core and many-core computer systems can provide significant performance opportunities to a range of applications. Typical applications possess multiple components that can be parallelized; ...
Quantifying NUMA and contention effects in multi-GPU systems
As system architects strive for increased density and power efficiency, the traditional compute node is being augmented with an increasing number of graphics processing units (GPUs). The integration of multiple GPUs per node introduces complex ...
Automatically generating and tuning GPU code for sparse matrix-vector multiplication from a high-level representation
We propose a system-independent representation of sparse matrix formats that allows a compiler to generate efficient, system-specific code for sparse matrix operations. To show the viability of such a representation, we have developed a compiler that ...
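For context, the kind of kernel such a compiler might emit for the common CSR format looks like the scalar CUDA sketch below (one thread per row); the paper's high-level representation and the tuned variants it generates are not shown.

// y = A * x for a CSR matrix with rows, row_ptr, col_idx, and vals.
__global__ void spmv_csr(int rows, const int *row_ptr, const int *col_idx,
                         const float *vals, const float *x, float *y)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= rows) return;
    float sum = 0.f;
    for (int j = row_ptr[row]; j < row_ptr[row + 1]; ++j)
        sum += vals[j] * x[col_idx[j]];
    y[row] = sum;
}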
Unstructured grid applications on GPU: performance analysis and improvement
Performance of applications running on GPUs is mainly affected by hardware occupancy and global memory latency. Scientific applications that rely on analysis using unstructured grids could benefit from the high-performance capabilities provided by GPUs, ...