We would like to welcome you to the proceedings of the 6th Annual Workshop on General Purpose Processing using Graphics Processors. We have another strong program, including a keynote by Robert Geva of Intel on the programming model for the Intel Xeon Phi accelerator and presentations of 15 of the 37 submitted papers.
Proceeding Downloads
Comparison based sorting for systems with multiple GPUs
As a basic building block of many applications, sorting algorithms that run efficiently on modern machines are key to the performance of these applications. With the recent shift to using GPUs for general purpose computing, researchers have proposed ...
Reducing divergence in GPGPU programs with loop merging
Branch divergence can incur a high performance penalty on GPGPU programs. We propose a software optimization, called loop merging, that aims to reduce divergence caused by a loop's trip count varying across the threads of a warp. This optimization merges the ...
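To make the divergence problem concrete, here is a minimal CUDA sketch (not the paper's loop-merging transformation) of a kernel whose per-thread trip count differs within a warp; the kernel and variable names are hypothetical.

    // Hypothetical kernel: each thread loops len[i] times. Threads of the same
    // warp that finish early must wait while the warp keeps iterating for the
    // threads with the largest trip counts -- the divergence penalty that
    // loop merging targets.
    __global__ void varying_trip_count(const int *len, const float *in,
                                       float *out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;

        float acc = 0.0f;
        for (int k = 0; k < len[i]; ++k)   // trip count varies across the warp
            acc += in[i] * k;
        out[i] = acc;
    }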
Split tiling for GPUs: automatic parallelization using trapezoidal tiles
Tiling is a key technique to enhance data reuse. For computations structured as one sequential outer "time" loop enclosing a set of parallel inner loops, tiling only the parallel inner loops may not enable enough data reuse in the cache. Tiling the ...
Formalizing address spaces with application to Cuda, OpenCL, and beyond
Cuda and OpenCL are aimed at programmers developing parallel applications targeting GPUs and embedded micro-processors. These systems often have explicitly managed memories exposed directly through a notion of disjoint address spaces. OpenCL address ...
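As background, the following minimal CUDA sketch (illustrative only, not the paper's formal model) shows three of the address spaces such a formalization has to relate: constant memory, per-block shared memory, and global memory.

    // Hypothetical kernel touching distinct address spaces. A pointer into one
    // space is not generally meaningful in another, which is exactly what a
    // formal treatment of address spaces must capture.
    __constant__ float scale;                  // constant address space

    __global__ void scale_and_reduce(const float *in, float *out, int n)
    {
        __shared__ float tile[256];            // shared (on-chip) address space
        int i = blockIdx.x * blockDim.x + threadIdx.x;   // launch with 256 threads/block

        tile[threadIdx.x] = (i < n) ? in[i] * scale : 0.0f;   // global -> shared
        __syncthreads();

        if (threadIdx.x == 0) {
            float sum = 0.0f;
            for (int k = 0; k < blockDim.x; ++k)
                sum += tile[k];
            out[blockIdx.x] = sum;             // shared -> global
        }
    }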
Memory reuse optimizations in the R-Stream compiler
We propose a new set of automated techniques to optimize memory reuse in programs with explicitly managed memory. Our techniques are inspired by hand-tuned seismic kernels on GPUs. The solutions we develop reduce the cost of transferring data across ...
Valar: a benchmark suite to study the dynamic behavior of heterogeneous systems
Heterogeneous systems have grown in popularity within the commercial platform and application developer communities. We have seen a growing number of systems incorporating CPUs, Graphics Processors (GPUs), and Accelerated Processing Units (APUs), which combine a ...
Input-aware auto-tuning for directive-based GPU programming
The difficulties posed by GPGPU programming and the need to increase productivity have guided research towards directive-based high-level programs for accelerators. This effort has led to the definition of the OpenACC industry standard. It significantly ...
Betweenness centrality on GPUs and heterogeneous architectures
The betweenness centrality metric has long been of interest for graph analysis and is used in various applications. Yet, it is one of the most computationally expensive kernels in graph mining. In this work, we investigate a set of techniques to make the ...
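For reference, the standard definition of the betweenness centrality of a vertex v (general background, not specific to this paper) is

    BC(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}

where \sigma_{st} is the number of shortest paths between s and t, and \sigma_{st}(v) is the number of those paths passing through v; computing it exactly requires a shortest-path traversal from every source vertex, which is what makes the kernel expensive.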
OpenCL C++
With the success of programming models such as Khronos' OpenCL, heterogeneous computing is going mainstream. However, these models are low-level, even when considering them as systems programming models. For example, OpenCL is effectively an extended ...
Atomic-free irregular computations on GPUs
Atomic instructions are a key ingredient of codes that operate on irregular data structures like trees and graphs. It is well known that atomics can be expensive, especially on massively parallel GPUs, and are often on the critical path of a program. In ...
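The kind of contended atomic update in question can be illustrated with a minimal CUDA sketch (hypothetical names, not the paper's code): each thread handles one edge of a graph and atomically accumulates into its destination vertex.

    // Hypothetical irregular kernel: conflicting updates to vertex_val are
    // serialized by atomicAdd, which can put the atomic on the critical path;
    // atomic-free formulations restructure the computation to avoid this.
    __global__ void scatter_add(const int *dst, const float *weight,
                                float *vertex_val, int num_edges)
    {
        int e = blockIdx.x * blockDim.x + threadIdx.x;
        if (e < num_edges)
            atomicAdd(&vertex_val[dst[e]], weight[e]);   // contended atomic update
    }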
Accelerating simulation of agent-based models on heterogeneous architectures
The wide usage of GPGPU programming models and compiler techniques enables the optimization of data-parallel programs on commodity GPUs. However, mapping GPGPU applications running on discrete parts to emerging integrated heterogeneous architectures ...
Fast dynamic memory allocator for massively parallel architectures
Dynamic memory allocation in massively parallel systems often suffers from drastic performance decreases due to the required global synchronization. This is especially true when many allocation or deallocation requests occur in parallel. We propose a ...
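For context, CUDA already exposes a device-side heap, and a minimal sketch of its use (hypothetical kernel, not the proposed allocator) shows where the synchronization cost arises: every thread's request goes through one device-wide allocator.

    // Hypothetical kernel: each thread allocates a small block from the
    // device-wide heap. Under heavy parallel allocation traffic these requests
    // contend on the allocator's internal synchronization, which is the
    // overhead a massively parallel allocator aims to remove.
    __global__ void alloc_per_thread(int **slots, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;

        int *p = (int *)malloc(16 * sizeof(int));   // device-side malloc
        if (p) {
            p[0] = i;
            slots[i] = p;        // a later kernel would free(slots[i])
        }
    }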
Accelerating financial applications on the GPU
The QuantLib library is a popular library used for many areas of computational finance. In this work, the parallel processing power of the GPU is used to accelerate QuantLib financial applications. Black-Scholes, Monte-Carlo, Bonds, and Repo code paths ...
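Of the code paths named, Black-Scholes is the simplest to picture on a GPU: each option can be priced independently from the closed-form solution. The following minimal CUDA kernel (illustrative only; the parameter layout and names are hypothetical, not QuantLib's API) prices European calls, one option per thread.

    // C = S*N(d1) - K*exp(-r*T)*N(d2), with
    // d1 = (ln(S/K) + (r + 0.5*sigma^2)*T) / (sigma*sqrt(T)) and d2 = d1 - sigma*sqrt(T).
    __device__ float norm_cdf(float x)                 // standard normal CDF via erff
    {
        return 0.5f * (1.0f + erff(x * 0.70710678f));  // 0.70710678 = 1/sqrt(2)
    }

    __global__ void bs_call(const float *S, const float *K, const float *T,
                            float r, float sigma, float *price, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;

        float sqrtT = sqrtf(T[i]);
        float d1 = (logf(S[i] / K[i]) + (r + 0.5f * sigma * sigma) * T[i]) / (sigma * sqrtT);
        float d2 = d1 - sigma * sqrtT;
        price[i] = S[i] * norm_cdf(d1) - K[i] * expf(-r * T[i]) * norm_cdf(d2);
    }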
Exploring GPU architectures to accelerate semantic comparison for intention-based search
Semantic comparison is the basic computational task behind the meaningful search techniques being deployed by most new search engines. This report presents a performance comparison of three GPU architectures implementing semantic comparison. We have ...
Warp size impact in GPUs: large or small?
There are a number of design decisions that impact a GPU's performance. Among them, choosing the right warp size can deeply influence the rest of the design. Small warps reduce the performance penalty associated with branch divergence at the ...
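The trade-off hinges on branch divergence, which a minimal CUDA fragment (hypothetical, for illustration) makes concrete: whenever threads of one warp disagree on a data-dependent branch, the warp executes both paths serially, and wider warps disagree more often.

    // Hypothetical kernel with a data-dependent branch. With a 32-wide warp,
    // any warp whose threads see different flag values runs both branches back
    // to back; smaller warps make such disagreement less likely, at the cost
    // of less lockstep (SIMD) efficiency.
    __global__ void branchy(const int *flag, float *x, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;

        if (flag[i])                // data-dependent branch -> possible divergence
            x[i] *= 2.0f;
        else
            x[i] += 1.0f;
    }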
Index Terms
- Proceedings of the 6th Workshop on General Purpose Processing Using Graphics Processing Units