research-article

Identifying hotspots in a program for data parallel architecture: an early experience

Authors:

Santonu Sarkar,

Mageri Filali Maltouf

ISEC '12: Proceedings of the 5th India Software Engineering Conference

Pages 131 - 137

https://doi.org/10.1145/2134254.2134277

Published: 22 February 2012

Abstract

In applications that rely on data-intensive computation, significant performance gains are possible if the source code is suitably transformed for parallel hardware. A common approach is to identify the loops in the program that consume a significant amount of time, which we call hotspots. A pressing business need is to identify such loops quickly for further transformation. However, exact identification of such hotspots requires an elaborate runtime analysis. When dealing with a third-party business application, only a partial version of the source code is available, with limited test inputs, which hinders a correct runtime analysis. We therefore resort to static analysis of the source code to obtain a conservative loop iteration count. In this paper we describe our approach to analyzing source code to find hotspots. The approach estimates the iteration count of a loop using the polytope model for volume computation, and combines this estimate with the cyclomatic complexity of the loop body. Together, these two metrics give an approximate indication of the hotspots in a program and serve as a code transformation clue for programmers. We have run our tool on the Rodinia benchmark applications and found encouraging results.
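
As a rough illustration of the idea in the abstract, the following Python sketch counts the integer points in the iteration domain of a loop nest and weights the count by the cyclomatic complexity of the loop body. It is a minimal sketch under stated assumptions, not the authors' tool: the function names, the brute-force enumeration (a stand-in for closed-form polytope volume computation), and the product used to combine the two metrics are all hypothetical, since the abstract does not specify how the metrics are combined.

```python
from itertools import product


def count_iterations(bounds, constraints=()):
    """Count integer points inside a loop nest's iteration domain.

    bounds: one inclusive (lo, hi) pair per loop index.
    constraints: predicates over the index tuple; a point is counted only
    if every predicate holds (e.g. an affine bound such as j <= i).
    Brute-force enumeration is used here in place of a symbolic
    polytope volume computation.
    """
    ranges = [range(lo, hi + 1) for lo, hi in bounds]
    return sum(1 for point in product(*ranges)
               if all(c(point) for c in constraints))


def cyclomatic_complexity(decision_points):
    """McCabe complexity of a single-entry, single-exit loop body."""
    return decision_points + 1


def hotspot_score(iterations, complexity):
    """Assumed combination: iteration count weighted by body complexity."""
    return iterations * complexity


N = 100
# for i in 0..N-1: for j in 0..i   (triangular domain, body with 2 branches)
tri = count_iterations([(0, N - 1), (0, N - 1)], [lambda p: p[1] <= p[0]])
# for i in 0..N-1: for j in 0..N-1 (rectangular domain, straight-line body)
rect = count_iterations([(0, N - 1), (0, N - 1)])

scores = {
    "triangular_loop": hotspot_score(tri, cyclomatic_complexity(2)),
    "rectangular_loop": hotspot_score(rect, cyclomatic_complexity(0)),
}
# Loops ordered from most to least likely hotspot under the assumed score.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Enumeration only works when the loop bounds are known constants; a static analysis of the kind described would instead need a symbolic count in terms of parameters such as N, which is what polytope-based volume computation provides.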

Published In

ISEC '12: Proceedings of the 5th India Software Engineering Conference

February 2012

174 pages

ISBN:9781450311427

DOI:10.1145/2134254

General Chairs: Sanjeev Aggarwal (Indian Institute of Technology, Kanpur, India), T. V. Prabhakar (Indian Institute of Technology, Kanpur, India)

Program Chairs: Vasudeva Varma (IIIT Hyderabad, India), Srinivas Padmanabhuni (Infosys Labs, India)

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

IITK: Indian Institute of Technology Kanpur

In-Cooperation

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ISEC '12

Sponsor:

IITK

ISEC '12: India Software Engineering Conference 2012

February 22 - 25, 2012

Kanpur, India

Acceptance Rates

ISEC '12 Paper Acceptance Rate 26 of 107 submissions, 24%;

Overall Acceptance Rate 76 of 315 submissions, 24%
