Feedback-Based Global Instruction Scheduling for GPGPU Applications

Timm, Constantin; Görlich, Markus; Weichert, Frank; Marwedel, Peter; Müller, Heinrich

doi:10.1007/978-3-642-31125-3_2

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7333)

Included in the following conference series:

International Conference on Computational Science and Its Applications

Abstract

The memory wall affects even high-bandwidth systems such as GPUs, so memory accesses and memory-related instructions must be handled efficiently. Up to now, the memory performance of GPGPU applications has been considered only at the source code level. This is not enough when optimizing an application for high performance: the code has to be optimized at the assembly level as well. Because GPGPU-capable hardware is spreading to ever smaller devices, the energy consumption of a program is, besides performance, an important optimization goal.

This paper presents FALIS (Feedback-based and memory-Aware gLobal Instruction Scheduling), a novel compiler optimization technique based on global instruction scheduling and multi-objective genetic algorithms. The approach uses profiling-based feedback so that the compiler can take measured performance and energy consumption values into account. Profiling on the real hardware platform is important because it captures the characteristics of the underlying hardware. FALIS increases the runtime performance of a GPGPU application by up to 13.02% and decreases energy consumption by up to 10.23%.
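To make the multi-objective idea concrete, the following is a minimal sketch of how measured runtime and energy values could be compared by Pareto dominance. It is not the authors' implementation: the function names, the schedule labels, and the measurement numbers are all hypothetical, and a real system would obtain the values by profiling each candidate schedule on the target GPU and would wrap this selection step in a genetic algorithm.

```python
from typing import List, Sequence, Tuple

# (runtime, energy): both objectives are minimized. Units are arbitrary here.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(
    measured: Sequence[Tuple[str, Objectives]]
) -> List[Tuple[str, Objectives]]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        (name, obj)
        for name, obj in measured
        if not any(dominates(other, obj) for _, other in measured)
    ]


# Hypothetical profiling feedback for four candidate instruction schedules.
# These numbers are invented for illustration, not taken from the paper.
measurements = [
    ("baseline", (10.0, 5.0)),
    ("sched_a", (9.0, 5.2)),   # faster, but uses more energy
    ("sched_b", (9.5, 4.6)),   # better than baseline in both objectives
    ("sched_c", (10.5, 5.5)),  # worse than baseline in both objectives
]

front = pareto_front(measurements)
front_names = [name for name, _ in front]
print(front_names)
```

In this toy data set, `sched_a` and `sched_b` survive because neither is better than the other in both objectives, which is the kind of trade-off between performance and energy that a multi-objective search has to preserve rather than collapse into a single score.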

Author information

Authors and Affiliations

Computer Science 12, TU Dortmund, Germany
Constantin Timm, Markus Görlich & Peter Marwedel
Computer Science 7, TU Dortmund, Germany
Frank Weichert & Heinrich Müller

Editor information

Editors and Affiliations

Laboratory of Urban and Territorial Systems, University of Basilicata, 10, Viale dell’Ateneo Lucano, 85100, Potenza, Italy
Beniamino Murgante
Department of Mathematics and Computer Science, University of Perugia, Via Vanvitelli 1, 06123, Perugia, Italy
Osvaldo Gervasi
Department of Cyber Security Science, Federal University of Technology, Gidan Kwano Campus, Minna, Nigeria
Sanjay Misra
Faculty of Engineering, Department of Electronics Engineering and Telecommunications, State University of Rio de Janeiro, Rua Sao Francisco Xavier, 524, 50. andar, sala 5145-F, Maracana, 20, 550-013, Rio de Janeiro, RJ, Brazil
Nadia Nedjah
Department of Production and Systems, University of Minho, Campus de Gualtar, 4710-057, Braga, Portugal
Ana Maria A. C. Rocha
School of Business Systems, Monash University, 3800, Clayton, VIC, Australia
David Taniar
Department of Intelligent Informatics, Kyushu Sangyo University, 2-3-1 Matsukadai, Higashi-ku, 813-8503, Fukuoka, Japan
Bernady O. Apduhan

About this paper

Cite this paper

Timm, C., Görlich, M., Weichert, F., Marwedel, P., Müller, H. (2012). Feedback-Based Global Instruction Scheduling for GPGPU Applications. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2012. ICCSA 2012. Lecture Notes in Computer Science, vol 7333. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31125-3_2

DOI: https://doi.org/10.1007/978-3-642-31125-3_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31124-6
Online ISBN: 978-3-642-31125-3
eBook Packages: Computer Science, Computer Science (R0)
