Performance Evaluation of Stencil Computations Based on Source-to-Source Transformations

Martínez, Víctor; Serpa, Matheus S.; Pavan, Pablo J.; Padoin, Edson Luiz; Navaux, Philippe O. A.

doi:10.1007/978-3-030-16205-4_16

Part of the book series: Communications in Computer and Information Science (CCIS, volume 979)

Included in the following conference series:

Latin American High Performance Computing Conference

Abstract

Stencil computations are common in High Performance Computing (HPC) applications; they consist of a pattern that replicates the same calculation across a data domain. The Finite-Difference Method is an example of a stencil computation, and it is used to solve real problems in diverse areas related to Partial Differential Equations (electromagnetics, fluid dynamics, geophysics, etc.). Although a large body of literature on the optimization of this class of applications is available, performance evaluation and optimization on different HPC architectures remain a challenge. In this work, we implemented the 7-point Jacobian stencil in a Source-to-Source Transformation Framework (BOAST) to evaluate the performance of different HPC architectures. The results show that the same source code can be executed on current architectures with a performance improvement, which helps the programmer develop applications without depending on hardware features.
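
To make the pattern described in the abstract concrete, the following is a minimal sketch of one sweep of a 7-point stencil over a 3D grid, written in plain Python rather than in BOAST. It is not the authors' kernel: the function names (jacobi_7pt, make_grid), the coefficients alpha and beta, and the nested-list grid layout are all assumptions chosen for illustration. Each interior cell is recomputed from itself and its six face neighbours, reading only from the input grid, which is what makes the sweep Jacobi-style.

```python
# Illustrative sketch only: a plain-Python 7-point stencil sweep.
# The names, the coefficients alpha and beta, and the grid layout are
# assumptions made for this example; they are not taken from the chapter.


def jacobi_7pt(src, alpha=0.4, beta=0.1):
    """Return one Jacobi-style sweep of a 7-point stencil over a 3D grid.

    src is a nested list indexed as src[z][y][x]. Boundary cells are copied
    unchanged. Each interior cell is computed from itself and its six face
    neighbours, reading only from src and never from the output grid.
    """
    nz, ny, nx = len(src), len(src[0]), len(src[0][0])
    # Copy the grid so boundaries keep their values and src stays untouched.
    dst = [[row[:] for row in plane] for plane in src]
    for z in range(1, nz - 1):
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                neighbours = (
                    src[z - 1][y][x] + src[z + 1][y][x]
                    + src[z][y - 1][x] + src[z][y + 1][x]
                    + src[z][y][x - 1] + src[z][y][x + 1]
                )
                dst[z][y][x] = alpha * src[z][y][x] + beta * neighbours
    return dst


def make_grid(n, value=0.0):
    """Build an n x n x n grid filled with a constant value."""
    return [[[value] * n for _ in range(n)] for _ in range(n)]


if __name__ == "__main__":
    grid = make_grid(5)
    grid[2][2][2] = 1.0  # a single non-zero cell in the centre
    out = jacobi_7pt(grid)
    # The centre keeps alpha of its value; each face neighbour receives beta.
    print(out[2][2][2], out[2][2][3])
```

The same calculation is applied at every interior point, so performance depends largely on how the loops are ordered, blocked, and vectorized for a given machine. That is the part a source-to-source framework can vary while the stencil definition itself stays fixed.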

Acknowledgments

This work has been supported by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), and the Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul (FAPERGS). The research has received funding from the EU H2020 Programme and from MCTI/RNP-Brazil under the HPC4E Project, grant agreement No. 689772. It was also supported by Intel under the Modern Code project, and by the PETROBRAS oil company under Ref. 2016/00133-9. We also thank RICAP, partially funded by the Ibero-American Program of Science and Technology for Development (CYTED), Ref. 517RT0529.

Author information

Authors and Affiliations

Informatics Institute, UFRGS, Porto Alegre, Brazil
Víctor Martínez, Matheus S. Serpa, Pablo J. Pavan & Philippe O. A. Navaux
Department of Exact Sciences and Engineering, UNIJUI, Ijuí, Brazil
Edson Luiz Padoin

Corresponding author

Correspondence to Víctor Martínez.

Editor information

Editors and Affiliations

Instituto Tecnológico de Costa Rica, Centro Nacional de Alta Tecnología, Pavas, Costa Rica
Esteban Meneses
Universidad de los Andes, Bogotá, Colombia
Harold Castro
Universidad Industrial de Santander, Bucaramanga, Colombia
Carlos Jaime Barrios Hernández
Universidad de Antioquia, Medellín, Colombia
Raul Ramos-Pollan

Cite this paper

Martínez, V., Serpa, M.S., Pavan, P.J., Padoin, E.L., Navaux, P.O.A. (2019). Performance Evaluation of Stencil Computations Based on Source-to-Source Transformations. In: Meneses, E., Castro, H., Barrios Hernández, C., Ramos-Pollan, R. (eds) High Performance Computing. CARLA 2018. Communications in Computer and Information Science, vol 979. Springer, Cham. https://doi.org/10.1007/978-3-030-16205-4_16

DOI: https://doi.org/10.1007/978-3-030-16205-4_16
Published: 31 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16204-7
Online ISBN: 978-3-030-16205-4
eBook Packages: Computer Science, Computer Science (R0)
