Assessing and discovering parallelism in C++ code for heterogeneous platforms

The Journal of Supercomputing

Abstract

Massively parallel architectures are mostly built on heterogeneous setups. They are composed of different computing devices that speed up specific code regions, called kernels. These kernels are usually offloaded to the corresponding devices. Porting applications to a specific heterogeneous platform is a costly task in terms of time and human resources. The key steps in the porting process are the manual analysis of the source code and kernel detection. Each device in these heterogeneous platforms has its own restrictions, such as its memory allocation support, so kernels must be mapped to suitable computing devices. We introduce AKI, an automatic kernel identification and annotation tool that aims to identify potential kernels in sequential C++ applications. AKI identifies those kernels that can be offloaded to heterogeneous computing devices. To annotate these kernels, REPARA C++ attributes have been defined. This annotation mechanism can help future automatic source-to-source transformation tools ease development for parallel heterogeneous platforms. AKI has been evaluated on all benchmarks included in the NAS suite, which comprises a large set of realistic high-performance applications. The evaluation results demonstrate that AKI is a competitive solution for identifying and annotating parallel code fragments (i.e., kernels).
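As a rough illustration of attribute-based kernel annotation, the sketch below marks a loop without loop-carried dependencies using a C++11 attribute and leaves a dependent loop unmarked. The attribute spelling `rpr::kernel` and the functions `saxpy` and `prefix_sum` are assumptions made for this example only; the abstract does not give the actual REPARA attribute syntax or the analysis AKI performs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel candidate: y[i] = a * x[i] + y[i].
// Each iteration touches only its own element, so the loop has no
// loop-carried dependency and could in principle be offloaded.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    // Assumed attribute name, for illustration only. Compilers that do not
    // know this attribute ignore it (possibly with a warning), so the code
    // still builds and runs sequentially.
    [[rpr::kernel]]
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Counter-example left unannotated: iteration i reads the value written by
// iteration i - 1, so the loop cannot be treated as independent iterations.
void prefix_sum(std::vector<float>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        v[i] += v[i - 1];
    }
}
```

Because unknown attributes do not change program behaviour, the annotated source remains a valid sequential program, which is what lets a later source-to-source tool consume the annotations without breaking the original build.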



Acknowledgments

The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under Grant Agreement No. 609666 and from the Spanish Ministry of Economics and Competitiveness under Grant TIN2013-41350-P.

Author information

Corresponding author

Correspondence to Luis Miguel Sanchez.

About this article

Cite this article

del Rio Astorga, D., Sotomayor, R., Sanchez, L.M. et al. Assessing and discovering parallelism in C++ code for heterogeneous platforms. J Supercomput 74, 5674–5689 (2018). https://doi.org/10.1007/s11227-016-1794-8

  • DOI: https://doi.org/10.1007/s11227-016-1794-8
