Skeletal based programming for dynamic programming on MultiGPU systems

The Journal of Supercomputing

Abstract

Current parallel systems that combine multicore/manycore processors with GPUs are becoming more complex due to their heterogeneous nature. The programmability barrier inherent to parallel systems rises with almost every new architecture. Developing libraries, languages, and tools that allow easy and efficient use of these systems is therefore mandatory. Among the proposals to address this problem, skeletal programming has emerged as a natural alternative to ease the programmability of parallel systems in general and of GPUs in particular. In this paper, we develop a programming skeleton for Dynamic Programming on MultiGPU systems. The skeleton, implemented in CUDA, allows the user to execute parallel codes on MultiGPU systems just by providing sequential C++ specifications of her problems. The performance and ease of use of this skeleton have been tested on several optimization problems. The experimental results obtained on a cluster of Nvidia Fermi GPUs show the advantages of the approach.

Acknowledgements

This work has been supported by the EC (FEDER) and the Spanish MEC under I+D+I contract number TIN2011-24598.

Author information

Corresponding author

Correspondence to Alejandro Acosta.

About this article

Cite this article

Acosta, A., Almeida, F. Skeletal based programming for dynamic programming on MultiGPU systems. J Supercomput 65, 1125–1136 (2013). https://doi.org/10.1007/s11227-013-0895-x

  • DOI: https://doi.org/10.1007/s11227-013-0895-x
