Abstract
Current parallel systems that combine multicore and manycore processors with GPUs are becoming more complex because of their heterogeneous nature. The programmability barrier inherent to parallel systems rises with almost every new architecture. Libraries, languages, and tools that allow easy and efficient use of these systems are therefore essential. Among the proposals that address this problem, skeletal programming has emerged as a natural way to ease the programming of parallel systems in general and of GPUs in particular. In this paper, we develop a programming skeleton for Dynamic Programming on MultiGPU systems. The skeleton, implemented in CUDA, allows the user to execute parallel codes on MultiGPU systems just by providing sequential C++ specifications of her problems. The performance and ease of use of this skeleton have been tested on several optimization problems. The experimental results obtained on a cluster of Nvidia Fermi GPUs demonstrate the advantages of the approach.
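To make the idea of a skeleton driven by a sequential specification more concrete, the following is a minimal sketch in plain C++. It is not the interface of the skeleton described in this paper: the names `Knapsack01`, `rows`, `cols`, `evaluate`, and `run_dp_skeleton` are hypothetical, and the driver is a sequential stand-in for what a CUDA backend might do. The assumption is that the user supplies only the recurrence for one table cell, while the skeleton owns the traversal of the table.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical user-side specification: purely sequential code that
// describes a 0/1 knapsack problem as a dynamic programming recurrence.
struct Knapsack01 {
    std::vector<int> weight;
    std::vector<int> profit;
    int capacity;

    // Table shape: one row per number of items considered, one column per capacity.
    std::size_t rows() const { return weight.size() + 1; }
    std::size_t cols() const { return static_cast<std::size_t>(capacity) + 1; }

    // Value of cell (i, c), given the completed row i-1 in `prev`.
    int evaluate(std::size_t i, std::size_t c, const std::vector<int>& prev) const {
        if (i == 0) return 0;                       // no items: profit is zero
        int best = prev[c];                         // skip item i-1
        const std::size_t w = static_cast<std::size_t>(weight[i - 1]);
        if (w <= c)                                 // take item i-1 if it fits
            best = std::max(best, prev[c - w] + profit[i - 1]);
        return best;
    }
};

// Hypothetical skeleton driver (sequential stand-in). Cells within a row
// depend only on the previous row, so the inner loop is the part that a
// GPU backend could, in principle, evaluate in parallel.
template <typename Problem>
int run_dp_skeleton(const Problem& p) {
    assert(p.cols() > 0);
    std::vector<int> prev(p.cols(), 0), cur(p.cols(), 0);
    for (std::size_t i = 0; i < p.rows(); ++i) {
        for (std::size_t c = 0; c < p.cols(); ++c)
            cur[c] = p.evaluate(i, c, prev);
        std::swap(prev, cur);                       // row i becomes the previous row
    }
    return prev.back();                             // optimum at full capacity
}
```

With this sketch, a problem instance is solved by constructing the specification and passing it to the driver, for example `run_dp_skeleton(Knapsack01{{1, 3, 4, 5}, {1, 4, 5, 7}, 7})`. The point of the design is the separation of concerns: the problem description contains no parallel code, so the traversal strategy can change without touching it.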
Acknowledgements
This work has been supported by the EC (FEDER) and the Spanish MEC with the I + D + I contract number: TIN2011-24598.
Cite this article
Acosta, A., Almeida, F. Skeletal based programming for dynamic programming on MultiGPU systems. J Supercomput 65, 1125–1136 (2013). https://doi.org/10.1007/s11227-013-0895-x