Abstract
This article presents a parallel CPU/GPU implementation of two variants of a stochastic local search method for the scheduling problem in heterogeneous computing systems. Both variants rely on a set of simple operators that keep the computational complexity as low as possible, so that large instances of the scheduling problem can be addressed efficiently. The experimental analysis shows that both variants compute accurate suboptimal schedules in significantly shorter execution times than state-of-the-art schedulers, and that they outperform a recently published GPU parallel evolutionary scheduler in both efficiency and solution quality.
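To make the idea of a stochastic local search built from simple operators concrete, the following is a minimal sequential sketch in Python. It is not the authors' algorithm or their CPU/GPU implementation: the abstract does not name the operators, so the "move" and "swap" operators, the acceptance rule, the greedy starting schedule, and all identifiers (`stochastic_local_search`, `etc`, `assign`, `loads`) are assumptions chosen for illustration. The sketch assumes independent tasks and a matrix `etc[t][m]` giving the expected time to compute task `t` on machine `m`, and it tries to reduce the makespan (the largest machine load).

```python
import random


def stochastic_local_search(etc, iterations=10_000, seed=0):
    """Illustrative sketch only (hypothetical names and operators).

    etc[t][m] is the assumed expected time to compute task t on machine m.
    Returns (assign, loads): the machine chosen for each task and the
    resulting load of each machine.
    """
    rng = random.Random(seed)
    n_tasks, n_machines = len(etc), len(etc[0])

    # Assumed starting schedule: each task goes to the machine that runs it fastest.
    assign = [min(range(n_machines), key=etc[t].__getitem__) for t in range(n_tasks)]
    loads = [0] * n_machines
    for t, m in enumerate(assign):
        loads[m] += etc[t][m]

    for _ in range(iterations):
        t = rng.randrange(n_tasks)
        src = assign[t]
        if rng.random() < 0.5:
            # "Move" operator: try placing task t on a random other machine.
            u = None
            dst = rng.randrange(n_machines)
            if dst == src:
                continue
            new_src = loads[src] - etc[t][src]
            new_dst = loads[dst] + etc[t][dst]
        else:
            # "Swap" operator: try exchanging the machines of tasks t and u.
            u = rng.randrange(n_tasks)
            dst = assign[u]
            if dst == src:
                continue
            new_src = loads[src] - etc[t][src] + etc[u][src]
            new_dst = loads[dst] - etc[u][dst] + etc[t][dst]

        # Accept only if the larger of the two affected loads strictly drops.
        # Only two loads are touched, so each step costs O(1) and the
        # makespan can never increase.
        if max(new_src, new_dst) < max(loads[src], loads[dst]):
            loads[src], loads[dst] = new_src, new_dst
            assign[t] = dst
            if u is not None:
                assign[u] = src

    return assign, loads
```

Calling `stochastic_local_search(etc)` on a list of per-task rows returns the assignment and the machine loads, and `max(loads)` is the makespan of the resulting schedule. Because each step evaluates only the two machines it touches, the cost per iteration does not depend on the instance size, which is one plausible reading of why simple operators suit large instances; how the actual method parallelizes such steps on CPU and GPU is not described in the abstract and is not modeled here.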
Acknowledgments
The work of S. Iturriaga and S. Nesmachnow has been partially supported by ANII and PEDECIBA, Uruguay. The work of F. Luna and E. Alba has been partially funded by FEDER (TIN2011-28194). The experiments were carried out using the HPC facility of the University of Luxembourg.
Cite this article
Iturriaga, S., Nesmachnow, S., Luna, F. et al. A parallel local search in CPU/GPU for scheduling independent tasks on large heterogeneous computing systems. J Supercomput 71, 648–672 (2015). https://doi.org/10.1007/s11227-014-1315-6
DOI: https://doi.org/10.1007/s11227-014-1315-6