Abstract
In this paper, we propose a new parallel optimization algorithm that combines ideas from the fields of metaheuristics and Systolic Computing. The algorithm, called Systolic Genetic Search (SGS), is designed to explicitly exploit the high degree of parallelism available in modern Graphics Processing Unit (GPU) architectures. In SGS, solutions circulate synchronously through a grid of processing cells. Each cell applies adapted evolutionary operators to its inputs to compute its outputs, which are then ejected from the cell and continue moving through the grid. Four variants of SGS are experimentally studied on two classical benchmark problems and a real-world application. An extensive experimental analysis, which considered several instances of each problem, shows that three of the SGS variants are highly effective: they obtain the optimal solution in almost every execution for the instances and problems studied, and they outperform a Random Search (used as a sanity check) and two Genetic Algorithms. The GPU implementation of the proposed algorithm achieves high performance, with runtime reductions with respect to the sequential implementation that, depending on the instance considered, reach around a hundred times. It also exhibits good scalability when solving high-dimensional problem instances.
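To make the circulation of solutions concrete, the following is a minimal sequential sketch in Python, not the GPU implementation described in the paper. Several details are assumptions made only for illustration: the toroidal wrap-around at the grid borders, the OneMax toy fitness, one-point crossover with bit-flip mutation, the elitist replacement rule, and all function names (`one_max`, `cell`, `systolic_step`, `run`).

```python
import random

def one_max(bits):
    """Toy fitness (an assumption): number of ones in the bit string."""
    return sum(bits)

def cell(h_in, v_in, rng, p_mut=0.05):
    """One processing cell: recombine the two inputs, mutate, emit two outputs."""
    cut = rng.randrange(1, len(h_in))              # one-point crossover
    children = (h_in[:cut] + v_in[cut:], v_in[:cut] + h_in[cut:])
    outputs = []
    for parent, child in zip((h_in, v_in), children):
        child = [b ^ (rng.random() < p_mut) for b in child]  # bit-flip mutation
        # Elitist replacement (an assumption): never emit a worse solution.
        outputs.append(child if one_max(child) >= one_max(parent) else parent)
    return outputs[0], outputs[1]

def systolic_step(horiz, vert, rng):
    """All cells consume their inputs; outputs move to the neighbouring cells."""
    rows, cols = len(horiz), len(horiz[0])
    new_h = [[None] * cols for _ in range(rows)]
    new_v = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            h_out, v_out = cell(horiz[i][j], vert[i][j], rng)
            new_h[i][(j + 1) % cols] = h_out       # horizontal flow moves right
            new_v[(i + 1) % rows][j] = v_out       # vertical flow moves down
    return new_h, new_v

def best_fitness(horiz, vert):
    return max(one_max(s) for grid in (horiz, vert) for row in grid for s in row)

def run(rows=4, cols=4, n_bits=16, steps=50, seed=1):
    """Return (best fitness before, best fitness after) for a small grid."""
    rng = random.Random(seed)
    def random_solution():
        return [rng.randint(0, 1) for _ in range(n_bits)]
    horiz = [[random_solution() for _ in range(cols)] for _ in range(rows)]
    vert = [[random_solution() for _ in range(cols)] for _ in range(rows)]
    before = best_fitness(horiz, vert)
    for _ in range(steps):
        horiz, vert = systolic_step(horiz, vert, rng)
    return before, best_fitness(horiz, vert)
```

Because each cell in this sketch never emits a solution worse than the one it received on the same flow, the best fitness in the grid cannot decrease from one step to the next. On a GPU, the two nested loops in `systolic_step` would correspond to cells processed in parallel.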
Notes
The B stands for Both flows.
The random number generation on the CPU guarantees that, using the same seed, the results obtained by a stochastic algorithm in a CPU and in a GPU are the same.
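The reproducibility argument in this note can be illustrated with a small Python sketch. It assumes, hypothetically, that all random numbers are drawn in advance from one seeded host-side generator and that each position of a solution always consumes the number stored at its own index. The function names and the bit-flip mutation operator are illustrative choices, not taken from the paper.

```python
import random

def host_random_numbers(seed, count):
    """Draw every random number up front from a single seeded generator."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]

def mutate_sequential(bits, randoms, p_mut):
    """CPU-style loop: positions are visited in increasing order."""
    return [b ^ 1 if randoms[k] < p_mut else b for k, b in enumerate(bits)]

def mutate_by_index(bits, randoms, p_mut, order):
    """Stand-in for a parallel kernel: positions may be visited in any order,
    but position k always consumes randoms[k]."""
    out = list(bits)
    for k in order:
        if randoms[k] < p_mut:
            out[k] ^= 1
    return out
```

Since the outcome at each position depends only on its own precomputed number, the order in which positions are processed does not affect the result. With the same seed, both functions return the same mutated solution.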
The two solutions are read from the first memory space of the GPU global memory, one from the array that stores the solutions moving horizontally and the other from the array that stores the solutions moving vertically.
The two solutions are written to the second memory space of the GPU global memory, one to the array that stores the solutions moving horizontally and the other to the array that stores the solutions moving vertically.
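The read and write scheme described in these two notes resembles double buffering. The Python sketch below illustrates it under stated assumptions: the two memory spaces are modelled as dictionaries holding one flat array per flow, the roles of the two spaces are swapped after every step, the grid wraps around at its borders, and the cell operators are replaced by a pass-through placeholder. None of these details are confirmed by the notes.

```python
def make_space(n_cells):
    """One memory space: an array per flow, indexed by flattened cell position."""
    return {"horizontal": [None] * n_cells, "vertical": [None] * n_cells}

def step(src, dst, rows, cols):
    """Read both inputs of every cell from src, write both outputs to dst."""
    for i in range(rows):
        for j in range(cols):
            k = i * cols + j
            h_in = src["horizontal"][k]
            v_in = src["vertical"][k]
            h_out, v_out = h_in, v_in   # placeholder for the cell's operators
            dst["horizontal"][i * cols + (j + 1) % cols] = h_out    # move right
            dst["vertical"][((i + 1) % rows) * cols + j] = v_out    # move down

def circulate(rows, cols, steps):
    """Run several steps, swapping the roles of the two spaces after each one."""
    n_cells = rows * cols
    first = {
        "horizontal": [f"h{k}" for k in range(n_cells)],
        "vertical": [f"v{k}" for k in range(n_cells)],
    }
    second = make_space(n_cells)
    for _ in range(steps):
        step(first, second, rows, cols)
        first, second = second, first   # the written space becomes the read space
    return first
```

Keeping reads and writes in separate spaces means that no cell can observe a solution already overwritten by a neighbour within the same step, which is what makes it safe to process all cells concurrently.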
We made this decision, rather than making each thread calculate these values redundantly, in order to reduce the number of registers used by the block.
Acknowledgments
Martín Pedemonte acknowledges support from Programa de Desarrollo de las Ciencias Básicas, Universidad de la República, and Agencia Nacional de Investigación e Innovación, Uruguay. Francisco Luna and Enrique Alba acknowledge partial support from the Spanish Ministry of Economy and Competitiveness and FEDER under contract TIN2011-28194. Francisco Luna also acknowledges partial support from TIN2011-28336. The authors would like to thank M.Sc. Leonella Luzardo for her valuable comments and suggestions to improve the description of the biological phenomenon that inspires Systolic Computing and systolic-based metaheuristics. The authors would also like to thank the anonymous reviewers for their insightful and constructive suggestions.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Pedemonte, M., Luna, F. & Alba, E. Systolic genetic search, a systolic computing-based metaheuristic. Soft Comput 19, 1779–1801 (2015). https://doi.org/10.1007/s00500-014-1363-0
DOI: https://doi.org/10.1007/s00500-014-1363-0