ABSTRACT
Particle Swarm Optimization (PSO) has been widely used in various optimization tasks (e.g., neural architecture search and autonomous vehicle navigation), because it can solve non-convex optimization problems with simplicity and efficacy. However, the PSO algorithm is often time-consuming, especially for high-dimensional problems, which hinders its applicability in time-critical applications. In this paper, we propose novel techniques to accelerate the PSO algorithm with GPUs. To mitigate the efficiency bottleneck, we formally model the PSO optimization as a process of element-wise operations on matrices. Based on this modeling, we develop an efficient GPU algorithm that performs the element-wise operations in a massively parallel fashion using the tensor cores and shared memory. Moreover, we propose a series of novel techniques to improve our algorithm, including (i) GPU resource-aware thread creation, which avoids creating too many threads when the number of particles/dimensions is large; (ii) parallel initialization of swarm particles with fast random number generation; (iii) GPU memory caching to manage swarm information instead of allocating new memory; and (iv) a schema to support customized swarm evaluation functions. We conduct extensive experiments on four optimization applications to study the efficiency of our algorithm, called "FastPSO". Experimental results show that FastPSO consistently outperforms the existing CPU-based PSO libraries by two orders of magnitude and surpasses the existing GPU-based implementation by 5 to 7 times, while achieving better or competitive optimization results.
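The element-wise matrix formulation at the core of the approach can be illustrated with a minimal sketch. The CUDA kernels, tensor-core usage, and library interface of FastPSO are not reproduced here; the code below is only a NumPy illustration of how the standard PSO velocity and position updates reduce to element-wise (Hadamard) operations on particle matrices, using the sphere function as a stand-in objective and hyperparameter values (w, c1, c2) chosen for illustration.

```python
import numpy as np

def sphere(x):
    # Objective f(x) = sum_i x_i^2; minimum 0 at the origin.
    return np.sum(x * x, axis=1)

def pso(f, n_particles=64, n_dims=8, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, (n_particles, n_dims))  # particle positions
    v = np.zeros_like(x)                               # particle velocities
    pbest, pbest_val = x.copy(), f(x)                  # per-particle bests
    g = pbest[np.argmin(pbest_val)]                    # global best position
    for _ in range(iters):
        r1 = rng.random((n_particles, n_dims))
        r2 = rng.random((n_particles, n_dims))
        # Both updates are purely element-wise on (n_particles x n_dims)
        # matrices, which is what makes them amenable to GPU parallelism.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        val = f(x)
        improved = val < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], val[improved]
        g = pbest[np.argmin(pbest_val)]
    return g, pbest_val.min()

best_pos, best_val = pso(sphere)
```

In this formulation every entry of the velocity matrix is updated independently, so on a GPU each thread (or tensor-core tile) can own a block of the particle/dimension grid with no cross-thread dependencies inside an iteration.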