Research article
DOI: 10.1145/3472456.3472474

FastPSO: Towards Efficient Swarm Intelligence Algorithm on GPUs

Published: 05 October 2021

ABSTRACT

Particle Swarm Optimization (PSO) has been widely used in various optimization tasks (e.g., neural architecture search and autonomous vehicle navigation), because it can solve non-convex optimization problems with simplicity and efficacy. However, the PSO algorithm is often time-consuming, especially for high-dimensional problems, which hinders its applicability in time-critical applications. In this paper, we propose novel techniques to accelerate the PSO algorithm with GPUs. To mitigate the efficiency bottleneck, we formally model PSO optimization as a process of element-wise operations on matrices. Based on this modeling, we develop an efficient GPU algorithm that performs the element-wise operations in a massively parallel fashion using tensor cores and shared memory. Moreover, we propose a series of novel techniques to improve the algorithm, including (i) GPU resource-aware thread creation to avoid creating too many threads when the number of particles or dimensions is large; (ii) parallel techniques to initialize swarm particles with fast random number generation; (iii) exploiting GPU memory caching to manage swarm information instead of allocating new memory; and (iv) a schema to support customized swarm evaluation functions. We conduct extensive experiments on four optimization applications to study the efficiency of our algorithm, called “FastPSO”. Experimental results show that FastPSO consistently outperforms existing CPU-based PSO libraries by two orders of magnitude and surpasses the existing GPU-based implementation by 5 to 7 times, while achieving better or competitive optimization results.
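The element-wise matrix formulation the abstract refers to can be illustrated with the classic PSO update, in which every term is a Hadamard product or sum over whole matrices. Below is a minimal CPU-side NumPy sketch of that formulation, not the paper's GPU implementation; the hyperparameters, swarm size, and toy sphere objective are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_particles, n_dims = 64, 8
w, c1, c2 = 0.7, 1.5, 1.5        # inertia and acceleration coefficients (assumed values)

def sphere(X):
    # Toy objective f(x) = sum_i x_i^2, evaluated row-wise for all particles.
    return (X ** 2).sum(axis=1)

X = rng.uniform(-5.0, 5.0, (n_particles, n_dims))   # positions, one particle per row
V = np.zeros_like(X)                                 # velocities
P = X.copy()                                         # personal-best positions
p_val = sphere(P)                                    # personal-best values
g = P[p_val.argmin()].copy()                         # global-best position

for _ in range(200):
    r1 = rng.random(X.shape)
    r2 = rng.random(X.shape)
    # The entire update is element-wise on matrices: Hadamard products
    # and additions, which is what maps naturally onto GPU hardware.
    V = w * V + c1 * r1 * (P - X) + c2 * r2 * (g - X)
    X = X + V
    f = sphere(X)
    improved = f < p_val
    P[improved], p_val[improved] = X[improved], f[improved]
    g = P[p_val.argmin()].copy()

print(sphere(g[None])[0])   # best value found; close to the optimum 0 on this toy problem
```

Because the update touches every (particle, dimension) entry independently, each entry can be assigned to a GPU thread, which is the kind of mapping the paper's techniques (thread creation, shared memory, tensor cores) optimize.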


Published in

ICPP '21: Proceedings of the 50th International Conference on Parallel Processing
August 2021, 927 pages
ISBN: 9781450390682
DOI: 10.1145/3472456

Copyright © 2021 ACM


Publisher: Association for Computing Machinery, New York, NY, United States



Qualifiers: research-article, refereed limited

Acceptance rate: 91 of 313 submissions, 29%
