Optimization of N-Queens Solvers on Graphics Processors

Zhang, Tao; Shu, Wei; Wu, Min-You

doi:10.1007/978-3-642-24151-2_11

Tao Zhang¹⁹,²⁰, Wei Shu²⁰ & Min-You Wu¹⁹

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6965)

Included in the following conference series:

International Workshop on Advanced Parallel Processing Technologies

Abstract

While graphics processing units (GPUs) show high performance for problems with regular structures, they perform poorly on irregular tasks because irregular problem structures are a poor match for SIMD-like GPU architectures. In this paper, we explore software approaches for improving the performance of irregular parallel computation on graphics processors. We propose general approaches that eliminate branch divergence and enable runtime load balancing. We evaluate the optimization rules and approaches using the n-queens problem as a benchmark. The experimental results show that the proposed approaches can substantially improve the performance of irregular computation on GPUs. These general approaches could readily be applied to many other irregular problems to improve their performance.
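
To make the notion of an irregular workload concrete, the following is a minimal CPU-side sketch in Python. It is not the authors' GPU implementation; the function names count_solutions and per_task_work are hypothetical, and the decomposition (one task per first-row column) is an assumption chosen only for illustration. It uses a standard bitmask backtracking search and reports, for each task, the number of solutions found and the number of search-tree nodes visited.

```python
def count_solutions(n, cols=0, diag1=0, diag2=0, row=0):
    """Return (solutions, nodes_visited) for the subtree below a partial placement.

    cols, diag1 and diag2 are bitmasks of columns attacked in the current row
    by queens already placed in earlier rows.
    """
    if row == n:
        return 1, 1
    full = (1 << n) - 1
    free = full & ~(cols | diag1 | diag2)  # columns still safe in this row
    solutions, nodes = 0, 1
    while free:
        bit = free & -free  # lowest safe column
        free ^= bit
        s, v = count_solutions(
            n,
            cols | bit,
            ((diag1 | bit) << 1) & full,  # one diagonal shifts left per row
            (diag2 | bit) >> 1,           # the other shifts right per row
            row + 1,
        )
        solutions += s
        nodes += v
    return solutions, nodes


def per_task_work(n):
    """Split the search into n tasks, one per first-row column (illustrative only)."""
    full = (1 << n) - 1
    tasks = []
    for c in range(n):
        bit = 1 << c
        tasks.append(count_solutions(n, bit, (bit << 1) & full, bit >> 1, 1))
    return tasks


if __name__ == "__main__":
    for column, (solutions, nodes) in enumerate(per_task_work(8)):
        print(f"first-row column {column}: {solutions} solutions, {nodes} nodes")
```

For n = 8 the per-task solution counts are 4, 8, 16, 18, 18, 16, 8, 4 (92 in total), so tasks that look identical at launch do different amounts of work. If such tasks were mapped one per thread, threads in the same SIMD group would take different branches and finish at different times, which is the kind of mismatch the abstract describes.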

Author information

Authors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Tao Zhang & Min-You Wu
University of New Mexico, Albuquerque, USA
Tao Zhang & Wei Shu

Editor information

Editors and Affiliations

INRIA Saclay, Parc Club Universite, rue Jean Rostand, Batiment G, 91893, Orsay Cedex, France
Olivier Temam
Department of Computer Science and Engineering, University of Minnesota, 200 Union Street, SE, 55455, Minneapolis, MN, USA
Pen-Chung Yew
Fudan University, Software Building, 825 Zhangheng Road, 200433, Shanghai, China
Binyu Zang

Cite this paper

Zhang, T., Shu, W., Wu, MY. (2011). Optimization of N-Queens Solvers on Graphics Processors. In: Temam, O., Yew, PC., Zang, B. (eds) Advanced Parallel Processing Technologies. APPT 2011. Lecture Notes in Computer Science, vol 6965. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24151-2_11

DOI: https://doi.org/10.1007/978-3-642-24151-2_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24150-5
Online ISBN: 978-3-642-24151-2
eBook Packages: Computer Science, Computer Science (R0)
