Optimizing Sweep3D for Graphic Processor Unit

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6081)

Abstract

As a powerful and flexible processor, the Graphics Processing Unit (GPU) offers great capability for many high-performance computing applications. Sweep3D, which deterministically simulates single-group, time-independent discrete ordinates (S_n) neutron transport on a 3D Cartesian geometry, represents the key part of a real ASCI application. The wavefront process that Sweep3D uses for parallel computation limits the number of concurrent threads on the GPU. In this paper, we present multi-dimensional optimization methods for Sweep3D that can be implemented efficiently on the fine-grained parallel architecture of the GPU. Our results show that the overall performance of Sweep3D on a CPU-GPU hybrid platform can be improved by up to 2.25 times compared with the CPU-based implementation.
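The concurrency limit mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `wavefront_sizes` and `toy_sweep`, the single sweep direction, and the toy recurrence are assumptions made only for illustration. In a wavefront sweep each cell depends on its upstream neighbours, so only the cells on one diagonal plane i + j + k = d can be updated at the same time, and the number of such cells is small at the start and end of the sweep.

```python
from collections import Counter
from itertools import product


def wavefront_sizes(nx, ny, nz):
    """Return the number of cells on each diagonal plane i + j + k = d.

    Cells on the same plane do not depend on each other, so each entry is
    the maximum number of cells that could be updated concurrently.
    """
    counts = Counter(i + j + k
                     for i, j, k in product(range(nx), range(ny), range(nz)))
    return [counts[d] for d in range(nx + ny + nz - 2)]


def toy_sweep(nx, ny, nz, source=1.0):
    """Sweep the grid plane by plane with a toy recurrence (an assumption,
    not the S_n kernel): each cell is the source plus its three upstream
    neighbours, with zero inflow at the boundary."""
    phi = {}
    for d in range(nx + ny + nz - 2):
        # Every cell visited inside this iteration of d is independent of
        # the others; on a GPU these would be the concurrent threads.
        for i in range(min(d, nx - 1) + 1):
            for j in range(min(d - i, ny - 1) + 1):
                k = d - i - j
                if k >= nz:
                    continue
                phi[(i, j, k)] = (source
                                  + phi.get((i - 1, j, k), 0.0)
                                  + phi.get((i, j - 1, k), 0.0)
                                  + phi.get((i, j, k - 1), 0.0))
    return phi


if __name__ == "__main__":
    print(wavefront_sizes(4, 4, 4))
```

For a 4 x 4 x 4 grid the sweep needs 10 sequential steps, and the first and last steps each contain a single cell, which is why a plain wavefront leaves most GPU threads idle for part of the sweep.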

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gong, C., Liu, J., Gong, Z., Qin, J., Xie, J. (2010). Optimizing Sweep3D for Graphic Processor Unit. In: Hsu, CH., Yang, L.T., Park, J.H., Yeo, SS. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2010. Lecture Notes in Computer Science, vol 6081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13119-6_36

  • DOI: https://doi.org/10.1007/978-3-642-13119-6_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13118-9

  • Online ISBN: 978-3-642-13119-6

  • eBook Packages: Computer Science, Computer Science (R0)
