A GPU-Accelerated Parallel Preconditioner for the Solution of the Boltzmann Transport Equation for Semiconductors

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7174)

Abstract

Large systems of linear equations are typically solved by iterative methods. The rate of convergence of these methods can be substantially improved by preconditioners, which can either be applied to the linear system in a black-box fashion or exploit properties specific to the underlying problem for maximum efficiency. However, with the shift towards multi- and many-core computing architectures, designing sufficiently parallel preconditioners is increasingly challenging.

This work presents a parallel preconditioning scheme for a state-of-the-art semiconductor device simulator that accelerates the iterative solution of the resulting system of linear equations. The method is based on physical properties of the underlying system of partial differential equations and results in a block preconditioner scheme in which each block can be computed in parallel by established serial preconditioners. The efficiency of the proposed scheme is confirmed by numerical experiments with a serial incomplete LU factorization preconditioner, which the proposed scheme accelerates by one order of magnitude on both multi-core central processing units and graphics processing units.
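The general idea of a block preconditioner whose blocks are handled independently can be illustrated with a minimal sketch. It is not the scheme from the chapter: the block partition (`block_ranges`) is hypothetical rather than derived from the physics, each block is solved exactly by dense Gaussian elimination instead of an incomplete LU factorization, the outer iteration is a simple preconditioned Richardson iteration, and a thread pool stands in for multi-core or GPU execution. All function names are assumptions made for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def solve_dense(block, rhs):
    """Solve block * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(block)]  # augmented copy
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(a[i][k]))  # pivot row
        a[k], a[p] = a[p], a[k]
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n + 1):
                a[i][j] -= f * a[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x

def apply_block_preconditioner(A, r, block_ranges, pool):
    """Return z where each diagonal block of A is solved against its part of r.

    The blocks do not depend on each other, so they are submitted to the
    pool as independent tasks (hypothetical stand-in for per-block ILU).
    """
    def solve_one(rng):
        lo, hi = rng
        block = [row[lo:hi] for row in A[lo:hi]]
        return solve_dense(block, r[lo:hi])
    z = []
    for part in pool.map(solve_one, block_ranges):  # map preserves block order
        z.extend(part)
    return z

def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def preconditioned_richardson(A, b, block_ranges, tol=1e-10, max_iter=200):
    """Iterate x <- x + M^{-1} (b - A x) with M the block-diagonal part of A."""
    x = [0.0] * len(b)
    with ThreadPoolExecutor() as pool:
        for it in range(max_iter):
            r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
            if max(abs(v) for v in r) < tol:
                return x, it
            z = apply_block_preconditioner(A, r, block_ranges, pool)
            x = [xi + zi for xi, zi in zip(x, z)]
    return x, max_iter

# Toy system: two strongly coupled 2x2 blocks with weak coupling between them.
A = [[4.0, 1.0, 0.1, 0.0],
     [1.0, 4.0, 0.0, 0.1],
     [0.1, 0.0, 4.0, 1.0],
     [0.0, 0.1, 1.0, 4.0]]
b = matvec(A, [1.0, 2.0, 3.0, 4.0])
x, iterations = preconditioned_richardson(A, b, [(0, 2), (2, 4)])
print(x, iterations)
```

The sketch converges quickly here only because the coupling between the two blocks is weak compared with the coupling inside them. In a production setting the exact block solve would be replaced by a serial preconditioner such as an incomplete LU factorization, and the Richardson loop by a Krylov method.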

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Rupp, K., Jüngel, A., Grasser, T. (2012). A GPU-Accelerated Parallel Preconditioner for the Solution of the Boltzmann Transport Equation for Semiconductors. In: Keller, R., Kramer, D., Weiss, JP. (eds) Facing the Multicore - Challenge II. Lecture Notes in Computer Science, vol 7174. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30397-5_13

  • DOI: https://doi.org/10.1007/978-3-642-30397-5_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30396-8

  • Online ISBN: 978-3-642-30397-5

  • eBook Packages: Computer Science, Computer Science (R0)
