DOI: 10.1145/2535753.2535754
research-article

A novel finite element method assembler for co-processors and accelerators

Published: 17 November 2013

ABSTRACT

The finite element method (FEM) is a popular approach to solving differential equations [5]. Among its many attractive features is its ability to handle complex geometries: the domain is discretised using simple elements whose local contributions are assembled into a global system of equations. This is in contrast to the finite difference method (FDM), which can typically handle only regular geometries. However, before a solution is possible, the FEM system of equations has to be assembled, a procedure that can significantly affect the computational performance of the FEM solver, particularly when coupled with highly parallel execution [3]. In this work we outline a new algorithm for a highly parallel assembly routine compatible with Intel® Xeon Phi and GPU architectures. We also present a performance comparison and analysis of our algorithm and the globalNZ algorithm outlined by Cecka et al. [2], as implemented on the Intel® Xeon Phi architecture, and compare these to the serial implementation of Hughes [5].
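To make the assembly step concrete, the following is a minimal serial sketch of how local element contributions can be scatter-added into a global matrix. It is not the algorithm proposed in this paper, nor the globalNZ algorithm of [2]. It assumes a 1D Poisson-type problem with linear two-node elements on a uniform mesh, and the names `element_stiffness` and `assemble` are hypothetical, chosen only for illustration.

```python
def element_stiffness(h):
    """Local 2x2 stiffness matrix of a linear 1D element of length h (assumed model problem)."""
    k = 1.0 / h
    return [[k, -k], [-k, k]]


def assemble(num_elements, length=1.0):
    """Serially scatter-add element matrices into a dense global matrix."""
    num_nodes = num_elements + 1
    h = length / num_elements
    K = [[0.0] * num_nodes for _ in range(num_nodes)]
    for e in range(num_elements):
        dofs = (e, e + 1)  # global node numbers of element e
        ke = element_stiffness(h)
        for a, i in enumerate(dofs):
            for b, j in enumerate(dofs):
                # Neighbouring elements both add into entries of shared nodes;
                # a parallel assembler has to protect or reorganise this update.
                K[i][j] += ke[a][b]
    return K


K = assemble(4)
for row in K:
    print(row)
```

The accumulation `K[i][j] += ...` is the point where concurrent element updates would conflict, which is why assembly needs a dedicated strategy on highly parallel hardware. A dense matrix is used here only for readability; a practical assembler would normally target a sparse format.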

References

  1. L. Buatois, G. Caumon, and B. Lévy. Concurrent number cruncher: a GPU implementation of a general sparse linear solver. International Journal of Parallel, Emergent and Distributed Systems, 24(3): 205–223, 2009.
  2. C. Cecka, A. J. Lew, and E. Darve. Assembly of finite element methods on graphics processors. International Journal for Numerical Methods in Engineering, 85(5): 640–669, 2011.
  3. C. Cecka, A. J. Lew, and E. Darve. GPU Computing Gems Jade Edition, chapter Application of Assembly of Finite Element Methods on Graphics Processors for Real-Time Elastodynamics. Applications of GPU Computing Series. Elsevier Science, 2011.
  4. R. L. Graham. Bounds on multiprocessing timing anomalies. SIAM Journal on Applied Mathematics, 17(2): 416–429, 1969.
  5. T. Hughes. The finite element method: linear static and dynamic finite element analysis. Dover Civil and Mechanical Engineering Series. Dover Publications, 2000.
  6. J. Jeffers and J. Reinders. Intel Xeon Phi Coprocessor High Performance Programming. Elsevier Science, 2013.
  7. S. Rao. The Finite Element Method in Engineering. Elsevier Science, 2010.
  8. N. Satish, C. Kim, J. Chhugani, A. D. Nguyen, V. W. Lee, D. Kim, and P. Dubey. Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, SIGMOD '10, pages 351–362, New York, NY, USA, 2010. ACM.
  9. N. Satish, C. Kim, J. Chhugani, A. D. Nguyen, V. W. Lee, D. Kim, and P. Dubey. Fast sort on CPUs, GPUs and Intel MIC architectures. Technical report, Intel Labs, 2010.
  10. E. Saule, K. Kaya, and U. V. Catalyurek. Performance Evaluation of Sparse Matrix Multiplication Kernels on Intel Xeon Phi. ArXiv e-prints, Feb. 2013.
  11. I. Smith and D. Griffiths. Programming the Finite Element Method. Wiley, 2004.
  12. M. Wang, H. Klie, M. Parashar, and H. Sudan. Solving sparse linear systems on NVIDIA Tesla GPUs. In G. Allen, J. Nabrzyski, E. Seidel, G. Albada, J. Dongarra, and P. Sloot, editors, Computational Science - ICCS 2009, volume 5544 of Lecture Notes in Computer Science, pages 864–873. Springer Berlin Heidelberg, 2009.

  • Published in

    IA3 '13: Proceedings of the 3rd Workshop on Irregular Applications: Architectures and Algorithms
    November 2013
    92 pages
    ISBN:9781450325035
    DOI:10.1145/2535753

    Copyright © 2013 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Acceptance Rates

    IA3 '13 paper acceptance rate: 6 of 21 submissions, 29%. Overall acceptance rate: 18 of 67 submissions, 27%.