research-article

A novel finite element method assembler for co-processors and accelerators

Authors:
Nina Hanzlikova, Dublin City University, Dublin, Ireland
Eduardo Rocha Rodrigues, IBM Research, Rio de Janeiro, Brazil

IA³ '13: Proceedings of the 3rd Workshop on Irregular Applications: Architectures and Algorithms, November 2013, Article No.: 1, Pages 1–8, https://doi.org/10.1145/2535753.2535754

Published: 17 November 2013

ABSTRACT

The finite element method (FEM) is a popular approach to solving differential equations [5]. Among its many attractive features is its ability to handle complex geometries: the domain is discretised using simple elements whose local contributions are assembled into a global system of equations. This is in contrast to the finite difference method (FDM), which can typically only handle regular geometries. However, before a solution is possible, the FEM system of equations has to be assembled, a procedure which can significantly affect the computational performance of the FEM solver, particularly when coupled with highly parallel execution [3]. In this work we outline a new algorithm for achieving a highly parallel assembler routine compatible with Intel® Xeon Phi and GPU architectures. We also present a performance comparison and analysis of our algorithm and the globalNZ algorithm outlined by Cecka et al. [2], both implemented on the Intel® Xeon Phi architecture, and compare these to the serial implementation of Hughes [5].
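To make the assembly step concrete, the following is a minimal serial sketch. It is not the algorithm proposed in this paper, nor the globalNZ algorithm of [2]; it only illustrates how local element contributions are accumulated into a global matrix. The problem (a 1D Poisson equation with linear two-node elements), the dense list-of-lists storage, and the function name `assemble_1d_poisson` are all assumptions chosen for illustration.

```python
def assemble_1d_poisson(num_elements, length=1.0):
    """Serially assemble a global stiffness matrix for -u'' = f on [0, length].

    Illustrative assumptions: uniform mesh, linear two-node elements,
    dense storage, no boundary conditions applied.
    """
    num_nodes = num_elements + 1
    h = length / num_elements

    # Local stiffness matrix of one linear element: (1/h) * [[1, -1], [-1, 1]]
    k_local = [[1.0 / h, -1.0 / h],
               [-1.0 / h, 1.0 / h]]

    K = [[0.0] * num_nodes for _ in range(num_nodes)]
    for e in range(num_elements):
        nodes = (e, e + 1)  # local-to-global node map for element e
        for a in range(2):
            for b in range(2):
                # Scatter-add: neighbouring elements both write to the
                # entries of the node they share.
                K[nodes[a]][nodes[b]] += k_local[a][b]
    return K


K = assemble_1d_poisson(4)
for row in K:
    print(row)
```

Interior diagonal entries receive contributions from two neighbouring elements. In a parallel assembler these shared writes are the ones that need coordination, for example through atomic updates, element colouring, or a reordering of the work.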

References

[1] L. Buatois, G. Caumon, and B. Lévy. Concurrent number cruncher: a GPU implementation of a general sparse linear solver. International Journal of Parallel, Emergent and Distributed Systems, 24(3): 205–223, 2009.
[2] C. Cecka, A. J. Lew, and E. Darve. Assembly of finite element methods on graphics processors. International Journal for Numerical Methods in Engineering, 85(5): 640–669, 2011.
[3] C. Cecka, A. J. Lew, and E. Darve. GPU Computing Gems Jade Edition, chapter Application of Assembly of Finite Element Methods on Graphics Processors for Real-Time Elastodynamics. Applications of GPU Computing Series. Elsevier Science, 2011.
[4] R. L. Graham. Bounds on multiprocessing timing anomalies. SIAM Journal on Applied Mathematics, 17(2): 416–429, 1969.
[5] T. Hughes. The finite element method: linear static and dynamic finite element analysis. Dover Civil and Mechanical Engineering Series. Dover Publications, 2000.
[6] J. Jeffers and J. Reinders. Intel Xeon Phi Coprocessor High Performance Programming. Elsevier Science, 2013.
[7] S. Rao. The Finite Element Method in Engineering. Elsevier Science, 2010.
[8] N. Satish, C. Kim, J. Chhugani, A. D. Nguyen, V. W. Lee, D. Kim, and P. Dubey. Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, SIGMOD '10, pages 351–362, New York, NY, USA, 2010. ACM.
[9] N. Satish, C. Kim, J. Chhugani, A. D. Nguyen, V. W. Lee, D. Kim, and P. Dubey. Fast sort on CPUs, GPUs and Intel MIC architectures. Technical report, Intel Labs, 2010.
[10] E. Saule, K. Kaya, and U. V. Catalyurek. Performance Evaluation of Sparse Matrix Multiplication Kernels on Intel Xeon Phi. ArXiv e-prints, Feb. 2013.
[11] I. Smith and D. Griffiths. Programming the Finite Element Method. Wiley, 2004.
[12] M. Wang, H. Klie, M. Parashar, and H. Sudan. Solving sparse linear systems on NVIDIA Tesla GPUs. In G. Allen, J. Nabrzyski, E. Seidel, G. Albada, J. Dongarra, and P. Sloot, editors, Computational Science - ICCS 2009, volume 5544 of Lecture Notes in Computer Science, pages 864–873. Springer Berlin Heidelberg, 2009.

Published in
IA³ '13: Proceedings of the 3rd Workshop on Irregular Applications: Architectures and Algorithms
November 2013, 92 pages
ISBN: 9781450325035
DOI: 10.1145/2535753
Conference Chairs: Antonino Tumeo (PNNL), John Feo (PNNL), Oreste Villa (NVIDIA), Simone Secchi (Università di Cagliari, Italy)
Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags: FEM, GPU, MIC, Xeon Phi, algorithm, finite element method
Acceptance Rates
IA³ '13 Paper Acceptance Rate: 6 of 21 submissions, 29%. Overall Acceptance Rate: 18 of 67 submissions, 27%.