short-paper

Optimizing codes on the Xeon Phi: a case-study with LAMMPS

Authors:

Adam Jundt,

Ananta Tiwari,

William A. Ward, Jr.,

Roy Campbell,

Laura Carrington

XSEDE '15: Proceedings of the 2015 XSEDE Conference: Scientific Advancements Enabled by Enhanced Cyberinfrastructure

Article No.: 28, Pages 1 - 2

https://doi.org/10.1145/2792745.2792773

Published: 26 July 2015

Abstract

Intel's Xeon Phi coprocessor has the potential to deliver an impressive 4 GFlops/Watt while promising users that they need only recompile their code to run it on the accelerator. This paper reports our experience running LAMMPS, a widely used molecular dynamics code, on the Xeon Phi and the steps we took to optimize its performance on the device. Using performance analysis tools to pinpoint bottlenecks in the code, we achieved a speedup of 2.8x when comparing the optimized code on the Xeon Phi against the original code on the host processors. These optimizations also improved LAMMPS performance on the host, speeding up execution by 7x.

Published In

XSEDE '15: Proceedings of the 2015 XSEDE Conference: Scientific Advancements Enabled by Enhanced Cyberinfrastructure

July 2015

296 pages

ISBN:9781450337205

DOI:10.1145/2792745

General Chair:
Gregory D. Peterson
National Institute of Computational Sciences

© 2015 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Funding Sources

Air Force Office of Scientific Research
U.S. Department of Defense
DoD High Performance Computing Modernization Program's PETTT program
Air Force Office of Scientific Research under AFOSR

Conference

XSEDE '15

Sponsor:

San Diego Super Computing Ctr
HPCWire
Omnibond
Indiana University
CASC
NICS
Intel
DDN
CORSA
ALLINEA
RENCI

XSEDE '15: Extreme Science and Engineering Discovery Environment 2015 Conference

July 26 - 30, 2015

St. Louis, Missouri

Acceptance Rates

XSEDE '15 Paper Acceptance Rate 49 of 70 submissions, 70%;

Overall Acceptance Rate 129 of 190 submissions, 68%
