Egeria: a framework for automatic synthesis of HPC advising tools through multi-layered natural language processing

Authors:
Hui Guan, North Carolina State University
Xipeng Shen, North Carolina State University
Hamid Krim, North Carolina State University
SC '17: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, November 2017, Article No.: 10, Pages 1–14. https://doi.org/10.1145/3126908.3126961

Published: 12 November 2017

ABSTRACT

Achieving high performance on modern systems is challenging. Even with a detailed profile from a performance tool, writing or refactoring a program to remove its performance issues remains a daunting task for application programmers: it demands considerable program optimization expertise, much of it system-specific.

Vendors often provide some detailed optimization guides to assist programmers in the process. However, these guides are frequently hundreds of pages long, making it difficult for application programmers to master and memorize all the rules and guidelines and properly apply them to a specific problem instance.

In this work, we develop a framework named Egeria to alleviate the difficulty. Through Egeria, one can easily construct an advising tool for a certain high performance computing (HPC) domain (e.g., GPU programming) by providing Egeria with an optimization guide or other related documents for the target domain. An advising tool produced by Egeria provides a concise list of essential rules automatically extracted from the documents. At the same time, the advising tool serves as a question-answer agent that can interactively offer suggestions for specific optimization questions. Egeria is made possible through a distinctive multi-layered design that leverages natural language processing techniques and extends them with knowledge of HPC domains and of how to extract information relevant to code optimization. Experiments on CUDA, OpenCL, and Xeon Phi programming guides demonstrate, both qualitatively and quantitatively, the usefulness of Egeria for HPC.
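To make the two roles of an advising tool concrete (a short list of extracted rules, plus answers to specific questions), the following is a minimal illustrative sketch in Python. It is not Egeria's multi-layered design, which the abstract does not detail. It assumes a much simpler stand-in: a hypothetical cue-word filter to pick out advising sentences, and TF-IDF cosine similarity to match a question against them. The cue list, the sample guide text, and all function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical cue words that often signal advice (an assumption, not from the paper).
ADVICE_CUES = {"should", "avoid", "prefer", "recommend", "recommended",
               "ensure", "minimize", "maximize", "use"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def is_advice(sentence):
    """Treat a sentence as advice if it contains any cue word as a whole token."""
    return any(tok in ADVICE_CUES for tok in tokenize(sentence))


def build_index(sentences):
    """Return per-sentence term counts and a smoothed IDF table."""
    docs = [Counter(tokenize(s)) for s in sentences]
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return docs, idf


def vectorize(counts, idf):
    """Weight term counts by IDF, ignoring terms unseen in the indexed sentences."""
    return {t: tf * idf[t] for t, tf in counts.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def answer(query, sentences, top_k=1):
    """Return up to top_k sentences most similar to the query (score > 0 only)."""
    docs, idf = build_index(sentences)
    q = vectorize(Counter(tokenize(query)), idf)
    scored = [(cosine(q, vectorize(d, idf)), s) for d, s in zip(docs, sentences)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for score, s in scored[:top_k] if score > 0]


# Made-up guide excerpt used only to exercise the sketch.
GUIDE = (
    "Global memory accesses are slow. "
    "Threads in a warp should access consecutive addresses to coalesce global memory loads. "
    "Shared memory is divided into banks. "
    "Avoid divergent branches within a warp."
)

rules = [s for s in split_sentences(GUIDE) if is_advice(s)]
print(rules)
print(answer("Why are my global memory loads slow?", rules))
```

A purely lexical filter like this would miss advice phrased without a cue word and would match on surface forms only (no stemming or synonyms), which is the kind of gap a design combining several NLP layers with domain knowledge is presumably meant to close.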

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. General and reference
  1. Cross-computing tools and techniques
    1. Performance

Published in
SC '17: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2017
801 pages
ISBN: 9781450351140
DOI: 10.1145/3126908
General Chair: Bernd Mohr, Jülich Supercomputing Center, Jülich, Germany
Program Chair: Padma Raghavan, Vanderbilt University, Nashville, TN
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
high performance computing
natural language processing
program optimization
Qualifiers
- research-article
Acceptance Rates
SC '17 Paper Acceptance Rate: 61 of 327 submissions, 19%
Overall Acceptance Rate: 1,516 of 6,373 submissions, 24%
Article Metrics
- Total Citations: 6
- Total Downloads: 400
- Downloads (Last 12 months): 40
- Downloads (Last 6 weeks): 5