DOI: 10.1145/3126908.3126961
research-article

Egeria: a framework for automatic synthesis of HPC advising tools through multi-layered natural language processing

Published: 12 November 2017

ABSTRACT

Achieving high performance on modern systems is challenging. Even with a detailed profile from a performance tool, writing or refactoring a program to remove its performance issues is still a daunting task for application programmers: it demands extensive program optimization expertise that is often system-specific.

Vendors often provide some detailed optimization guides to assist programmers in the process. However, these guides are frequently hundreds of pages long, making it difficult for application programmers to master and memorize all the rules and guidelines and properly apply them to a specific problem instance.

In this work, we develop a framework named Egeria to alleviate this difficulty. Through Egeria, one can easily construct an advising tool for a given high performance computing (HPC) domain (e.g., GPU programming) by providing Egeria with an optimization guide or other related documents for the target domain. An advising tool produced by Egeria provides a concise list of essential rules automatically extracted from the documents. At the same time, the advising tool serves as a question-answer agent that can interactively offer suggestions for specific optimization questions. Egeria is made possible through a distinctive multi-layered design that leverages natural language processing techniques and extends them with knowledge of HPC domains and of how to extract information relevant to code optimization. Experiments on CUDA, OpenCL, and Xeon Phi programming guides demonstrate, both qualitatively and quantitatively, the usefulness of Egeria for HPC.
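
To make the two roles of an advising tool concrete (a short list of extracted rules, and answers to specific optimization questions), here is a toy sketch. It is not Egeria's pipeline, which the abstract describes only as a multi-layered NLP design. The sketch swaps in two simple stand-ins: a cue-word filter to pick out advising sentences, and TF-IDF cosine similarity to match a question to a rule. The cue-word list, the sample guide text, and all function names are assumptions made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical cue words that often signal advice (an assumption, not from the paper).
ADVICE_CUES = {"should", "avoid", "prefer", "recommended", "use", "minimize", "maximize"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def is_advice(sentence):
    """Toy stand-in for rule extraction: keep sentences containing a cue word."""
    return any(tok in ADVICE_CUES for tok in tokenize(sentence))


def build_index(sentences):
    """Return smoothed IDF weights and one TF-IDF vector (dict) per sentence."""
    docs = [Counter(tokenize(s)) for s in sentences]
    n = len(docs)
    df = Counter(term for d in docs for term in d)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in d.items()} for d in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def advise(query, sentences, idf, vectors, top_k=1):
    """Return the top_k rules most similar to the query."""
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    order = sorted(range(len(sentences)),
                   key=lambda i: cosine(q, vectors[i]), reverse=True)
    return [sentences[i] for i in order[:top_k]]


# Made-up guide text for illustration only.
GUIDE = (
    "GPUs have many cores. "
    "You should coalesce global memory accesses. "
    "Avoid divergent branches within a warp. "
    "The guide was revised last year. "
    "Use shared memory to reduce global memory traffic."
)

rules = [s for s in split_sentences(GUIDE) if is_advice(s)]
idf, vectors = build_index(rules)

if __name__ == "__main__":
    print(rules)
    print(advise("How do I handle divergent branches?", rules, idf, vectors))
```

A real system would need far more than cue words to decide what counts as advice; the sketch only shows how one set of extracted sentences can serve both as the rule list and as the candidate pool for answering questions.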

Published in

SC '17: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2017
801 pages
ISBN: 9781450351140
DOI: 10.1145/3126908
General Chair: Bernd Mohr
Program Chair: Padma Raghavan

Copyright © 2017 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

SC '17 paper acceptance rate: 61 of 327 submissions (19%)
Overall acceptance rate: 1,516 of 6,373 submissions (24%)
