Visualization of OpenMP* Task Dependencies Using Intel® Advisor – Flow Graph Analyzer

  • Conference paper
Evolving OpenMP for Evolving Architectures (IWOMP 2018)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11128)

Abstract

With the introduction of task dependences, the OpenMP API considerably extended the expressiveness of its task-based parallel programming model. With task dependences, programmers no longer have to rely on global synchronization mechanisms like task barriers. Instead, they can locally synchronize a restricted subset of generated tasks by expressing an execution order through the depend clause. With the OpenMP tools interface of Technical Report 6 of the OpenMP API specification, it becomes possible to monitor task creation and execution along with the corresponding dependence information of these tasks. We use this information to construct a Task Dependence Graph (TDG) for the Flow Graph Analyzer (FGA) tool of Intel® Advisor. FGA uses the TDG representation to derive metrics and to support performance prediction and analysis of task-based OpenMP codes. We apply the FGA tool to two sample application kernels and expose issues in their usage of OpenMP tasks.
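As a minimal illustration of the local synchronization described in the abstract, the following C sketch orders four tasks purely through depend clauses, without a taskwait or other task barrier between them. The function name `run_task_chain` and the variables are hypothetical and are not taken from the paper or its sample kernels. The resulting dependence structure is a small graph of the kind a TDG would capture: the first two tasks form a chain through `a` and `b`, the third task is independent, and the fourth joins both branches.

```c
#include <stdio.h>

/* Hypothetical example, not from the paper: four tasks whose execution
 * order is expressed only through depend clauses.
 *
 *   T1 (out: a) --> T2 (in: a, out: b) --+
 *                                        +--> T4 (in: b, side; out: total)
 *   T3 (out: side) ----------------------+
 *
 * If compiled without OpenMP support, the pragmas are ignored and the
 * statements simply run in program order, producing the same result. */
int run_task_chain(int seed)
{
    int a = 0, b = 0, side = 0, total = 0;

    #pragma omp parallel
    #pragma omp single
    {
        /* T1: produces a. */
        #pragma omp task depend(out: a) shared(a)
        a = seed + 1;

        /* T2: must wait for T1 because it reads a. */
        #pragma omp task depend(in: a) depend(out: b) shared(a, b)
        b = a * 2;

        /* T3: shares no dependence with T1 or T2, so it may run concurrently. */
        #pragma omp task depend(out: side) shared(side)
        side = seed * seed;

        /* T4: joins both branches; runs only after T2 and T3 have finished. */
        #pragma omp task depend(in: b, side) depend(out: total) shared(b, side, total)
        total = b + side;
    }
    /* The barrier at the end of the parallel region guarantees that all
     * tasks have completed before total is read here. */
    return total;
}
```

Calling `run_task_chain(3)` from a `main` function yields 17 regardless of how the runtime schedules the tasks, because the only orderings that matter for the result are the ones stated in the depend clauses.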

Notes

  1. For improved contrast and readability of the dependence graphs, we have modified the FGA tool to use a white background instead of the standard gray color for this paper.

Acknowledgments

We would like to thank Hansang Bae, Johnny Peyton, and Terry Wilmarth for helping us better understand OMPT, and providing useful suggestions.

Intel and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

* Other names and brands are the property of their respective owners.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance. Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Author information

Corresponding author

Correspondence to Michael Klemm.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Agrawal, V., Voss, M.J., Reble, P., Tovinkere, V., Hammond, J., Klemm, M. (2018). Visualization of OpenMP* Task Dependencies Using Intel® Advisor – Flow Graph Analyzer. In: de Supinski, B., Valero-Lara, P., Martorell, X., Mateo Bellido, S., Labarta, J. (eds) Evolving OpenMP for Evolving Architectures. IWOMP 2018. Lecture Notes in Computer Science, vol 11128. Springer, Cham. https://doi.org/10.1007/978-3-319-98521-3_12

  • DOI: https://doi.org/10.1007/978-3-319-98521-3_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98520-6

  • Online ISBN: 978-3-319-98521-3

  • eBook Packages: Computer Science, Computer Science (R0)
