Abstract
With the introduction of task dependences, the OpenMP API considerably extended the expressiveness of its task-based parallel programming model. With task dependences, programmers no longer have to rely on global synchronization mechanisms like task barriers. Instead, they can locally synchronize a restricted subset of generated tasks by expressing an execution order through the depend clause. With the OpenMP tools interface of Technical Report 6 of the OpenMP API specification, it becomes possible to monitor task creation and execution along with the corresponding dependence information of these tasks. We use this information to construct a Task Dependence Graph (TDG) for the Flow Graph Analyzer (FGA) tool of Intel® Advisor. FGA uses the TDG representation to derive metrics and to predict and analyze the performance of task-based OpenMP codes. We apply the FGA tool to two sample application kernels and expose issues in their usage of OpenMP tasks.
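To make the role of the depend clause concrete, the following is a minimal sketch and is not taken from the paper's kernels. The function name `run_pipeline` and the variables `x`, `y`, `z`, and `sum` are hypothetical and chosen only for illustration. Four tasks are ordered purely through `depend(in: ...)` and `depend(out: ...)` clauses, with no task barrier between them. The task that writes `z` has no dependence on the `x` to `y` chain, so a runtime is free to run it concurrently with that chain.

```c
#include <assert.h>

/* Hypothetical helper: runs four tasks ordered only by depend clauses. */
int run_pipeline(void)
{
    int x = 0, y = 0, z = 0, sum = 0;

    #pragma omp parallel
    #pragma omp single
    {
        /* Producer: writes x. */
        #pragma omp task depend(out: x) shared(x)
        x = 1;

        /* Runs after the producer because it reads x. */
        #pragma omp task depend(in: x) depend(out: y) shared(x, y)
        y = x + 1;

        /* Independent of the x -> y chain; may overlap with it. */
        #pragma omp task depend(out: z) shared(z)
        z = 10;

        /* Joins both branches: waits only for the writers of y and z. */
        #pragma omp task depend(in: y, z) depend(out: sum) shared(y, z, sum)
        sum = y + z;
    }
    /* The implicit barrier at the end of the parallel region
       guarantees that all four tasks have completed here. */

    assert(y == x + 1); /* sanity check: the chain ran in order */
    return sum;
}
```

With GCC or Clang the sketch is compiled with `-fopenmp`. Without that flag the pragmas are ignored and the statements run serially in program order, which yields the same result. The edges implied by the clauses (x to y, y to sum, z to sum) are the kind of information a task dependence graph captures.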
Notes
1. For improved contrast and readability of the dependence graphs, we modified the FGA tool for this paper to use a white background instead of its standard gray.
Acknowledgments
We would like to thank Hansang Bae, Johnny Peyton, and Terry Wilmarth for helping us better understand OMPT, and providing useful suggestions.
Intel and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
* Other names and brands are the property of their respective owners.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance. Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Agrawal, V., Voss, M.J., Reble, P., Tovinkere, V., Hammond, J., Klemm, M. (2018). Visualization of OpenMP* Task Dependencies Using Intel® Advisor – Flow Graph Analyzer. In: de Supinski, B., Valero-Lara, P., Martorell, X., Mateo Bellido, S., Labarta, J. (eds) Evolving OpenMP for Evolving Architectures. IWOMP 2018. Lecture Notes in Computer Science, vol 11128. Springer, Cham. https://doi.org/10.1007/978-3-319-98521-3_12
DOI: https://doi.org/10.1007/978-3-319-98521-3_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98520-6
Online ISBN: 978-3-319-98521-3
eBook Packages: Computer Science (R0)