Abstract
SX-Aurora TSUBASA (SX-AT) is a vector supercomputer equipped with Vector Engines (VEs). In addition to this new system architecture, SX-AT provides several execution modes for achieving high performance on real-world applications, which often consist of both vector-friendly and vector-unfriendly parts. Vector Engine Offloading (VEO) is a programming framework for offloading only the vector-friendly parts to VEs, and neoSYCL has been developed on top of VEO so that programmers can use the standard SYCL interface for offload programming on SX-AT. However, it is unclear how closely neoSYCL, being based on VEO, can conform to the SYCL standard, which is primarily based on OpenCL. This paper therefore discusses both the conformance of neoSYCL to the SYCL standard and its performance. Our thorough evaluation with SYCL-Bench kernels demonstrates that neoSYCL conforms to the SYCL standard except for OpenCL-related features. In addition, the runtime overhead of using the SYCL interface on top of VEO is negligible in most cases, allowing neoSYCL codes to achieve performance comparable to that of VEO codes.
Acknowledgement
This work is partially supported by MEXT Next Generation High-Performance Computing Infrastructures and Applications R&D Program “R&D of A Quantum-Annealing-Assisted Next Generation HPC Infrastructure and its Applications,” Grant-in-Aid for Scientific Research (A) #20H00593, and Grant-in-Aid for Scientific Research (B) #21H03449.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, J., Agung, M., Takizawa, H. (2022). Evaluating the Performance and Conformance of a SYCL Implementation for SX-Aurora TSUBASA. In: Shen, H., et al. Parallel and Distributed Computing, Applications and Technologies. PDCAT 2021. Lecture Notes in Computer Science, vol 13148. Springer, Cham. https://doi.org/10.1007/978-3-030-96772-7_4
DOI: https://doi.org/10.1007/978-3-030-96772-7_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96771-0
Online ISBN: 978-3-030-96772-7
eBook Packages: Computer Science, Computer Science (R0)