Abstract
OpenMP is the primary way to achieve thread-level parallelism on the Sunway high-performance multi-core processor. In compiled Sunway OpenMP programs, slow access to thread-private variables lowers parallel efficiency. To address this problem, this paper proposes a thread-private variable access technique based on privileged instructions. The technique implements thread-private variables entirely at the compiler level, which eliminates the mode-switching overhead of invoking the OS kernel and speeds up access to thread-private variables. On the Sunway 1621 server platform, NPB3.3-OMP and SPEC OMP2012 achieved running-efficiency gains of 6.2% and 6.8%, respectively. The results show that the proposed technique can help exploit the strengths of Sunway high-performance multi-core processors.
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kong, J., Nie, K., Zhou, Q., Xu, J., Han, L. (2021). Thread Private Variable Access Optimization Technique for Sunway High-Performance Multi-core Processors. In: Zeng, J., Qin, P., Jing, W., Song, X., Lu, Z. (eds) Data Science. ICPCSEE 2021. Communications in Computer and Information Science, vol 1451. Springer, Singapore. https://doi.org/10.1007/978-981-16-5940-9_14
DOI: https://doi.org/10.1007/978-981-16-5940-9_14
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-5939-3
Online ISBN: 978-981-16-5940-9
eBook Packages: Computer Science (R0)