Abstract
This paper applies the MPI-OpenMP-based two-dimensional Hopmoc method, implemented with the explicit work-sharing technique and a recently proposed mechanism that reduces implicit barriers in OpenMP. Specifically, the paper uses the numerical algorithm to obtain approximate solutions to the advection-diffusion equation. In addition, the paper splits the mesh used by the numerical method and distributes the resulting partitions to over-allocated (oversubscribed) threads. The mesh partitions become small enough that the approach reduces the cache miss rate, and consequently the strategy accelerates the numerical method on multicore systems. The paper then evaluates the implementation of the strategy under different metrics. Overall, the set of techniques improves the performance of the parallel numerical method.
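The combination of explicit work-sharing and thread oversubscription described above can be illustrated with a minimal sketch in C. It is not the authors' implementation: the function names `chunk_bounds` and `sweep` are hypothetical, the five-point averaging stencil is only a placeholder for the Hopmoc update, and MPI is omitted. Each thread computes its own row range from its thread id instead of relying on an OpenMP work-sharing loop, so requesting more threads than cores directly yields smaller mesh partitions per thread.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Half-open row range [*begin, *end) owned by thread `tid` when `nrows`
   rows are split as evenly as possible among `nthreads` threads. */
static void chunk_bounds(int nrows, int nthreads, int tid, int *begin, int *end)
{
    assert(nthreads > 0 && tid >= 0 && tid < nthreads);
    int base = nrows / nthreads;   /* rows every thread receives        */
    int rem  = nrows % nthreads;   /* first `rem` threads get one extra */
    *begin = tid * base + (tid < rem ? tid : rem);
    *end   = *begin + base + (tid < rem ? 1 : 0);
}

/* One sweep over an nrows x ncols mesh stored row-major in `u`, writing
   to `v`. The stencil is a placeholder (assumption), not the Hopmoc
   update. `nthreads` may exceed the number of cores (oversubscription). */
static void sweep(const double *u, double *v, int nrows, int ncols, int nthreads)
{
    assert(nthreads > 0);
#ifdef _OPENMP
#pragma omp parallel num_threads(nthreads)
    {
        int tid = omp_get_thread_num();
        int nt  = omp_get_num_threads();  /* runtime may grant fewer threads */
#else
    /* Serial fallback: visit the same partitions one after another. */
    for (int tid = 0; tid < nthreads; ++tid) {
        int nt = nthreads;
#endif
        int begin, end;
        chunk_bounds(nrows, nt, tid, &begin, &end);
        for (int i = begin; i < end; ++i) {
            for (int j = 0; j < ncols; ++j) {
                int idx = i * ncols + j;
                int interior = i > 0 && i < nrows - 1 && j > 0 && j < ncols - 1;
                v[idx] = interior
                    ? 0.25 * (u[idx - ncols] + u[idx + ncols] + u[idx - 1] + u[idx + 1])
                    : u[idx];             /* boundary values are copied */
            }
        }
    }
}
```

Calling `sweep(u, v, nrows, ncols, 4 * cores)` on a machine with `cores` cores would give each thread roughly a quarter of the rows it would own without oversubscription. Whether partitions of that size actually lower the cache miss rate depends on the mesh dimensions and the cache hierarchy, which this sketch does not model.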
Acknowledgements
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (Coordination for Enhancement of Higher Education Personnel, Brazil) supported this study.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cabral, F.L., Osthoff, C., de Oliveira, S.L.G. (2022). Reducing Cache Miss Rate Using Thread Oversubscription to Accelerate an MPI-OpenMP-Based 2-D Hopmoc Method. In: Gervasi, O., Murgante, B., Hendrix, E.M.T., Taniar, D., Apduhan, B.O. (eds) Computational Science and Its Applications – ICCSA 2022. ICCSA 2022. Lecture Notes in Computer Science, vol 13375. Springer, Cham. https://doi.org/10.1007/978-3-031-10522-7_24
DOI: https://doi.org/10.1007/978-3-031-10522-7_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-10521-0
Online ISBN: 978-3-031-10522-7
eBook Packages: Computer Science, Computer Science (R0)