High-Efficiency Specialized Support for Dense Linear Algebra Arithmetic in LuNA System

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12942)

Abstract

Automatic synthesis of efficient scientific parallel programs for supercomputers is, in general, a complex problem of system parallel programming, so specialized synthesis algorithms and heuristics are useful. The LuNA system for automatic construction of distributed parallel programs provides a basis for accumulating such algorithms in order to generate high-quality parallel programs in particular subject domains. If LuNA has no specialized support for a given input, it falls back to the general synthesis algorithm, which does construct the required program, although the program's efficiency may be unsatisfactory. This paper presents a specialized run-time system for LuNA that supports the implementation of dense linear algebra operations on distributed-memory multicomputers. Experimental results demonstrate that automatically generated parallel programs of this class outperform the corresponding ScaLAPACK library subroutines, which makes the LuNA system practically applicable for generating high-performance distributed parallel programs for supercomputers in the dense linear algebra application class.

Notes

  1. https://gitlab.ssd.sscc.ru/luna/luna
  2. http://www.jscc.ru

Acknowledgments

The work was supported by the budget project of the ICMMG SB RAS No. 0251-2021-0005.

Author information

Correspondence to Vladislav Perepelkin.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Belyaev, N., Perepelkin, V. (2021). High-Efficiency Specialized Support for Dense Linear Algebra Arithmetic in LuNA System. In: Malyshkin, V. (ed.) Parallel Computing Technologies. PaCT 2021. Lecture Notes in Computer Science, vol 12942. Springer, Cham. https://doi.org/10.1007/978-3-030-86359-3_11

  • DOI: https://doi.org/10.1007/978-3-030-86359-3_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86358-6

  • Online ISBN: 978-3-030-86359-3

  • eBook Packages: Computer Science, Computer Science (R0)