Optimizations for Very Long and Sparse Vector Operations on a RISC-V VPU: A Work-in-Progress

Mahale, Gopinath; Limbasiya, Tejas; Aleem, Muhammad Asad; Plana, Luis; Duricic, Aleksandar; Monemi, Alireza; Abancens, Xabier; Cervero, Teresa; Davis, John D.

doi:10.1007/978-3-031-40843-4_35

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13999)

Included in the following conference series:

International Conference on High Performance Computing

Abstract

The substantial scope for vectorizing present-day workloads in scientific computing and machine learning has highlighted the Vector Processing Unit (VPU) as a target accelerator in high-performance computing systems. The performance of sparse vector operations in these systems is generally limited by memory throughput, because only a small fraction of the elements in the operand vectors are non-zero. Beyond the conventional methods used to improve memory throughput, this work considers supporting very long sparse vector operations to improve memory-level parallelism. This in turn requires handling these sparse long vector operations efficiently on the vector engines, both to improve performance and to save energy. This paper presents enhancements to a RISC-V VPU that achieve this, together with the supporting infrastructure around the VPU in a manycore system. As a work-in-progress paper, it discusses the current results on the enhanced VPU and points to the planned modifications.

The work has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No. 946002. The JU receives support from the European Union’s Horizon 2020 research and innovation programme and from Spain, Croatia, and Turkey.

Author information

Authors and Affiliations

Barcelona Supercomputing Center, Barcelona, Spain
Gopinath Mahale, Tejas Limbasiya, Muhammad Asad Aleem, Luis Plana, Aleksandar Duricic, Alireza Monemi, Xabier Abancens, Teresa Cervero & John D. Davis

Corresponding author

Correspondence to Gopinath Mahale.

Editor information

Editors and Affiliations

University of New Mexico, Albuquerque, NM, USA
Amanda Bienz
University of Edinburgh, Edinburgh, UK
Michèle Weiland
Université Paris-Saclay, Gif sur Yvette, France
Marc Baboulin
CERFACS, Toulouse, France
Carola Kruse

About this paper

Cite this paper

Mahale, G. et al. (2023). Optimizations for Very Long and Sparse Vector Operations on a RISC-V VPU: A Work-in-Progress. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds) High Performance Computing. ISC High Performance 2023. Lecture Notes in Computer Science, vol 13999. Springer, Cham. https://doi.org/10.1007/978-3-031-40843-4_35

DOI: https://doi.org/10.1007/978-3-031-40843-4_35
Published: 25 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-40842-7
Online ISBN: 978-3-031-40843-4
eBook Packages: Computer Science, Computer Science (R0)
