DOI:
10.1145/3635035
HPCAsia '24: Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region
ACM 2024 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
HPCAsia 2024: International Conference on High Performance Computing in Asia-Pacific Region, Nagoya, Japan, January 25-27, 2024
ISBN:
979-8-4007-0889-3
Published:
19 January 2024

SESSION: Best Paper Finalists – 1: Programming Models and System Software
research-article
Open Access
Non-Blocking GPU-CPU Notifications to Enable More GPU-CPU Parallelism

GPUs are increasingly popular in HPC systems, and more applications are adopting GPUs each day. However, the control synchronization of GPUs with CPUs is suboptimal and only possible after GPU kernel termination points, resulting in serialized host and ...

research-article
Open Access
Portable Implementations of Work Stealing

Work stealing is a well-known technique for dynamic load balancing; however, manually writing work-stealing protocols is error-prone. We can use the Tascell parallel programming language for the correct and portable implementation of work stealing; the ...

research-article
sKokkos: Enabling Kokkos with Transparent Device Selection on Heterogeneous Systems using OpenACC

This paper presents a new feature to enable Kokkos with transparent device selection. For application developers, it is not easy to identify which device is the most appropriate to use in a heterogeneous system, since this depends on the characteristics ...

SESSION: Best Paper Finalists – 2: Application and Algorithms
research-article
Open Access
Parallelized Remapping Algorithms for km-scale Global Weather and Climate Simulations with Icosahedral Grid System

In weather and climate research, latitude–longitude grid data are typically used for analysis and visualization, and remapping from model native grids to latitude–longitude grids typically requires a significant amount of time. Here, we developed a ...

research-article
Approximate Block Diagonalization of Symmetric Matrices Using Quantum Annealing

We consider the problem of transforming a given symmetric matrix into a nearly block diagonal form by permutation of its rows and columns. Such a transformation is useful as preconditioning to accelerate the convergence of an eigenvalue solver, but the ...

research-article
QUBO formulation using inequalities for problems with complex constraints

Quantum annealing is an optimization technique that uses quantum fluctuation effects to search for solutions and is being applied as a metaheuristic method. Quantum annealing solves a problem expressed as quadratic unconstrained binary optimization (...

SESSION: Research Paper – 1: Architectures and Networks
research-article
Evaluation of POSIT Arithmetic with Accelerators

We present an evaluation of 32-bit POSIT arithmetic through its implementation as accelerators on FPGAs and GPUs. POSIT, a floating-point number format, adaptively changes the size of its fractional part. We developed hardware designs for FPGAs and ...

research-article
Open Access
Low-latency Communication in RISC-V Clusters

Low-latency inter-node communication is important in HPC clusters. In this work, we design and integrate a low-cost interconnect, capable of low-latency user-level communication with open-source RISC-V processors, obviating the need for bulky and ...

research-article
Open Access
Flexible Systolic Array Platform on Virtual 2-D Multi-FPGA Plane

Systolic arrays are a promising approach to achieving high-performance processing based on highly parallelized designs in various fields, such as AI and bioinformatics. Many previous studies have devoted considerable effort to exploring efficient ...

SESSION: Research Paper – 2: Parallelism
research-article
Open Access
An Efficient Task-Parallel Pipeline Programming Framework

The pipeline is a fundamental pattern to parallelize a series of stage tasks over a sequence of data in loops. Mainstream pipeline programming frameworks count on data abstractions to perform pipeline scheduling. Although this design is convenient for ...

research-article
Task-based low-rank hybrid parallel Cholesky factorization for distributed memory environment

The primary targets for improving efficiency for large-scale matrix factorization are reducing synchronization, addressing the overlap in communication and computation, and improving load balance. In recent years, tiled algorithms with task parallelism ...

research-article
AshPipe: Asynchronous Hybrid Pipeline Parallel for DNN Training

Deep Neural Networks (DNNs) have become increasingly computationally intensive and have larger parameters, requiring efficient parallelization or distribution using multiple accelerators. Pipeline parallelism has been proposed as an effective way to ...

SESSION: Research Paper – 3: GPU Computing
research-article
Open Access
Bruck Algorithm Performance Analysis for Multi-GPU All-to-All Communication

In high-performance computing, collective communication is critical for facilitating comprehensive data exchange involving all processes within an MPI communicator. Due to their inherently global nature, many collective operations present scalability ...

research-article
Efficient GPU-Implementation of H-P Sort Based on Improved Histogram Computation

We present an enhanced GPU implementation of the H-P sort algorithm, which is a widely used method for integer sorting based on histogram computation and prefix sum calculation. This work extends a previous high-performance GPU version of the algorithm, ...

SESSION: Research Paper – 4: Applications
research-article
Eulerian elastoplastic simulation of vehicle structures by building-cube method on supercomputer Fugaku

This paper presents a novel numerical method for the elastoplastic simulation of vehicle component structures under large deformation problems, such as crash-worthiness analysis. Elastoplastic simulation of vehicle structures is essential for designing ...

research-article
Open Access
Analysis Towards Energy-Aware Image-based In Situ Visualization on the Fugaku

Energy efficiency has become a serious concern when running applications on HPC systems. Although these systems were designed to mainly run simulation codes as fast as possible, due to the ever-increasing size of the simulation outputs, the in situ ...

research-article
Information Entropy-based Camera Focus Point and Zoom Level Adjustment for Smart In-Situ Visualization

With the recent developments in computational science and HPC technology, large-scale numerical simulations have become common in various scientific and technological fields. The output volume data from these simulations have also become larger and more ...

Acceptance Rates

Overall Acceptance Rate: 69 of 143 submissions, 48%

Year                    Submitted   Accepted   Rate
HPCAsia '23                    34         15    44%
HPCAsia '23 Workshops          10          9    90%
HPCAsia '19                    32         15    47%
HPCAsia '18                    67         30    45%
Overall                       143         69    48%