Cited By
- Han H, Li K, Cui W, Bai D, Zhang Y, Yuan L, Chen Y, Zhang Y, Cao T, Yang M (2025). FlashFFTStencil: Bridging Fast Fourier Transforms to Memory-Efficient Stencil Computations on Tensor Core Units. Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, pp. 355-368. doi:10.1145/3710848.3710897. Online publication date: 28-Feb-2025.
- Zhang Y, Li K, Yuan L, Han H, Zhang Y, Cao T, Yang M (2025). Jigsaw: Toward Conflict-free Vectorized Stencil Computation by Tessellating Swizzled Registers. Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, pp. 481-495. doi:10.1145/3710848.3710886. Online publication date: 28-Feb-2025.
- Lakshminarasimhan M, Antepara O, Zhao T, Sepanski B, Basu P, Johansen H, Hall M, Williams S (2024). Bricks: A high-performance portability layer for computations on block-structured grids. The International Journal of High Performance Computing Applications, 38:6, pp. 549-567. doi:10.1177/10943420241268288. Online publication date: 19-Aug-2024.