A Fine-grained Prefetching Scheme for DGEMM Kernels on GPU with Auto-tuning Compatibility | IEEE Conference Publication | IEEE Xplore