Automatically Selecting Profitable Thread Block Sizes for Accelerated Kernels | IEEE Conference Publication | IEEE Xplore