Architecture-Aware Mapping and Optimization on a 1600-Core GPU | IEEE Conference Publication | IEEE Xplore