Performance-driven Programming of Multi-TFLOP Deep Learning Accelerators | IEEE Conference Publication | IEEE Xplore