Improving GPU Throughput through Parallel Execution Using Tensor Cores and CUDA Cores | IEEE Conference Publication | IEEE Xplore