ABSTRACT
No abstract available.
- [1] Tianqi Chen et al., TVM: An Automated End-to-End Optimizing Compiler for Deep Learning, OSDI 2018.
- [2] Qualcomm Technologies, Inc., Qualcomm® Snapdragon™ Mobile Platform OpenCL General Programming and Optimization (80-NB295-11 C), 2017. https://developer.qualcomm.com/qfile/33472/80-nb295-11 c.pdf
- [3] Qualcomm Technologies, Inc., OpenCL ML SDK. https://developer.qualcomm.com/blog/accelerate-your-models-our-opencl-ml-sdk
- [4] Zhi Chen and Cody Yu, Amazon Web Services, Inc., Bring Your Own Codegen. https://tvm.apache.org/2020/07/15/how-to-bring-your-own-codegen-to-tvm
- [5] Jiang, Bolan & Lin, Jeng-Hau & Golikeri, Adarsh & He, Li & Wang, Hongqiang & Bourd, Alex. (2020). Training Machine Learning Network on Adreno Mobile GPUs Using OpenCL. 1-2. 10.1145/3388333.3388666.
- [6] He, Li & Wang, Hongqiang & Golikeri, Adarsh & Bourd, Alex & Adarshg, Iih. (2020). TVM for Adreno GPUs. 1-2. 10.1145/3388333.3388667.
- [7] Reddy, Siva & Wang, Hongqiang & Golikeri, Adarsh & Bourd, Alex. (2021). Machine learning training with Tensor Virtual Machine (TVM) and Adreno GPUs. 1. 10.1145/3456669.3456702.
- [8] Reddy, Siva & Wang, Hongqiang & Bourd, Alex & Golikeri, Adarsh & Calidas, Balaji. (2022). OpenCLML Integration with TVM. 1-1. 10.1145/3529538.3530003.
Index Terms
- Accelerate Machine Learning Workloads using TVM and OpenCL ML SDK on Qualcomm Adreno™ GPUs