extended-abstract

Accelerated Machine Learning Using TensorFlow and SYCL on OpenCL Devices

Authors:
Mehdi Goli, Luke Iwanski, Andrew Richards

Codeplay Software Ltd., Edinburgh, UK
IWOCL '17: Proceedings of the 5th International Workshop on OpenCL, May 2017, Article No.: 8, Pages 1–4
https://doi.org/10.1145/3078155.3078160

Published: 16 May 2017

ABSTRACT

Machine learning is being used in more and more artificial intelligence applications. While existing machine learning frameworks mostly support NVIDIA CUDA GPUs, there has been little research dedicated to targeting other devices through open standards such as OpenCL. In this paper, we explain how machine learning applications can harness the power of OpenCL using open standards and how, by using SYCL, TensorFlow can be extended to include customized operations running on OpenCL devices.

Index Terms

1. Computing methodologies
  1. Machine learning
  2. Parallel computing methodologies
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages


Published in
IWOCL '17: Proceedings of the 5th International Workshop on OpenCL
May 2017
135 pages
ISBN: 9781450352147
DOI: 10.1145/3078155
General Chairs: Simon McIntosh-Smith (University of Bristol, UK), Ben Bergen (Los Alamos National Laboratory, USA)
Copyright © 2017 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Eigen
GPGPU
Machine learning
OpenCL
TensorFlow
Qualifiers
- extended-abstract
- Research
- Refereed limited
Acceptance Rates
IWOCL '17 Paper Acceptance Rate: 15 of 29 submissions, 52%
Overall Acceptance Rate: 84 of 152 submissions, 55%
Article Metrics
- Total Citations: 9
- Total Downloads: 544
- Downloads (last 12 months): 22
- Downloads (last 6 weeks): 6