Cited By
Guerrouj F, Rodríguez Flórez S, El Ouardi A, Abouzahir M, Ramzi M (2024). Optimizing Convolution Operations for YOLOv4-based Object Detection on GPU. ITM Web of Conferences 69, 04008. DOI: 10.1051/itmconf/20246904008. Online publication date: 13-Dec-2024.
Convolutions are the core operation of deep learning applications based on Convolutional Neural Networks (CNNs). Current GPU architectures are highly efficient for training and deploying deep CNNs, and are largely used in production. State–of–the–...
Cloud environments today are increasingly featuring hybrid nodes containing multicore CPU processors and a diverse mix of accelerators such as Graphics Processing Units (GPUs), Intel Xeon Phi co-processors, and Field-Programmable Gate Arrays (FPGAs) to ...
This article presents a novel optimizing compiler for general purpose computation on graphics processing units (GPGPU). It addresses two major challenges of developing high performance GPGPU programs: effective utilization of GPU memory hierarchy and ...
Association for Computing Machinery
New York, NY, United States