Abstract:
Many Artificial Intelligence (AI)-based applications are now being developed to meet a variety of needs, for example, counting objects recorded by a camera, identifying diseases from MRI images, and predicting traffic congestion levels at specific times. One way to provision infrastructure resources that match the workload of AI-based applications is to understand the patterns or characteristics of those workloads. Because an AI model runs on a Graphics Processing Unit (GPU), several parts of the model's architecture use Video Random Access Memory (VRAM) as temporary storage to reduce running time. This paper analyzes the workload characteristics of AI-based applications in terms of running time and VRAM usage. Experiments are conducted under two request scenarios, sequential and concurrent, using four variants of the Super Resolution Generative Adversarial Network (SRGAN): no prune, random unstructured, L1 norm, and L2 norm. The experimental results show that the four SRGAN variants have almost the same workload in the sequential request scenario but different workloads in the concurrent request scenario. Some models, such as the SRGAN no prune model, are processed more effectively one request at a time, while others, such as the SRGAN random unstructured and L2 norm models, are processed more effectively when several requests are handled at once.
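The two request scenarios can be illustrated with a minimal sketch. It is not the authors' experimental harness: `fake_inference`, `run_sequential`, and `run_concurrent` are hypothetical names, and the sleep call is an assumed stand-in for an SRGAN forward pass so that the example runs without a GPU.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_inference(request_id, work_seconds=0.01):
    """Hypothetical stand-in for one SRGAN inference request."""
    time.sleep(work_seconds)  # assumed placeholder for the model's forward pass
    return request_id


def run_sequential(n_requests):
    """Process requests one at a time; return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = [fake_inference(i) for i in range(n_requests)]
    return results, time.perf_counter() - start


def run_concurrent(n_requests, max_workers=4):
    """Process several requests at once; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the submitted requests
        results = list(pool.map(fake_inference, range(n_requests)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, t_seq = run_sequential(8)
    _, t_conc = run_concurrent(8, max_workers=4)
    print(f"sequential: {t_seq:.3f} s, concurrent: {t_conc:.3f} s")
```

With a real PyTorch model on a GPU, peak VRAM for each scenario could plausibly be recorded by calling `torch.cuda.reset_peak_memory_stats()` before a run and `torch.cuda.max_memory_allocated()` after it. Whether the paper measures VRAM this way is not stated in the abstract.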
Date of Conference: 29-30 April 2024
Date Added to IEEE Xplore: 15 May 2024