Abstract:
Computational offloading augments the capabilities of edge devices by delegating highly complex tasks (e.g., Deep Neural Network (DNN) inference) to remote devices, allowing devices to complete workloads beyond their limited computational resources. DNN partitioning divides a DNN inference task so that the resulting segments can be executed on separate devices. How a DNN is partitioned significantly impacts the communication delay an inference task incurs. An interesting scenario is offloading in homogeneous networks (e.g., IoT networks, smart cameras), where all devices have the same computational resources and offloading therefore yields no reduction in processing time on its own; the most important factor for a partitioning technique in this context is its ability to minimise communication delay. This paper compares six DNN partitioning techniques (fused tile partitioning, multi-split vertical partitioning, single-split vertical partitioning with and without horizontal partitioning, and no vertical partitioning with and without horizontal partitioning) under various levels of network load when offloading VGG16, YoloV2 and MobileNetV2 inference tasks. Based on our findings, a single-split approach with horizontal partitioning provides the least communication delay while also offering a small reduction in processing time by parallelising the processing of the DNN task.
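As a minimal illustration of the winning scheme, the sketch below (assuming PyTorch; the toy backbone, split index SPLIT, strip count and halo width are hypothetical stand-ins, not the paper's setup) shows single-split vertical partitioning combined with horizontal partitioning: the network is cut once into a head and a tail, and the intermediate activation is split into overlapping spatial strips so that separate peers could process the tail in parallel.

import torch
import torch.nn as nn

# Toy convolutional backbone standing in for VGG16/YoloV2/MobileNetV2.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
)

SPLIT = 4  # single vertical split: layers [0, SPLIT) run locally, the rest are offloaded
head, tail = model[:SPLIT], model[SPLIT:]

def run_tail_in_strips(z, parts=2, halo=1):
    """Horizontal partitioning: split the activation into `parts` strips
    along the height axis, each extended by `halo` overlap rows so the
    tail's 3x3 convolution sees real neighbours at strip boundaries.
    Each strip would be sent to a different peer and processed in parallel."""
    h = z.shape[2]
    bounds = [round(i * h / parts) for i in range(parts + 1)]
    outs = []
    for a, b in zip(bounds, bounds[1:]):
        lo, hi = max(0, a - halo), min(h, b + halo)
        strip_out = tail(z[:, :, lo:hi, :])  # remote execution in practice
        outs.append(strip_out[:, :, a - lo:(a - lo) + (b - a), :])  # drop halo rows
    return torch.cat(outs, dim=2)

x = torch.randn(1, 3, 32, 32)
with torch.no_grad():
    z = head(x)                         # local computation up to the split point
    y = run_tail_in_strips(z, parts=2)  # tail computed strip-by-strip, in parallel
    assert torch.allclose(y, tail(z), atol=1e-6)  # matches the unpartitioned result

Note that the halo must grow with the tail's receptive field (roughly one extra row per stacked 3x3 convolution), so the overlap transmitted between peers is part of the communication cost the paper's comparison is concerned with.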
Published in: 2023 IEEE 34th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC)
Date of Conference: 05-08 September 2023
Date Added to IEEE Xplore: 31 October 2023