Single-Forking of Coded Subtasks for Straggler Mitigation


Abstract:

Given the unpredictable nature of the nodes in distributed computing systems, some tasks can be significantly delayed. Such delayed tasks are called stragglers. Straggler mitigation can be achieved by redundant computation. In the maximum distance separable (MDS) redundancy method, a task is divided into $k$ subtasks, which are encoded into $n$ coded subtasks such that the task is completed when any $k$ out of the $n$ coded subtasks are completed. Two metrics of interest are the task completion time and the server utilization, which is the aggregate work completed by all servers over this duration. We consider a proactive straggler mitigation strategy in which $n_0$ out of the $n$ coded subtasks are started at time 0, while the remaining $n - n_0$ coded subtasks are launched when $\ell_0 \le \min\{n_0, k\}$ of the initial ones finish. The coded subtasks are halted when $k$ of them finish. For this flexible forking strategy with multiple parameters, we analyze the mean of the two performance metrics when the random service completion times at the servers are independent and identically distributed (i.i.d.) according to a shifted exponential distribution. From this study, we find a tradeoff between the metrics, which provides insights into the parameter choices. Experiments on Intel DevCloud illustrate that the shifted exponential distribution adequately captures the random coded subtask completion times, and that our derived insights continue to hold.
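The forking strategy described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' analysis or code. The function names and the parameters `shift` and `rate` are hypothetical. It assumes each coded subtask takes `shift` plus an exponential time with the given rate, that the forked subtasks draw fresh i.i.d. times measured from the forking instant, and that a server's work is the time from its start until it finishes or is halted.

```python
import random


def simulate_single_fork(n, k, n0, l0, shift, rate, rng):
    """Draw one sample of (completion time, server utilization).

    Assumed model (not taken from the paper's text): every coded subtask
    needs shift + Exp(rate) time units, independently across servers.
    """
    if not (1 <= k <= n and 1 <= n0 <= n and 1 <= l0 <= min(n0, k)):
        raise ValueError("need 1 <= k <= n, 1 <= n0 <= n, 1 <= l0 <= min(n0, k)")

    # Finish times of the n0 subtasks started at time 0.
    initial = sorted(shift + rng.expovariate(rate) for _ in range(n0))

    # The remaining n - n0 subtasks are launched when l0 initial ones finish.
    fork_time = initial[l0 - 1]
    forked = [fork_time + shift + rng.expovariate(rate) for _ in range(n - n0)]

    # The task completes when any k coded subtasks have finished.
    completion = sorted(initial + forked)[k - 1]

    # Each server works from its start until it finishes or is halted.
    work = sum(min(t, completion) for t in initial)
    work += sum(min(t, completion) - fork_time for t in forked)
    return completion, work


def estimate_means(n, k, n0, l0, shift, rate, trials=10000, seed=0):
    """Estimate the mean completion time and mean server utilization."""
    rng = random.Random(seed)
    total_time = 0.0
    total_work = 0.0
    for _ in range(trials):
        t, w = simulate_single_fork(n, k, n0, l0, shift, rate, rng)
        total_time += t
        total_work += w
    return total_time / trials, total_work / trials


if __name__ == "__main__":
    # Compare starting every server at time 0 with forking after one finish.
    for n0 in (10, 6, 4):
        mean_t, mean_w = estimate_means(n=10, k=4, n0=n0, l0=1, shift=1.0, rate=1.0)
        print(f"n0={n0}: mean completion={mean_t:.3f}, mean utilization={mean_w:.3f}")
```

Sweeping `n0` and `l0` in such a simulation is one way to see the tradeoff between completion time and server utilization that the abstract mentions. The closed-form means are derived in the paper itself and are not reproduced here.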
Published in: IEEE/ACM Transactions on Networking (Volume: 29, Issue: 6, December 2021)
Page(s): 2413 - 2424
Date of Publication: 02 July 2021
