Meta-Scheduling for the Wireless Downlink Through Learning With Bandit Feedback


Abstract:

In this paper, we study learning-assisted multi-user scheduling for the wireless downlink. Many scheduling algorithms have been developed that optimize for a plethora of performance metrics; however, a systematic approach across diverse performance metrics and deployment scenarios is still lacking. We address this by developing a meta-scheduler: given a diverse collection of schedulers, we develop a learning-based overlay algorithm (the meta-scheduler) that selects the “best” scheduler from among these for each deployment scenario. More formally, we develop a multi-armed bandit (MAB) framework for meta-scheduling that assigns and adapts a score for each scheduler to maximize reward (e.g., mean delay, timely throughput, etc.). The meta-scheduler is based on a variant of the Upper Confidence Bound (UCB) algorithm, but adapted to interrupt the queuing dynamics at the base station so as to filter out schedulers that might render the system unstable. We show that the algorithm has poly-logarithmic regret in the expected reward with respect to a genie that chooses the optimal scheduler for each scenario. Finally, through simulation, we show that the meta-scheduler learns the choice of scheduler that best adapts to the deployment scenario (e.g., load conditions, performance metrics).
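
For readers unfamiliar with UCB-style selection, the following is a minimal sketch of the textbook UCB1 rule applied to choosing among schedulers. It is not the paper's algorithm: the abstract describes a variant that also interrupts the queuing dynamics to filter out unstable schedulers, and that part is not modeled here. The class name `UCB1MetaScheduler`, the helper `run_fixed_rewards`, the assumption that rewards lie in [0, 1], and the exploration constant 2 are all illustrative assumptions.

```python
import math


class UCB1MetaScheduler:
    """Textbook UCB1 over a fixed set of candidate schedulers (hypothetical sketch)."""

    def __init__(self, num_schedulers):
        self.counts = [0] * num_schedulers   # times each scheduler was selected
        self.means = [0.0] * num_schedulers  # running mean reward per scheduler
        self.t = 0                           # total number of completed rounds

    def select(self):
        # Try every scheduler once before using confidence bounds.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        # Score = empirical mean + exploration bonus that shrinks with use.
        scores = [
            mean + math.sqrt(2.0 * math.log(self.t) / count)
            for mean, count in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        # Reward is assumed to be normalized to [0, 1] (an assumption of this sketch).
        self.t += 1
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def run_fixed_rewards(rewards, rounds):
    """Run the sketch when scheduler i always yields rewards[i]; return selection counts."""
    meta = UCB1MetaScheduler(len(rewards))
    for _ in range(rounds):
        arm = meta.select()
        meta.update(arm, rewards[arm])
    return meta.counts
```

In a real deployment the reward would be a noisy per-epoch measurement (for example, a normalized delay or throughput statistic) rather than a fixed value; the fixed-reward helper only makes the selection behavior easy to inspect.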
Published in: IEEE/ACM Transactions on Networking (Volume: 30, Issue: 2, April 2022)
Page(s): 487 - 500
Date of Publication: 14 October 2021