Abstract:
In this paper, we consider an unmanned aerial vehicle (UAV)-assisted IoT network and study the trajectory planning problem to optimize information freshness, in terms of age of information (AoI), where the update arrivals at IoT devices are stochastic and unknown to the UAV. To this end, we first formulate the dynamic UAV trajectory planning problem as a Partially Observable Markov Decision Process (POMDP) with non-uniform time steps, where the set of valid actions is coupled with the agent's observations. Then, a deep recurrent reinforcement learning (DRRL) algorithm is devised to find the policy minimizing the expected weighted average AoI, in which a modified discount mechanism is utilized to deal with the challenge posed by non-uniform time steps, and an action elimination mechanism is introduced to address the coupling between the valid actions and observations. Finally, simulations are conducted to validate the effectiveness of our proposed algorithm by comparing it with baseline strategies.
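The two mechanisms highlighted above can be illustrated with a minimal sketch. The discounting shown here (raising the discount factor to the elapsed wall-clock time rather than the step count) is a common way to handle non-uniform time steps in semi-Markov settings, and the action masking shows a standard action-elimination pattern; both are illustrative assumptions, not the paper's exact formulation, and all function names are hypothetical.

```python
import numpy as np

def discounted_return(rewards, durations, gamma=0.99):
    """Duration-aware discounted return: step t is discounted by gamma
    raised to the time elapsed before that step, so longer steps are
    discounted more heavily than a fixed per-step discount would allow.
    (Illustrative sketch, not the paper's exact mechanism.)"""
    rewards = np.asarray(rewards, dtype=float)
    durations = np.asarray(durations, dtype=float)
    # Elapsed time before each step: 0, d_0, d_0 + d_1, ...
    elapsed = np.concatenate(([0.0], np.cumsum(durations[:-1])))
    return float(np.sum((gamma ** elapsed) * rewards))

def select_valid_action(q_values, valid_mask):
    """Action elimination via masking: Q-values of invalid actions are
    set to -inf so a greedy policy can never choose them."""
    q = np.where(valid_mask, np.asarray(q_values, dtype=float), -np.inf)
    return int(np.argmax(q))
```

For example, with `gamma=0.5`, two unit rewards separated by a step of duration 1 yield a return of `1 + 0.5**1 = 1.5`, whereas a duration of 2 would yield `1 + 0.5**2 = 1.25`, capturing the penalty for slower transitions.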
Published in: 2021 IEEE 32nd Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC)
Date of Conference: 13-16 September 2021
Date Added to IEEE Xplore: 21 October 2021