
Safety Constrained Trajectory Optimization for Completion Time Minimization for UAV Communications

Publisher: IEEE

Abstract:

In recent years, unmanned aerial vehicles (UAVs) have been considered for integration into wireless communication systems because of their tremendous advantages in mobility, cost, maneuverability, etc. In some real UAV-assisted communication scenarios, the dynamics of the environment, such as the roaming of served users, make it hard to obtain an optimal trajectory before the UAV is dispatched. Implanting an intelligent control policy into the UAV for distributed task execution is therefore necessary to complete the task. In this article, a UAV trajectory design problem is investigated for an orthogonal-frequency-division-multiplexing (OFDM) wireless sensor network, which is dynamic because mobile sensors may randomly roam within a certain range. The UAV is expected to balance task efficiency against the safety constraint using a pretrained onboard control policy. Compared with prior works, this work requires the policy to adapt to randomly generated obstacle maps and assumes that the UAV has no prior knowledge of the obstacles before it is dispatched, both of which make the problem challenging. The motivation comes from adversarial environments whose specific obstacle distribution is unknown beforehand, such as a disaster area. The problem is formulated as a constrained Markov decision process (CMDP), which incorporates a safety constraint on top of a basic Markov decision process. Because of the randomized obstacle distribution and the lack of prior knowledge, existing CMDP algorithms cannot be applied directly. To tackle this issue, we augment a reinforcement learning (RL) algorithm with a safety control mechanism to derive a novel safe RL (Safe RL) algorithm based on the framework of the Lagrangian method. Compared with previous CMDP algorithms, ours eliminates the premise that the safety model is known: the agent learns safety judgment from scratch through its interactions with the environment.
Simulation results demonstrate ...
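The abstract's Lagrangian approach to a CMDP can be illustrated with a minimal sketch: the constrained objective (maximize reward subject to expected cost staying below a limit) is relaxed into an unconstrained one via a multiplier λ, which is updated by dual ascent on the constraint violation. The functions, step sizes, and the toy "policy response" below are illustrative assumptions, not the paper's actual algorithm.

```python
# Hedged sketch of the Lagrangian relaxation commonly used for CMDPs.
# The policy-update side is replaced by an assumed monotone response
# (cost falls as lambda grows); only the dual update is faithful.

def lagrangian(reward: float, cost: float, lam: float, cost_limit: float) -> float:
    """Scalarized objective L(pi, lam) = reward - lam * (cost - cost_limit)."""
    return reward - lam * (cost - cost_limit)

def dual_ascent_step(lam: float, cost: float, cost_limit: float, lr: float = 0.1) -> float:
    """Raise lam when the safety constraint is violated, lower it otherwise;
    lam is projected back onto [0, inf)."""
    return max(0.0, lam + lr * (cost - cost_limit))

# Toy loop: lambda grows while the episode cost exceeds the limit,
# which in turn pushes the (assumed) policy toward lower cost.
lam, cost_limit = 0.0, 1.0
cost = 3.0  # assumed initial expected episode cost
for _ in range(50):
    lam = dual_ascent_step(lam, cost, cost_limit)
    # Stand-in for a policy-gradient step on the Lagrangian:
    cost = max(cost_limit * 0.5, cost - 0.05 * lam)
```

In a real Safe RL implementation, the `cost` update would come from re-evaluating the policy after a gradient step on `lagrangian(...)`; here it is a placeholder so the dual dynamics can be run in isolation.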
Published in: IEEE Internet of Things Journal ( Volume: 11, Issue: 21, 01 November 2024)
Page(s): 34482 - 34491
Date of Publication: 19 January 2024
