Abstract:
This paper studies a novel wireless information surveillance scenario in which a legitimate party aims to eavesdrop on multiple suspicious communication links with the help of multiple unmanned aerial vehicles (UAVs). Each suspicious link consists of a UAV transmitter and its fixed destination. To improve eavesdropping performance, the cooperating legitimate UAVs emit jamming signals to reduce the capacities of the suspicious channels and plan their flight trajectories to increase the capacities of the eavesdropping channels. Given the system dynamics, it is natural to model this sequential decision-making problem as a Markov decision process (MDP), which can be solved by reinforcement learning (RL). However, it is difficult to design an RL policy whose jamming powers satisfy the eavesdropping constraints under consideration. We therefore decompose the optimization into two phases: 1) obtaining a non-learning-based optimal solver for jamming power allocation in each state, and 2) optimizing the movement policy by RL. We show that this decoupled optimization process preserves optimality. For flight safety, we determine an individual movement policy for each legitimate UAV rather than a centralized policy that controls all UAVs. Finally, extensive simulations demonstrate the effectiveness of the proposed solution.
Published in: IEEE Transactions on Mobile Computing (Volume: 23, Issue: 5, May 2024)
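
The two-phase decomposition described in the abstract can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: a single legitimate UAV on a one-dimensional grid, a static suspicious transmitter and destination, a simple distance-based channel gain with unit noise power, the constraint that the eavesdropping rate must be at least the suspicious rate, and tabular Q-learning standing in for whatever RL method the authors use. Phase 1 is a closed-form, non-learning solver that returns the smallest feasible jamming power in a given state; phase 2 learns only the movement action, with the reward computed from the phase 1 solution.

```python
import math
import random

# All values below are hypothetical toy parameters, not the paper's setup.
N_POS = 7                 # grid positions 0..6 for the legitimate UAV
TX_POS, RX_POS = 3, 4     # suspicious transmitter and its destination
P_TX = 10.0               # suspicious transmit power (noise power = 1)
P_MAX = 5.0               # jamming power budget
ACTIONS = (-1, 0, 1)      # move left, hover, move right


def gain(a, b):
    """Assumed channel gain: inverse square distance with unit altitude."""
    return 1.0 / (1.0 + (a - b) ** 2)


def solve_jamming_power(x):
    """Phase 1: smallest jamming power making eavesdropping feasible.

    Feasibility (assumed): eavesdropper SNR >= destination SINR, i.e.
    P_TX*g_se >= P_TX*g_sd / (1 + p*g_jd). Returns None if p > P_MAX.
    """
    g_sd = gain(TX_POS, RX_POS)   # suspicious transmitter -> destination
    g_se = gain(TX_POS, x)        # suspicious transmitter -> eavesdropper
    g_jd = gain(x, RX_POS)        # jammer -> destination
    p = max(0.0, (g_sd / g_se - 1.0) / g_jd)
    return p if p <= P_MAX else None


def reward(x):
    """Eavesdropped rate at position x: suspicious rate if feasible, else 0."""
    p = solve_jamming_power(x)
    if p is None:
        return 0.0
    sinr = P_TX * gain(TX_POS, RX_POS) / (1.0 + p * gain(x, RX_POS))
    return math.log2(1.0 + sinr)


def step(x, action):
    """Move the UAV (clipped to the grid) and score the new position."""
    nxt = min(N_POS - 1, max(0, x + action))
    return nxt, reward(nxt)


def train(episodes=3000, horizon=30, alpha=0.1, gamma=0.9, eps=0.3, seed=0):
    """Phase 2: tabular Q-learning over the movement action only."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_POS)]
    for _ in range(episodes):
        x = rng.randrange(N_POS)
        for _ in range(horizon):
            if rng.random() < eps:                       # explore
                a = rng.randrange(len(ACTIONS))
            else:                                        # exploit
                a = max(range(len(ACTIONS)), key=lambda i: q[x][i])
            nxt, r = step(x, ACTIONS[a])
            q[x][a] += alpha * (r + gamma * max(q[nxt]) - q[x][a])
            x = nxt
    return q


def greedy_action(q, x):
    """Movement chosen by the learned Q-table at position x."""
    return ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[x][i])]


q_table = train()
policy = [greedy_action(q_table, x) for x in range(N_POS)]
```

In this toy setting the RL agent never chooses a jamming power, so the eavesdropping constraint is satisfied by construction whenever it can be satisfied at all, which mirrors the motivation given for decoupling the two decisions. The multi-UAV, per-UAV-policy aspect of the paper is not modeled here.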