Abstract:
The next-generation wireless network is expected to use low-earth orbit (LEO) satellite networks to deliver seamless, high-capacity global communication services. Because LEO satellites move at high speed, massive and frequent handovers inevitably occur. Moreover, handover becomes more complicated as the constellation scale, the number of mobile terminals (MTs), and the demands of emerging delay-sensitive applications keep growing. In this paper, a decentralized Markov decision process (DEC-MDP) is adopted to formulate the handover problem in the LEO satellite network with finite bursty traffic. The objective is to maximize the total reward, which combines the service revenue with the costs of handover and packet loss. To deal with the high computational complexity caused by the large state and action spaces, the solution is designed using a multi-agent double deep Q-network (MADDQN) with a fully decentralized framework, in which each MT trains and uses its own local DDQN to avoid load imbalance between satellites. Further, to alleviate the non-stationarity of the environment under parallel learning, multi-agent fingerprints are applied in MADDQN, and the proposed algorithm is called the multi-agent fingerprints-enhanced double deep Q-network-based distributed intelligent handover (MAF-DDQN-DIH) mechanism. The implementation of MAF-DDQN-DIH in practical communication systems is discussed, and the corresponding communication overhead and computational complexity are analyzed. Simulation results demonstrate that the designed multi-agent fingerprints are effective and that the proposed MAF-DDQN-DIH algorithm outperforms the handover algorithms used for comparison in terms of total reward.
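As a rough illustration of the two ingredients named above, the following minimal Python sketch shows a standard double DQN target computation and a fingerprint-augmented observation for one MT. It is not the paper's implementation: the abstract does not state what the fingerprints contain, so the exploration rate and normalized training progress used here are assumptions, and the function names `augment_with_fingerprint` and `double_dqn_target` are hypothetical.

```python
from typing import List, Sequence


def augment_with_fingerprint(
    local_obs: Sequence[float], epsilon: float, train_progress: float
) -> List[float]:
    """Append a fingerprint to an MT's local observation.

    ASSUMPTION: the fingerprint is (exploration rate, normalized training
    progress). The abstract does not specify the actual fingerprint fields.
    """
    return list(local_obs) + [epsilon, train_progress]


def double_dqn_target(
    reward: float,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    done: bool,
) -> float:
    """Standard double DQN target for a single transition.

    The online network selects the next action; the target network
    evaluates it. Q-values are passed in as plain lists, one entry per
    candidate action (here, hypothetically, one per candidate satellite).
    """
    if done:
        return reward
    # Action selection with the online network's Q-values.
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # Action evaluation with the target network's Q-values.
    return reward + gamma * q_target_next[best_action]


# Hypothetical usage for one MT choosing among three candidate satellites.
obs = augment_with_fingerprint([0.1, 0.2], epsilon=0.05, train_progress=0.5)
target = double_dqn_target(
    reward=1.0,
    gamma=0.9,
    q_online_next=[0.2, 0.8, 0.5],
    q_target_next=[1.0, 2.0, 3.0],
    done=False,
)
```

Separating action selection from action evaluation is what distinguishes double DQN from plain DQN, and feeding each agent a fingerprint of the training state is one known way to make experience gathered alongside other learning agents less stale.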
Published in: IEEE Transactions on Vehicular Technology (Volume: 73, Issue: 10, October 2024)