Deep Reinforcement Learning Enables Joint Trajectory and Communication in Internet of Robotic Things


Abstract:

Internet of Robotic Things (IoRT) emphasizes the integration of robotic, artificial intelligence computing, and communication technologies, enabling more sophisticated operations and decision-making. As a crucial element of IoRT, mission-critical applications, such as industrial manufacturing and emergency services, impose stringent requirements on ultra-reliable and low-latency communication (URLLC). This paper addresses URLLC challenges in IoRT, particularly when autonomous mobile robots (AMRs) coexist with static sensors. We prioritize safe and efficient AMR travel through trajectory design and communication resource allocation in IoRT systems, without requiring any prior knowledge. To enhance network connectivity and exploit diversity gains, we introduce flexible decoding and free clustering as next-generation multiple access technologies in a spectrum-limited downlink IoRT system. Then, aiming to minimize the decoding error probability and travel time, we formulate a long-term multi-objective optimization problem that jointly designs the AMRs’ trajectories and communication resources. To accommodate the inherent dynamics and unpredictability of the IoRT system, we introduce a multi-agent actor-critic deep reinforcement learning (DRL) framework, offering four distinct implementations, each accompanied by a comprehensive complexity analysis. Simulation results reveal the following insights: 1) in terms of DRL implementations, off-policy algorithms with deterministic policies outperform their on-policy counterparts, achieving an approximately 67% increase in rewards; 2) in terms of communication schemes, the proposed flexible decoding and free clustering strategies under the designed trajectories effectively reduce decoding errors; and 3) in terms of algorithm optimality, our DRL framework shows superior flexibility and adaptability in communication environments compared to traditional A* search and heuristic methods.
Published in: IEEE Transactions on Wireless Communications (Volume: 23, Issue: 12, December 2024)
Page(s): 18154 - 18168
Date of Publication: 24 September 2024

