Abstract:
Dynamic off-chain routing in payment channel network (PCN)-based Internet of Things (IoT) systems is attracting increasing research attention. However, dynamic routing in PCN-based IoT with resource-limited devices faces two major issues. The first is how to achieve high long-term transaction efficiency in a PCN with dynamic channel capacities. The second is how to obtain a routing algorithm lightweight enough to deploy on IoT devices while still achieving high transaction efficiency, i.e., a high successful payment amount and success ratio. In this paper, we therefore propose a compact deep reinforcement learning (DRL) algorithm that learns a joint dynamic and lightweight routing policy for maximizing long-term transaction efficiency. To obtain optimal performance in dynamic routing problems for off-chain systems, a proximal policy optimization algorithm is employed to create an actor–critic learning structure for training the teacher DRL model. To obtain a compact and efficient student DRL model, an adaptive pruning technique removes unnecessary parameters from the teacher model's networks without affecting its learning ability. Furthermore, knowledge distillation is leveraged to improve the performance of the student network. Thus, a compact and efficient student DRL model can be developed and deployed to maximize the long-term transaction efficiency of off-chain systems on resource-limited IoT devices. The simulation results demonstrate that the proposed DRL algorithm outperforms the baseline algorithms in PCN transaction efficiency while requiring only 10% of the computation and storage resources of the original teacher model.
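The two compression steps named in the abstract, pruning and knowledge distillation, can be illustrated with a minimal sketch. This is not the paper's method: the adaptive pruning rule is not specified in the abstract, so fixed-ratio magnitude pruning is swapped in as a simpler stand-in, and the distillation loss is the commonly used KL divergence between temperature-softened teacher and student action distributions. The function names, the `keep_ratio` and `temperature` parameters, and the toy weights are all hypothetical.

```python
import math

def magnitude_prune(weights, keep_ratio):
    """Zero out all but the largest-magnitude fraction of weights.

    Assumption: a fixed keep_ratio, not the paper's adaptive criterion.
    Ties at the threshold may keep slightly more than the target count.
    """
    flat = sorted((abs(w) for row in weights for w in row), reverse=True)
    n_keep = max(1, int(round(keep_ratio * len(flat))))
    threshold = flat[n_keep - 1]
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over softened action distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy 2x5 layer of a hypothetical teacher policy network.
teacher_layer = [[0.5, -0.05, 0.3, 0.01, -0.9],
                 [0.02, 0.1, -0.2, 0.04, 0.07]]
student_layer = magnitude_prune(teacher_layer, keep_ratio=0.1)
nonzero = sum(1 for row in student_layer for w in row if w != 0.0)

# Distillation signal for one state: teacher vs. student routing logits.
loss = distillation_loss([2.0, 0.5, -1.0], [1.0, 1.0, 0.0])
```

Here `keep_ratio=0.1` is purely illustrative; the abstract's 10% figure refers to computation and storage resources, which need not equal the fraction of parameters retained. In a training loop, the distillation term would typically be added to the student's own policy loss so that the pruned network recovers behavior from the teacher.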
Published in: IEEE Journal on Selected Areas in Communications (Volume: 40, Issue: 12, December 2022)