Dynamic value iteration networks for the planning of rapidly changing UAV swarms

Li, Wei; Yang, Bowei; Song, Guanghua; Jiang, Xiaohong

doi:10.1631/FITEE.1900712

Published: 12 January 2021

Frontiers of Information Technology & Electronic Engineering
Volume 22, pages 687–696 (2021)

Abstract

In an unmanned aerial vehicle ad-hoc network (UANET), sparse and rapidly mobile unmanned aerial vehicles (UAVs)/nodes can dynamically change the UANET topology. This may lead to UANET service performance issues. In this study, for planning rapidly changing UAV swarms, we propose a dynamic value iteration network (DVIN) model trained using the episodic Q-learning method with the connection information of UANETs to generate a state value spread function, which enables UAVs/nodes to adapt to novel physical locations. We then evaluate the performance of the DVIN model and compare it with the non-dominated sorting genetic algorithm II and the exhaustive method. Simulation results demonstrate that the proposed model significantly reduces the decision-making time for UAV/node path planning with a high average success rate.
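The idea of spreading state values over a network's connection information can be illustrated with classical tabular value iteration on a small connectivity graph. The sketch below is not the authors' DVIN, which is a learned network trained with episodic Q-learning. It only shows how values propagate outward from a goal node and how a greedy path follows them, and how the plan changes when the topology changes. The node names, the reward of 1 for reaching the goal, the discount factor, and the function names `value_iteration` and `greedy_path` are all assumptions made for illustration.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]  # node -> list of currently connected neighbours


def step_score(values: Dict[str, float], nxt: str, goal: str, gamma: float) -> float:
    """One-step return for moving to `nxt` (assumed reward: 1 on reaching goal)."""
    reward = 1.0 if nxt == goal else 0.0
    return reward + gamma * values[nxt]


def value_iteration(graph: Graph, goal: str, gamma: float = 0.9,
                    sweeps: int = 50) -> Dict[str, float]:
    """Spread state values outward from `goal` with synchronous Bellman backups."""
    values = {node: 0.0 for node in graph}
    for _ in range(sweeps):
        updated = {}
        for node, neighbours in graph.items():
            if node == goal or not neighbours:
                updated[node] = 0.0  # goal is terminal; isolated nodes stay at 0
            else:
                updated[node] = max(
                    step_score(values, nxt, goal, gamma) for nxt in neighbours
                )
        values = updated
    return values


def greedy_path(graph: Graph, values: Dict[str, float], start: str, goal: str,
                gamma: float = 0.9, max_steps: int = 20) -> List[str]:
    """Follow the highest-scoring neighbour at each hop until the goal is reached."""
    path = [start]
    node = start
    while node != goal and len(path) <= max_steps:
        if not graph[node]:
            break  # no links left; planning fails from this node
        node = max(graph[node], key=lambda nxt: step_score(values, nxt, goal, gamma))
        path.append(node)
    return path


# Hypothetical topology: four nodes in a line, a - b - c - d.
line_topology: Graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
# The same swarm after a topology change: b loses its link to c, a gains one.
changed_topology: Graph = {"a": ["b", "c"], "b": ["a"], "c": ["a", "d"], "d": ["c"]}

line_values = value_iteration(line_topology, goal="d")
changed_values = value_iteration(changed_topology, goal="d")
line_path = greedy_path(line_topology, line_values, "a", "d")
changed_path = greedy_path(changed_topology, changed_values, "a", "d")
```

Recomputing the value table for every topology change, as this sketch does, is the cost that a learned planner aims to avoid.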

Author information

Authors and Affiliations

School of Aeronautics and Astronautics, Zhejiang University, Hangzhou, 310027, China
Wei Li (李伟), Bowei Yang (杨波威) & Guanghua Song (宋广华)
School of Computer Science and Technology, Zhejiang University, Hangzhou, 310027, China
Xiaohong Jiang (姜晓红)

Contributions

Wei LI and Bowei YANG designed the research. Wei LI processed the data and drafted the manuscript. Guanghua SONG and Xiaohong JIANG helped organize the manuscript. Wei LI and Bowei YANG revised and finalized the paper.

Corresponding author

Correspondence to Bowei Yang (杨波威).

Ethics declarations

Wei LI, Bowei YANG, Guanghua SONG, and Xiaohong JIANG declare that they have no conflict of interest.

Additional information

Project supported by the National Natural Science Foundation of China (No. 61501399), SAIC MOTOR (No. 1925), and the National Key R&D Program of China (No. 2018AAA0102302).

About this article

Cite this article

Li, W., Yang, B., Song, G. et al. Dynamic value iteration networks for the planning of rapidly changing UAV swarms. Front Inform Technol Electron Eng 22, 687–696 (2021). https://doi.org/10.1631/FITEE.1900712

Received: 19 December 2019
Accepted: 27 June 2020
Published: 12 January 2021
Issue Date: May 2021
DOI: https://doi.org/10.1631/FITEE.1900712
