Elsevier

Automatica

Volume 50, Issue 3, March 2014, Pages 809-820
Cooperative and Geometric Learning Algorithm (CGLA) for path planning of UAVs with limited information

https://doi.org/10.1016/j.automatica.2013.12.035

Abstract

In this paper, we propose a new learning algorithm, named the Cooperative and Geometric Learning Algorithm (CGLA), to solve problems of maneuverability, collision avoidance and information sharing in path planning for Unmanned Aerial Vehicles (UAVs). The contributions of CGLA are threefold: (1) CGLA is designed for path planning based on cooperation of multiple UAVs. Technically, CGLA exploits a newly defined individual cost matrix, which leads to an efficient path planning algorithm for multiple UAVs. (2) The convergence of the proposed algorithm for calculating the cost matrix is proven theoretically, and the optimal path in terms of path length and risk measure from a starting point to a target point can be calculated in polynomial time. (3) In CGLA, the proposed individual weight matrix can be efficiently calculated and adaptively updated based on the geometric distance and risk information shared among UAVs. Finally, risk evaluation is introduced for the first time in this paper for UAV navigation, and extensive computer simulation results validate the effectiveness and feasibility of CGLA for safe navigation of multiple UAVs.

Introduction

Unmanned Aerial Vehicles (UAVs) have been widely applied in military fields and civil industries, and research interest in this area has grown significantly over the past decade (Bortoff, 2008, Chandler and Pachter, 2002, Chandler and Rasmussen, 2000, Rabbath et al., 2004, Zheng et al., 2005). Maneuverability, collision avoidance and information sharing are among the most challenging problems in UAV path planning. Recent work has addressed these challenges for both single and multiple UAVs (Bellingham et al., 2002, Flint and Fernandez, 2005, Kim et al., 2006, Nikolos et al., 2007, Wu et al., 2011).

In previous works, Voronoi Graph Search (Bortoff, 2008) and Visibility Graph Search (Bellingham et al., 2002) are among the earliest path planning methods for both single and multiple UAVs, but they have been proven effective only in simple environments. Both methods fail when the map information is only partially available, especially when some obstacles are not detected. The A2D method reported in Kim et al. (2006) can be considered a rule and inference based approach. When the environment becomes complicated, the method fails to find a good path to escape from a dangerous zone. Evolutionary algorithms (Nikolos et al., 2007) have been used as a viable candidate to solve path planning problems effectively and can provide feasible solutions within a short time. Unlike the single-UAV case, path planning for multiple UAVs mainly concentrates on the collaborative framework and information sharing (Bauso et al., 2004, Beard and Stepanyan, 2003, Dogman, 2003, Flint and Fernandez, 2005, Flint et al., 2002, Girard et al., 2007, Shanmugavel et al., 2010). Researchers in Bellingham et al. (2002) use Mixed-integer Linear Programming to find a global path for multiple UAVs in order to avoid collision. However, the method fails to handle sudden changes in a local region. It is widely known that Dynamic Programming (DP) can be used to obtain an optimal path when full information and unlimited computation resources are available (Flint & Fernandez, 2005). In Flint et al. (2002), a dynamic programming method is presented to produce near-optimal trajectories for multiple UAVs to cooperatively search for targets. However, path planning for multiple UAVs generally depends on individual behaviors, so global methods such as dynamic programming are not directly suitable. A number of improved versions of DP have since been proposed.
For example, a so-called stochastic dynamic programming method is proposed for UAV path planning in Girard et al. (2007). The method is actually based on random parameters, specifically the number of flying times. The cost function is one of the key issues in DP and its variants. The distance between the center of mass and the target is used in Bauso et al. (2004), and the distance and cooperation measures are exploited in Flint and Fernandez (2005) for UAV path planning. In Dogman (2003), a path planning algorithm is proposed based on a map with probability of threats, which is built from a priori surveillance data. In Bauso et al. (2004) and Beard and Stepanyan (2003), the researchers develop a novel hybrid model and design consensus protocols for the management of information. They further synthesize local predictive controllers through a distributed, scalable and suboptimal neuro-dynamic programming algorithm. Fuzzy reasoning, or approximate reasoning, has been an active topic in the fuzzy community since Zadeh’s pioneering work (Zadeh, 1974). In Zhao, Zheng, Liu, Cai, and Lin (2011) and Zheng, Wu, Liu, and Cai (2011), fuzzy reasoning is treated as a process of optimization rather than logical inference in UAV path planning. An explicit feedback mechanism and the ‘virtual obstacle’ method, the so-called feedback based CRI (FBCRI), are embedded into the optimal fuzzy reasoning method to solve the path planning problem for multiple UAVs (Zhao et al., 2011, Zheng et al., 2011). In FBCRI, the fuzzy rule base is updated during the reasoning process by incorporating newly generated rules into the original rule base (Zheng et al., 2011). Furthermore, by embedding some virtual sub-goals into the FBCRI based approach, a new cooperative path planning approach based on virtual sub-goals (CPVS) is proposed in Zhao et al. (2011).
However, the analytic robustness analysis of various fuzzy reasoning methods is often very complicated (Cai and Zhang, 2008, Zheng et al., 2011). Although there has been significant progress on path planning for multiple UAVs based on a given risk map and logical feedback controller, to the best of our knowledge, path planning for multiple UAVs based on adaptive information sharing has not been well studied in the literature.

In this paper, we address the path planning problem for both single and multiple UAVs in a dynamically changing environment from the perspectives of information sharing and reinforcement learning (Even-Dar and Mansour, 2003, Watkins, 1989, Zhang, Mao, Liu, Liu, and Zheng, 2013). In path planning, Q-Learning (Watkins, 1989), a reinforcement learning method, performs well when the entire map is known to the planners (Mao et al., 2012, Zhang, Mao, Liu, Liu, and Zheng, 2013). Geometric distance and information sharing are believed to be very important for path planning when part of the map is changing; Zhang, Mao, Liu, and Liu (2013) exploit them through a weight matrix and a reward matrix shared by all UAVs in a random process. However, such valuable information is not used by traditional reinforcement learning approaches. On the theoretical side, the convergence of Q-Learning has only been studied in a probabilistic manner (Even-Dar & Mansour, 2003). Motivated by the above observations, we propose a new algorithm, called the Cooperative and Geometric Learning Algorithm (CGLA), to build a general path-planning model based on a criterion that combines geometric distance and integral risk information. The convergence and complexity of the proposed method are also analyzed theoretically in this paper.

The concepts of the Individual Weight Matrix and the cost matrix are proposed in this paper, and they lead to efficient path planning for both single and multiple UAVs. Both matrices can be dynamically updated by taking both the distance information and the integral risk into consideration. In Zhang, Mao, Liu, and Liu (2013), a single weight matrix or cost matrix is used by all UAVs, which can be more efficient but leads to poor path planning results when no communication is available among UAVs. This paper exploits an individual weight matrix and cost matrix for more reasonable path planning. For multiple UAVs, the “virtual obstacle” is proposed and embedded into the weight matrix calculation to prevent possible collisions. Extensive computer simulation results show that the proposed approach performs very well in terms of path length and the integral risk measure, which is validated to be a good criterion for path planning.

The rest of the paper is organized as follows: Section 2 describes the UAV threat environment modeling. Sections 3 and 4 present the main components of the CGLA algorithm. The extensive computer simulation results are given in Section 5, and Section 6 concludes the paper.

Section snippets

Modeling of UAV environment

UAVs often fly in low-altitude urban airspace or mountainous environments, which need to be simulated to evaluate the performance of different path planning methods. UAVs are vulnerable to attack from the ground or from other UAVs. They therefore need to keep a certain distance from regions of high risk to ensure safe flight. The distribution of the probabilistic risk associated with an obstacle on a map can be used to compute the risk at any location on the map. For example, the probabilistic risk of the area
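Because the snippet above is truncated before the risk model is stated, the following is only a minimal sketch of how a grid risk map might be computed from threat locations. The Gaussian falloff, the independence assumption used to combine threats, and the names `risk_map`, `threats` and `sigma` are all illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def risk_map(shape, threats):
    """Return a grid of risk values in [0, 1].

    threats: list of (row, col, sigma) tuples (hypothetical format).
    The Gaussian falloff and the independence assumption below are
    illustrative choices, not the model used in the paper.
    """
    rows, cols = np.indices(shape)
    survive = np.ones(shape)  # probability of not being hit by any threat
    for r0, c0, sigma in threats:
        d2 = (rows - r0) ** 2 + (cols - c0) ** 2
        p = np.exp(-d2 / (2.0 * sigma ** 2))  # risk from this single threat
        survive *= 1.0 - p
    return 1.0 - survive

# Two hypothetical threats on a 20 x 20 map
risk = risk_map((20, 20), [(5, 5, 2.0), (12, 14, 3.0)])
```

Under these assumptions the risk equals 1 at a threat's own cell and decays smoothly with distance, so any cell of the map can be queried directly as `risk[row, col]`.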

The Individual Weight Matrix

The relationship between action and state (Watkins, 1989, Zhang, Mao, Liu, Liu, and Zheng, 2013) is a very important issue for the path planning problem. First, in order to find the next point from the current point in a given map, we need to know the relationship, or weight, between any two positions or points. Second, the action taken at each step in path planning needs to be considered in terms of both the distance and the risk information for two neighboring points in a given map. In the
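To make the idea of a weight between neighboring points concrete, here is a hedged sketch that builds a weight matrix from a grid risk map. The 8-connected neighborhood, the additive form `distance + K * risk`, and the trapezoidal estimate of the risk along an edge are assumptions made for illustration; the paper's exact definition of the Individual Weight Matrix is not reproduced in this excerpt.

```python
import numpy as np

def weight_matrix(risk, K=10.0):
    """Build a dense weight matrix A for an 8-connected grid.

    A[i, j] is the assumed cost of moving from cell i to neighboring
    cell j: Euclidean distance plus K times an estimate of the risk
    integrated along the edge. Non-neighbors get infinite weight.
    """
    h, w = risk.shape
    n = h * w
    A = np.full((n, n), np.inf)
    np.fill_diagonal(A, 0.0)
    steps = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
             (0, 1), (1, -1), (1, 0), (1, 1)]
    for r in range(h):
        for c in range(w):
            for dr, dc in steps:
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w:
                    dist = np.hypot(dr, dc)
                    # trapezoidal estimate of the risk along the edge
                    edge_risk = 0.5 * (risk[r, c] + risk[rr, cc]) * dist
                    A[r * w + c, rr * w + cc] = dist + K * edge_risk
    return A

# 3 x 3 map with a single high-risk cell in the center
demo_risk = np.zeros((3, 3))
demo_risk[1, 1] = 1.0
A = weight_matrix(demo_risk, K=10.0)
```

In this sketch a larger `K` makes edges through risky cells more expensive relative to their length, which is one simple way to trade path length against safety.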

Cooperative and geometric learning algorithm

CGLA is a general method designed for path planning of both single and multiple UAVs. For a single UAV, CGLA is executed when a new threatening object is detected and the weight matrix A is updated. For multiple UAVs, CGLA is scheduled to be performed when A is changed due to the detected risk and the dynamic information shared by other UAVs.
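The replanning loop described above can be sketched as follows. This is not the CGLA update itself, which is not given in this excerpt; it is a standard Bellman-Ford-style relaxation, shown under the assumption that the cost of a point is the minimum over its neighbors of edge weight plus the neighbor's cost. The names `cost_to_target` and `extract_path` and the four-node example graph are hypothetical.

```python
import numpy as np

def cost_to_target(A, target):
    """Iterate G[i] = min_j (A[i, j] + G[j]) until G stops changing.

    Assumes A has non-negative entries and a zero diagonal.
    """
    n = A.shape[0]
    G = np.full(n, np.inf)
    G[target] = 0.0
    for _ in range(n):
        G_new = np.min(A + G[None, :], axis=1)
        G_new[target] = 0.0
        if np.array_equal(G_new, G):
            break
        G = G_new
    return G

def extract_path(A, G, start, target):
    """Follow the neighbor minimizing A[i, j] + G[j] (positive edge weights assumed)."""
    path = [start]
    while path[-1] != target:
        i = path[-1]
        scores = A[i] + G
        scores[i] = np.inf  # forbid staying in place
        nxt = int(np.argmin(scores))
        if not np.isfinite(scores[nxt]):
            raise ValueError("target unreachable")
        path.append(nxt)
    return path

# Hypothetical 4-node graph: 0-1, 1-3, 0-2 cost 1; 2-3 costs 5
A = np.full((4, 4), np.inf)
np.fill_diagonal(A, 0.0)
for i, j, wgt in [(0, 1, 1.0), (1, 3, 1.0), (0, 2, 1.0), (2, 3, 5.0)]:
    A[i, j] = A[j, i] = wgt

G = cost_to_target(A, target=3)
path = extract_path(A, G, start=0, target=3)

# A new threat is detected near edge 1-3: update A, then replan
A[1, 3] = A[3, 1] = 10.0
G2 = cost_to_target(A, target=3)
path2 = extract_path(A, G2, start=0, target=3)
```

In the example the planner first routes through node 1 and, once the weight of edge 1-3 rises, switches to the route through node 2. For multiple UAVs, the same recomputation would be triggered whenever shared information changes a UAV's own copy of `A`.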

Computer simulation results

Extensive simulations are performed in this section. The same map is used for all competing methods, such as Behavior Coordination and Virtual (BCV) (Wu et al., 2011), Q-Learning (Zhang, Mao, Liu, & Liu, 2013), CPVS (Zhao et al., 2011), and FBCRI (Zheng et al., 2011) (with information sharing for multiple UAVs). The BCV method (Wu et al., 2011) is a real-time path planning approach based on coordination of the global and local behaviors. While most of the existing works only use path length as
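The two criteria named in this paper, path length and integral risk, can be evaluated for a candidate path as in the sketch below. The grid representation of the path as `(row, col)` cells and the trapezoidal approximation of the risk integral are assumptions for illustration, not the paper's evaluation code.

```python
import numpy as np

def path_length(path):
    """Sum of Euclidean distances between consecutive (row, col) cells."""
    pts = np.asarray(path, dtype=float)
    return float(np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1)))

def integral_risk(path, risk):
    """Approximate the risk integrated along the path (trapezoidal rule)."""
    total = 0.0
    for (r0, c0), (r1, c1) in zip(path, path[1:]):
        seg = np.hypot(r1 - r0, c1 - c0)
        total += 0.5 * (risk[r0, c0] + risk[r1, c1]) * seg
    return float(total)

# 3 x 3 map with one risky cell in the center
risk = np.zeros((3, 3))
risk[1, 1] = 1.0

direct = [(0, 0), (1, 1), (2, 2)]                   # short but risky
detour = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]   # longer but safe

direct_metrics = (path_length(direct), integral_risk(direct, risk))
detour_metrics = (path_length(detour), integral_risk(detour, risk))
```

Here the direct path is shorter but accumulates risk, while the detour is longer with zero integral risk, which illustrates why path length alone can be a misleading criterion.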

Conclusions

In this paper, we propose a new Cooperative and Geometric Learning Algorithm (CGLA) to solve the path planning problem for multiple UAVs. Compared with other existing methods, CGLA leads to a very simple path planning solution. The parameters K and N, which are studied in detail for CGLA, balance the safety and economy of the paths. K can be adjusted to suit different kinds of UAV tasks, which gives more flexibility in finding the path. CGLA is also designed for multi-UAV collaborative

References (26)

  • M. Flint et al. Approximate dynamic programming methods for cooperative UAV search.
  • Zheng Zheng et al. A feedback based CRI approach to fuzzy reasoning. Applied Soft Computing (2011).
  • Bauso, D., Giarrè, L., & Pesenti, R. (2004). Multiple UAV cooperative path planning via neuro-dynamic programming. In...
  • R.W. Beard et al. Information consensus in distributed multiple vehicle coordinated control.
  • Bellingham, John, Richards, Arthur, & How, Jonathan P. (2002). Receding horizon control of autonomous aerial vehicles....
  • Bortoff, Scott A. (2008). Path planning for UAVs. In IEEE proceedings of the American control conference (pp....
  • K.Y. Cai et al. Fuzzy reasoning as a control problem. IEEE Transactions on Fuzzy Systems (2008).
  • Chandler, P.R., & Pachter, Meir (2002). Complexity in UAV cooperative control. In The proceedings of the American...
  • Chandler, P.R., & Rasmussen, S. (2000). UAV cooperative path planning. In AIAA Guidance, navigation, and control...
  • Dogman, A. (2003). Probabilistic approach in path planning for UAVs. In Proc. of IEEE international symposium on...
  • Eyal Even-Dar et al. Learning rates for Q-learning. Journal of Machine Learning Research (2003).
  • Flint, M., Polycarpou, M., & Fernandez-Gaucherand, E. (2002). Cooperative control for multiple autonomous UAV’s...
  • Girard, A., Darbha, S., Pachter, M., & Chandler, P. (2007). Stochastic dynamic programming for uncertainty handling in...

Baochang Zhang received the B.S., M.S., and Ph.D. degrees in Computer Science from the Harbin Institute of Technology, Harbin, China, in 1999, 2001, and 2006, respectively. From 2006 to 2008, he was a research fellow with the Chinese University of Hong Kong, Hong Kong, and with Griffith University, Brisbane, Australia. Currently, he is an associate professor with the Science and Technology on Aircraft Control Laboratory, School of Automation Science and Electrical Engineering, Beihang University, Beijing, China. He was supported by the Program for New Century Excellent Talents in University of Ministry of Education of China. His current research interests include pattern recognition, machine learning, face recognition, and wavelets.

Wanquan Liu received the B.Sc. degree in Applied Mathematics from Qufu Normal University, PR China, in 1985, the M.Sc. degree in Control Theory and Operation Research from Chinese Academy of Science in 1988, and the Ph.D. degree in Electrical Engineering from Shanghai Jiaotong University, in 1993. He has held the ARC Fellowship, U2000 Fellowship and JSPS Fellowship and has attracted over 2 million dollars in research funds from different sources. He is currently an Associate Professor in the Department of Computing at Curtin University and is on the editorial board of seven international journals. His current research interests include large-scale pattern recognition, signal processing, machine learning, and control systems.

Zhili Mao received the B.S. degree in Automation Science from Beihang University, Beijing, China, in 2012. Since 2013, he has been a postgraduate student pursuing his M.Phil. degree at the Fok Ying Tung Graduate School, Hong Kong University of Science and Technology, Clear Water Bay, New Territories, Hong Kong.

Jianzhuang Liu received the Ph.D. degree in Computer Vision from The Chinese University of Hong Kong, Hong Kong, in 1997. From 1998 to 2000, he was a research fellow with Nanyang Technological University, Singapore. From 2000 to 2012, he was a postdoctoral fellow, then an assistant professor, and then an adjunct associate professor with The Chinese University of Hong Kong. He joined Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, as a professor, in 2011. He is currently a chief scientist with Huawei Technologies Co. Ltd., Shenzhen, China. He has published more than 100 papers, most of which are in prestigious journals and conferences in computer science. His research interests include computer vision, image processing, machine learning, multimedia, and graphics.

Linlin Shen received the Ph.D. degree from the University of Nottingham, UK, in 2005. From 2005 to 2006, he was a research fellow with the University of Nottingham, working on MRI brain image processing. He has been with Shenzhen University, China, since 2006 and is currently a professor at the School of Computer Science and Software Engineering. His research interests include computer vision, image processing, and pattern recognition. He received the Most Cited Paper award from the journal of Image and Vision Computing in 2010, and his team was the winner of the International Competition on Cells Classification by Fluorescent Image Analysis organized by ICIP 2013.

This work was supported in part by the Natural Science Foundation of China, under Contracts 60903065, 61039003 and 61272052, in part by the Fundamental Research Funds for the Central Universities, and by the Program for New Century Excellent Talents in University of Ministry of Education of China. The material in this paper was partially presented at the ICUAS conference. This paper was recommended for publication in revised form by Editor Berç Rüstem.
