Abstract
In this paper, we consider the cooperative decision-making problem for multi-target tracking in multi-unmanned aerial vehicle (UAV) systems. The multi-UAV decision-making problem is modeled in the framework of distributed multi-agent partially observable Markov decision processes (MPOMDPs). Specifically, the state of the targets is represented by the joint multi-target probability distribution (JMTPD), which is estimated by a distributed information fusion strategy. In the information fusion process, the most accurate estimation is selected to propagate through the whole network in finite time. We propose a max-consensus protocol to guarantee the consistency of the JMTPD. It is proven that the max-consensus can be achieved in the connected communication graph after a limited number of iterations. Based on the consistent JMTPD, the distributed partially observable Markov decision algorithm is used to make tracking decisions. The proposed method uses the Fisher information to bid for targets in a distributed auction. The bid is based upon the reward value of the individual UAV’s POMDPs, thereby removing the need to optimize the global reward in the MPOMDPs. Finally, the cooperative decision-making approach is deployed in a simulation of a multi-target tracking problem. We compare our proposed algorithm with the centralized method and the greedy approach. The simulation results show that the proposed distributed method has a similar performance to the centralized method, and outperforms the greedy approach.









References
Kassas, Z. M., & Özgüner, Ü. (2010). A nonlinear filter coupled with hospitability and synthetic inclination maps for in-surveillance and out-of-surveillance tracking. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 40(1), 87–97.
Cesare, K., Skeele, R., Yoo, S. H., Zhang, Y., & Hollinger, G. (2015). Multi-UAV exploration with limited communication and battery. In 2015 IEEE international conference on robotics and automation (ICRA) (pp. 2230–2235). IEEE.
Darrah, M., Wilhelm, J., Munasinghe, T., Duling, K., Yokum, S., Sorton, E., et al. (2015). A flexible genetic algorithm system for multi-UAV surveillance: Algorithm and flight testing. Unmanned Systems, 3(1), 49–62.
Capitán, J., Merino, L., & Ollero, A. (2016). Cooperative decision-making under uncertainties for multi-target surveillance with multiples UAVs. Journal of Intelligent & Robotic Systems, 84(1–4), 371–386. Springer.
Caraballo, L., Acevedo, J., Díaz-Báñez, J., Arrue, B., Maza, I., & Ollero, A. (2014). The block-sharing strategy for area monitoring missions using a decentralized multi-UAV system. In 2014 International conference on unmanned aircraft systems (ICUAS) (pp. 602–610). IEEE.
Mersheeva, V., & Friedrich, G. (2015). Multi-UAV monitoring with priorities and limited energy resources. In Proceedings of the 25th international conference on automated planning and scheduling (pp. 347–356).
Qi, J., Song, D., Shang, H., Wang, N., Hua, C., Wu, C., et al. (2016). Search and rescue rotary-wing UAV and its application to the Lushan Ms 7.0 earthquake. Journal of Field Robotics, 33(3), 290–321. Wiley Online Library.
Adamey, E., & Ozguner, U. (2011). Cooperative multitarget tracking and surveillance with mobile sensing agents: A decentralized approach. In 2011 14th International IEEE conference on intelligent transportation systems (ITSC) (pp. 1916–1922). IEEE.
Dames, P., Tokekar, P., & Kumar, V. (2017). Detecting, localizing, and tracking an unknown number of moving targets using a team of mobile robots. The International Journal of Robotics Research, 36(13–14), 1540–1553. SAGE.
Tokekar, P., Isler, V., & Franchi, A. (2014). Multi-target visual tracking with aerial robots. In 2014 IEEE/RSJ international conference on intelligent robots and systems (IROS 2014) (pp. 3067–3072). IEEE.
Bayram, H., Stefas, N., Engin, K. S., & Isler, V. (2017). Tracking wildlife with multiple UAVs: System design, safety and field experiments. In 2017 International symposium on multi-robot and multi-agent systems (MRS) (pp. 97–103). IEEE.
Hausman, K., Müller, J., Hariharan, A., Ayanian, N., & Sukhatme, G. S. (2016). Cooperative control for target tracking with onboard sensing. In M. A. Hsieh, O. Khatib, & V. Kumar (Eds.), Experimental robotics (pp. 879–892). Berlin: Springer.
Schlotfeldt, B., Thakur, D., Atanasov, N., Kumar, V., & Pappas, G. J. (2018). Anytime planning for decentralized multirobot active information gathering. IEEE Robotics and Automation Letters, 3(2), 1025–1032. IEEE.
Nestmeyer, T., Giordano, P. R., Bülthoff, H. H., & Franchi, A. (2017). Decentralized simultaneous multi-target exploration using a connected network of multiple robots. Autonomous Robots, 41(4), 989–1011. Springer.
Mahmoud, M. S., & Khalid, H. M. (2013). Distributed Kalman filtering: A bibliographic review. IET Control Theory & Applications, 7(4), 483–501. IET.
Capitan, J., Spaan, M. T., Merino, L., & Ollero, A. (2013). Decentralized multi-robot cooperation with auctioned POMDPs. The International Journal of Robotics Research, 32(6), 650–671. SAGE.
Capitán, J., Merino, L., Caballero, F., & Ollero, A. (2011). Decentralized delayed-state information filter (DDSIF): A new approach for cooperative decentralized tracking. Robotics and Autonomous Systems, 59(6), 376–388. Elsevier.
Adamey, E., & Ozguner, U. (2012). A decentralized approach for multi-UAV multitarget tracking and surveillance. In SPIE Defense, Security, and Sensing (Vol. 8389, pp. 838915-1–838915-6). International Society for Optics and Photonics.
Chagas, R. A. J., & Waldmann, J. (2015). A novel linear, unbiased estimator to fuse delayed measurements in distributed sensor networks with application to UAV fleet. In D. Choukroun, Y. Oshman, J. Thienel, & M. Idan (Eds.), Advances in estimation, navigation, and spacecraft control (pp. 135–157). Berlin: Springer.
Fanti, M. P., Mangini, A. M., & Ukovich, W. (2012). A quantized consensus algorithm for distributed task assignment. In 2012 IEEE 51st annual conference on decision and control (CDC) (pp. 2040–2045). IEEE.
Luo, L., Chakraborty, N., & Sycara, K. (2015). Distributed algorithms for multirobot task assignment with task deadline constraints. IEEE Transactions on Automation Science and Engineering, 12(3), 876–888. IEEE.
Peng, Z., Yang, S., Wen, G., & Rahmani, A. (2014). Distributed consensus-based robust adaptive formation control for nonholonomic mobile robots with partial known dynamics. In Mathematical Problems in Engineering. https://doi.org/10.1155/2014/670497.
Seo, J., Kim, Y., Kim, S., & Tsourdos, A. (2012). Consensus-based reconfigurable controller design for unmanned aerial vehicle formation flight. Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering, 226(7), 817–829. SAGE.
Maggs, M. K., O’Keefe, S. G., & Thiel, D. V. (2012). Consensus clock synchronization for wireless sensor networks. IEEE Sensors Journal, 12(6), 2269–2277.
Battistelli, G., & Chisci, L. (2014). Kullback–Leibler average, consensus on probability densities, and distributed state estimation with guaranteed stability. Automatica, 50(3), 707–718. Elsevier.
Palacios-Gasós, J. M., Montijano, E., Sagüés, C., & Llorente, S. (2016). Distributed coverage estimation and control for multirobot persistent tasks. IEEE Transactions on Robotics, 32(6), 1444–1460. IEEE.
Di Paola, D., Petitti, A., & Rizzo, A. (2015). Distributed Kalman filtering via node selection in heterogeneous sensor networks. International Journal of Systems Science, 46(14), 2572–2583. Taylor & Francis.
Smallwood, R. D., & Sondik, E. J. (1973). The optimal control of partially observable Markov processes over a finite horizon. Operations Research, 21(5), 1071–1088. INFORMS.
Kaelbling, L. P., Littman, M. L., & Cassandra, A. R. (1998). Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101(1), 99–134. Elsevier.
Ponda, S. S., Johnson, L. B., Geramifard, A., & How, J. P. (2015). Cooperative mission planning for multi-UAV teams. In K. P. Valavanis & G. J. Vachtsevanos (Eds.), Handbook of unmanned aerial vehicles (pp. 1447–1490). Berlin: Springer.
Oliehoek, F. A. (2012). Decentralized POMDPs. In Reinforcement Learning (Vol. 12, pp. 471–503). Springer.
Wu, F., Zilberstein, S., & Chen, X. (2011). Online planning for multi-agent systems with bounded communication. Artificial Intelligence, 175(2), 487–511. Elsevier.
Vaisenberg, R., Della Motta, A., Mehrotra, S., & Ramanan, D. (2014). Scheduling sensors for monitoring sentient spaces using an approximate POMDP policy. Pervasive and Mobile Computing, 10, 83–103. Elsevier.
Panella, A., & Gmytrasiewicz, P. (2017). Interactive POMDPs with finite-state models of other agents. Autonomous Agents and Multi-Agent Systems, 31(4), 861–904.
Yu, H., Meier, K., Argyle, M., & Beard, R. W. (2015). Cooperative path planning for target tracking in urban environments using unmanned air and ground vehicles. IEEE/ASME Transactions on Mechatronics, 20(2), 541–552. IEEE.
Farmani, N., Sun, L., & Pack, D. (2015). Tracking multiple mobile targets using cooperative unmanned aerial vehicles. In 2015 International conference on unmanned aircraft systems (ICUAS) (pp. 395–400). IEEE.
Zhang, K., Collins, E. G., Jr., & Shi, D. (2012). Centralized and distributed task allocation in multi-robot teams via a stochastic clustering auction. ACM Transactions on Autonomous and Adaptive Systems (TAAS), 7(2), 21.
Edalat, N., Tham, C. K., & Xiao, W. (2012). An auction-based strategy for distributed task allocation in wireless sensor networks. Computer Communications, 35(8), 916–928. Elsevier.
Spaan, M. T., Veiga, T. S., & Lima, P. U. (2015). Decision-theoretic planning under uncertainty with information rewards for active cooperative perception. Autonomous Agents and Multi-Agent Systems, 29(6), 1157–1185. Springer.
Zhao, Y., Wang, X., Cong, Y., & Shen, L. (2018). Information geometry based action decision-making for target tracking by fixed-wing UAV: From algorithm design to theory analysis. International Journal of Advanced Robotic Systems, 15(4), 1729881418787061.
Ragi, S., & Chong, E. K. (2013). UAV path planning in a dynamic environment via partially observable Markov decision process. IEEE Transactions on Aerospace and Electronic Systems, 49(4), 2397–2412. IEEE.
Burguera, A., González, Y., & Oliver, G. (2009). Sonar sensor models and their application to mobile robot localization. Sensors, 9(12), 10217–10243. Molecular Diversity Preservation International.
Nejad, B. M., Attia, S. A., & Raisch, J. (2009). Max-consensus in a max-plus algebraic setting: The case of fixed communication topologies. In XXII international symposium on information, communication and automation technologies, 2009 (ICAT 2009) (pp. 1–7). IEEE.
Petitti, A., Di Paola, D., Rizzo, A., & Cicirelli, G. (2011). Consensus-based distributed estimation for target tracking in heterogeneous sensor networks. In 2011 50th IEEE conference on decision and control and European control conference (CDC-ECC) (pp. 6648–6653). IEEE.
Bui, M., Butelle, F., & Lavault, C. (2004). A distributed algorithm for constructing a minimum diameter spanning tree. Journal of Parallel and Distributed Computing, 64(5), 571–577. Elsevier.
Burkard, R. E. (2002). Selected topics on assignment problems. Discrete Applied Mathematics, 123(1), 257–302. Elsevier.
Rao, C. R. (1992). Information and the accuracy attainable in the estimation of statistical parameters. In N. L. Johnson, A. W. Kemp, & S. Kotz (Eds.), Breakthroughs in statistics (pp. 235–247). Berlin: Springer.
Additional information
This research is sponsored by the National Key Laboratory of Science and Technology on UAV, Northwestern Polytechnical University, under the Grant Number 614230110080817.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary material 1 (mp4 13621 KB)
Appendices
The properties of the trace of the covariance matrix
The multi-target state estimates are produced by a Kalman filter. The state of the filter is represented by two variables:
- \(\hat{{\mathbf {s}}}^t(k)\), the a posteriori state estimate of the target at time k;
- \(\hat{{\mathbf {P}}}^t(k)\), the a posteriori error covariance matrix (a measure of the accuracy of the state estimate).
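The covariance matrix therefore describes the uncertainty of the estimate.
One generalization to a scalar-valued covariance for vector-valued random variables can be obtained by interpreting the deviation as the Euclidean distance:
$$\begin{aligned} \mathrm{E}\Big [\big ({{\mathbf {s}}}^t(k)-\hat{{\mathbf {s}}}^t(k)\big )^{\mathrm{T}}\big ({{\mathbf {s}}}^t(k)-\hat{{\mathbf {s}}}^t(k)\big )\Big ]. \end{aligned}$$
The expression can be rewritten as
$$\begin{aligned} \sum _{i}\mathrm{E}\Big [\big ({s}_i^t(k)-{\hat{s}}_i^t(k)\big )^2\Big ], \end{aligned}$$
where \({s}_i^t(k)\) and \({\hat{s}}_i^t(k)\) are the ith elements of the target state \({{\mathbf {s}}}^t(k)\) and of its mean \(\hat{{\mathbf {s}}}^t(k)\). The (i, i)th element of the covariance matrix \(\hat{{\mathbf {P}}}^t(k)\) is \({{\hat{P}}_{ii}^t(k)}=\mathrm{E}\big [({s}_i^t(k)-{\hat{s}}_i^t(k))^2\big ]\). Finally, this can be simplified to
$$\begin{aligned} \sum _{i}{\hat{P}}_{ii}^t(k)=\mathrm{tr}\big (\hat{{\mathbf {P}}}^t(k)\big ), \end{aligned}$$
which is the trace of the covariance matrix.
In conclusion, the trace of the covariance matrix is the expected squared Euclidean deviation of the state estimate, so it can serve as a scalar-valued measure of the accuracy of the state estimation.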
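The identity between the trace and the mean squared Euclidean deviation can be checked numerically. The following is a minimal sketch in plain Python; the function names and the four sample points are illustrative assumptions, not part of the paper, and the covariance is computed from samples rather than from a Kalman filter.

```python
def covariance(samples):
    """Return (mean, covariance matrix) of a list of equally weighted vectors."""
    n = len(samples)
    d = len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(d)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / n
            for j in range(d)]
           for i in range(d)]
    return mean, cov


def trace(matrix):
    """Sum of the diagonal elements of a square matrix."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def mean_squared_distance(samples, mean):
    """Average squared Euclidean distance between the samples and their mean."""
    return sum(sum((s[i] - mean[i]) ** 2 for i in range(len(mean)))
               for s in samples) / len(samples)


# Hypothetical 2-D samples (assumption, chosen so the arithmetic is exact).
samples = [(0.0, 0.0), (2.0, 0.0), (0.0, 4.0), (2.0, 4.0)]
mean, cov = covariance(samples)
print(trace(cov), mean_squared_distance(samples, mean))
```

Both printed numbers coincide, which is why a single scalar (the trace) is enough to rank estimates by accuracy.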
Proof of Theorem 1
The proof is based on the max-plus algebra defined in [43]. It is a powerful tool for timed cyclic discrete-event systems and allows for a compact representation of weighted graphs.
The max-plus algebra consists of two binary operations, \(\oplus \) and \(\otimes \), on the set \({\mathbb {R}}_{\max } := {\mathbb {R}}\cup \{-\infty \}\). The operations are defined as follows:
$$\begin{aligned} a\oplus b:=\max (a,b),\qquad a\otimes b:=a+b. \end{aligned}$$
The neutral element of the max-plus addition \(\oplus \) is \(-\infty \), denoted by \(\varepsilon \). The neutral element of multiplication \(\otimes \) is 0, denoted by \(\mathrm{e}\). The elements \(\varepsilon \) and \(\mathrm{e}\) are also referred to as the zero and one element of the max-plus algebra. Similar to conventional algebra, the associativity, commutativity, and distributivity of multiplication over addition also hold for the max-plus algebra. Both operations can be extended to matrices in a straightforward way. For \(A,B\in {\mathbb {R}}_{\max }^{m\times n}\),
$$\begin{aligned} (A\oplus B)_{ij}:=a_{ij}\oplus b_{ij}. \end{aligned}$$
For \(A\in {\mathbb {R}}_{\max }^{m\times n}\), \(B\in {\mathbb {R}}_{\max }^{n\times q}\),
$$\begin{aligned} (A\otimes B)_{ij}:=\bigoplus _{k=1}^{n}a_{ik}\otimes b_{kj}=\max _{k=1,\ldots ,n}(a_{ik}+b_{kj}). \end{aligned}$$
Multiplication of a matrix \(A\in {\mathbb {R}}_{\max }^{m\times n}\) and a scalar \(\alpha \in {\mathbb {R}}_{\max }\) is defined by
$$\begin{aligned} (\alpha \otimes A)_{ij}:=\alpha \otimes a_{ij}. \end{aligned}$$
Note that, as in conventional algebra, the multiplication symbol \(\otimes \) is often omitted.
In the sequel, we also need matrices of zero elements, denoted by N, and of one elements, denoted by E. The identity matrix I is a square matrix with
$$\begin{aligned} (I)_{ij}:={\left\{ \begin{array}{ll} \mathrm{e}, &{} i=j,\\ \varepsilon , &{} \text{ otherwise }. \end{array}\right. } \end{aligned}$$
For any matrix \(A\in {\mathbb {R}}_{\max }^{n\times n}\), its precedence graph \({\mathcal {G}}(A)\) is defined in the following way: it has n nodes, denoted by \(1,\ldots ,n\), and (j, i) is an edge if and only if \(a_{ij}\ne \varepsilon \). In this case \(a_{ij}\) is the weight of edge (j, i). Then
- A path in \({\mathcal {G}}(A)\) is a sequence of \(p > 1\) nodes, denoted by \(\rho :=i_1,\ldots ,i_p\), such that \(a_{i_{k+1}i_k}\ne \varepsilon , k=1,\ldots ,p-1\).
- \((A^k)_{ij}\) represents the maximal weight of all paths of length k from node j to node i, where
$$\begin{aligned} A^k:=\underbrace{A\otimes A\otimes \ldots \otimes A}_{(k - 1)-\text{ times } \text{ multiplication }},k\ge 1 \end{aligned}$$(40)
and \(A^0=I\).
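The max-plus operations and the matrix power \(A^k\) can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function names are assumptions, and the example matrix encodes a hypothetical three-node line graph 1-2-3 in which every edge and every self-loop has weight \(\mathrm{e}=0\) (the self-loops are an assumption, reflecting that a node always keeps its own value).

```python
EPS = float("-inf")  # epsilon: zero element of the max-plus algebra
E = 0.0              # e: one element of the max-plus algebra


def oplus(a, b):
    """Max-plus addition: a (+) b = max(a, b)."""
    return max(a, b)


def otimes(a, b):
    """Max-plus multiplication: a (x) b = a + b."""
    return a + b


def mat_otimes(A, B):
    """Max-plus matrix product: (A (x) B)_ij = max_k (a_ik + b_kj)."""
    inner = len(B)
    return [[max(otimes(A[i][k], B[k][j]) for k in range(inner))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def identity(n):
    """Max-plus identity: e on the diagonal, epsilon elsewhere."""
    return [[E if i == j else EPS for j in range(n)] for i in range(n)]


def mat_power(A, k):
    """k-th max-plus power of a square matrix, with A^0 = I."""
    result = identity(len(A))
    for _ in range(k):
        result = mat_otimes(result, A)
    return result


# Hypothetical line graph 1-2-3 with self-loops (assumption).
A = [[E, E, EPS],
     [E, E, E],
     [EPS, E, E]]
print(mat_power(A, 1)[0][2])  # epsilon: no path of length 1 between nodes 1 and 3
print(mat_power(A, 2))        # every entry is e: all nodes reachable in 2 steps
```

With unit weights \(\mathrm{e}\), an entry of \(A^k\) is either \(\mathrm{e}\) (a path of length k exists) or \(\varepsilon \) (no such path), which is the property used in the proof.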
On this basis, we define \(A_i\in {\mathbb {R}}_{\max }^{N^u_i\times N^u_i}\) as the matrix representing the communication topology of the connected undirected graph \({\mathcal {G}}^u_i\) in which the ith UAV is located. There exists a path of length d from node m to node n if and only if \((A_i^d)_{mn}=\mathrm{{e}}\), where \(m,n =1,2,\ldots ,N^u_i\). In addition, \(E_{i}\) denotes a particular class of matrices whose ith row and ith column consist of one elements: \((E_{i})_{mn}:=\mathrm{e}\) for \(m=i\) or \(n=i\). The other elements of \(E_{i}\) can take any value.
As mentioned above, the system is dynamic and the communication range is limited. Therefore, the communication graph of all UAVs may be split into multiple independent subgraphs, and the topology changes in real time. Theorem 1 essentially gives the minimum number of iterations that ensures the ith UAV attains the maximum in the subgraph \({\mathcal {G}}^u_i\). Once each UAV in the subgraph has completed its corresponding number of iterations, the states in the entire subgraph achieve the max-consensus.
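We are now ready to complete the proof.
Proof
A necessary and sufficient condition for max-consensus to hold at node i is that
$$\begin{aligned} A_i^{l}=E_i\quad \text{ for } \text{ some } l. \end{aligned}$$(41)
First, the sufficiency. Given an undirected graph \({\mathcal {G}}^u_i\) composed of \(N^u_i\) nodes, we define the initial vector of perception confidence values \(\gamma '^{t}_{m}(0):=[\gamma '^{t}_{1m}(0),\gamma '^{t}_{2m}(0),\ldots ,\gamma '^{t}_{N^u_im}(0)]^\mathrm{T},m=1,2,\ldots ,N^t\).
Eq. (41) implies \(\gamma '^{t}_{im}(l)=\big (E_i\otimes \gamma '^{t}_{m}(0)\big )_{i}\), where \((\cdot )_i\) is the ith element of the column vector, i.e.,
$$\begin{aligned} \gamma '^{t}_{im}(l)=\bigoplus _{n=1}^{N^u_i}(E_i)_{in}\otimes \gamma '^{t}_{nm}(0)=\bigoplus _{n=1}^{N^u_i}\mathrm{e}\otimes \gamma '^{t}_{nm}(0). \end{aligned}$$(42)
Applying the rules for multiplying matrices in the max-plus algebra, we obtain
$$\begin{aligned} \gamma '^{t}_{im}(l)=\max \big \{\gamma '^{t}_{1m}(0)+0,\gamma '^{t}_{2m}(0)+0,\ldots ,\gamma '^{t}_{N^u_im}(0)+0\big \}, \end{aligned}$$(43)
and hence
$$\begin{aligned} \gamma '^{t}_{im}(l)=\max \big \{\gamma '^{t}_{1m}(0),\gamma '^{t}_{2m}(0),\ldots ,\gamma '^{t}_{N^u_im}(0)\big \}. \end{aligned}$$(44)
Necessity is obvious.
If \(A_i^{l}\ne E_i,~\forall l\), then \(\forall l,~\exists k\) s.t. \((A_i^l)_{ik}=\varepsilon \), i.e., \(\gamma '^{t}_{im}(l)\) does not depend on \(\gamma '^{t}_{km}(0)\). If \(\gamma '^{t}_{km}(0)\) is the maximum element of \(\{\gamma '^{t}_{1m}(0),\ldots ,\gamma '^{t}_{N^u_im}(0)\}\), the max-consensus will not hold.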
In (41), \(A^l_i=E_{i}\) implies that there exists a path of length l from node i to every node in \({\mathcal {G}}^u_i\). Take l as the maximum, over all nodes of \({\mathcal {G}}^u_i\), of the length of the shortest path from node i to that node, which is the diameter of the shortest paths tree (SPT) of \({\mathcal {G}}^u_i\) rooted at node i [45].
Therefore, to establish the result of Theorem 1, it suffices to prove that if there exists an l such that \(A_i^l=E_i\) and \(A_i^d \ne E_i\) for all \(d<l\), then
$$\begin{aligned} l=D_i({\mathcal {G}}^u_i), \end{aligned}$$(45)
where \(D_i({\mathcal {G}}^u_i)\) is the diameter of the SPT of \({\mathcal {G}}^u_i\) rooted at node i.
Recall the condition: \(A_i^l=E_i\) and \(A_i^d \ne E_i\) for all \(d<l\) mean that every node in \({\mathcal {G}}^u_i\) is connected to node i by a shortest path of length at most l, and that at least one node requires exactly l steps. As the maximum length of these shortest paths is the diameter of the SPT of \({\mathcal {G}}^u_i\) rooted at node i, Eq. (45) is true. \(\square \)
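The statement of Theorem 1 can be illustrated with a small simulation. The sketch below is an assumption-laden illustration rather than the authors' implementation: it uses a synchronous update in which every node keeps the maximum of its own value and its neighbors' values, a hypothetical four-node line graph, and made-up confidence values. The helper `spt_depth` computes the longest shortest-path distance from the root by breadth-first search, which is the quantity the proof calls the diameter of the SPT rooted at node i.

```python
from collections import deque


def max_consensus_step(values, neighbors):
    """One synchronous iteration: each node takes the max over itself and its neighbors."""
    return [max([values[n]] + [values[m] for m in neighbors[n]])
            for n in range(len(values))]


def spt_depth(neighbors, root):
    """Longest shortest-path distance from root (breadth-first search)."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def iterations_until_max(values, neighbors, node):
    """Number of iterations until `node` holds the global maximum (connected graph assumed)."""
    target = max(values)
    current = list(values)
    steps = 0
    while current[node] != target:
        current = max_consensus_step(current, neighbors)
        steps += 1
    return steps


# Hypothetical line graph 0-1-2-3 and initial confidence values (assumptions).
neighbors = [[1], [0, 2], [1, 3], [2]]
values = [0.2, 0.5, 0.1, 0.9]
for node in range(4):
    print(node, iterations_until_max(values, neighbors, node), spt_depth(neighbors, node))
```

For a particular set of initial values a node may reach the maximum earlier, but the SPT depth is the smallest iteration count that works for every possible placement of the maximum, as in the worst case seen at node 0 here.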
The derivation of the FIM
The FIM [47] at time k is defined by
where \(p ( {\mathbf{z}}^P_{ij}(k)\left| {\mathbf{s}}_i (k)\right. )\) is the batch measurement likelihood. For the sake of simplicity, we omit the time step k in the following derivation. In this tracking scenario, the measurement accuracy is determined by the relative position of the UAV and the target through the covariance matrix \({{\mathbf{C}}_{ij}}\) shown in (2), so the FIM depends only on the position of the UAV and that of the target. Since the target is non-cooperative, the FIM can be changed only by adjusting the position of the UAV. Therefore, it can be restated as:
The batch measurement likelihood in (47) is defined as (48):
In the above formula, \(\varvec{\ell } ({{\mathbf{p}}^u_{i}},{{\mathbf{p}}^t_{j}})\) denotes the true range-bearing value, which is given by
The first order derivative of the log-density function is given by
Then the Fisher information matrix can be written as
where \({\mathbf{G}}_{ij}\) is a square matrix of order 2; \(m,n\in \{1,2\}\) denote the row and column indices of each element in \({\mathbf{G}}_{ij}\); \({\mathbf{p}}^u_{i}(1) = x^u_{i}\), and \({\mathbf{p}}^u_{i}(2)=y^u_{i}\). The specific form of each element is as follows:
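Independently of the closed-form elements of \({\mathbf{G}}_{ij}\), the dependence of the FIM on the UAV position can be illustrated numerically under a simplifying assumption. The sketch below is not the paper's derivation: it assumes a planar range-bearing measurement with a constant diagonal covariance (the paper's \({{\mathbf{C}}_{ij}}\) depends on the geometry, which is ignored here), in which case the Gaussian FIM with respect to the UAV position reduces to \(H^\mathrm{T}C^{-1}H\) with H the Jacobian of the measurement function. All function names and parameter values are hypothetical.

```python
import math


def range_bearing_jacobian(uav, target):
    """Jacobian of (range, bearing) with respect to the UAV position (x, y)."""
    dx = target[0] - uav[0]
    dy = target[1] - uav[1]
    r2 = dx * dx + dy * dy
    r = math.sqrt(r2)
    return [[-dx / r, -dy / r],    # d(range)/dx_u,   d(range)/dy_u
            [dy / r2, -dx / r2]]   # d(bearing)/dx_u, d(bearing)/dy_u


def fisher_information(uav, target, sigma_r, sigma_theta):
    """2x2 FIM H^T C^-1 H for constant C = diag(sigma_r^2, sigma_theta^2) (assumption)."""
    H = range_bearing_jacobian(uav, target)
    w = [1.0 / sigma_r ** 2, 1.0 / sigma_theta ** 2]
    return [[sum(H[k][m] * w[k] * H[k][n] for k in range(2)) for n in range(2)]
            for m in range(2)]


# Hypothetical geometry: UAV at the origin, target at (3, 4), so the range is 5.
J = fisher_information((0.0, 0.0), (3.0, 4.0), sigma_r=1.0, sigma_theta=0.1)
print(J)
print(J[0][0] + J[1][1])  # trace of the FIM
```

Under this constant-covariance assumption the trace equals \(1/\sigma _r^2 + 1/(r^2\sigma _\theta ^2)\), so the information grows as the UAV approaches the target, which is the kind of position dependence the bidding step exploits.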
Cite this article
Zhao, Y., Wang, X., Wang, C. et al. Systemic design of distributed multi-UAV cooperative decision-making for multi-target tracking. Auton Agent Multi-Agent Syst 33, 132–158 (2019). https://doi.org/10.1007/s10458-019-09401-5
DOI: https://doi.org/10.1007/s10458-019-09401-5