Abstract
This paper proposes a novel, highly scalable sampling-based planning algorithm for multi-robot active information acquisition tasks in complex environments. Active information gathering scenarios include target localization and tracking, active SLAM, surveillance, and environmental monitoring, among others. The objective is to compute control policies for sensing robots that minimize the accumulated uncertainty of a dynamic hidden state over an a priori unknown horizon. To address this problem, we propose a new sampling-based algorithm that simultaneously explores both the robot motion space and the reachable information space. Unlike related sampling-based approaches, we show that the proposed algorithm is probabilistically complete, asymptotically optimal, and supported by convergence rate bounds. Moreover, we propose a novel biased sampling strategy that steers exploration towards informative areas. This allows the proposed method to quickly compute sensor policies that achieve desired levels of uncertainty in large-scale estimation tasks that may involve large sensor teams, workspaces, and hidden-state dimensions. Extensions of the proposed algorithm that account for hidden states with no prior information are discussed. We provide extensive simulation results that corroborate the theoretical analysis and show that the proposed algorithm can address large-scale estimation tasks that are computationally challenging for existing methods.
Notes
Throughout the paper, when it is clear from the context, we drop the dependence of \({\mathbf{q}}(t)\) on t.
Note that the horizon F and \({\mathbf {u}}_{0:F}\) returned by Algorithm 1 depend on n. For simplicity of notation, we drop this dependence.
References
Atanasov, N., Le Ny, J., Daniilidis, K., & Pappas, G. J., (2014). Information acquisition with sensing robots: Algorithms and error bounds. In IEEE International Conference on Robotics and Automation (pp. 6447–6454), Hong Kong, China. URL https://ieeexplore.ieee.org/document/6907811.
Atanasov, N., Le Ny, J., Daniilidis, K., & Pappas, G. J. (2015a) Decentralized active information acquisition: Theory and application to multi-robot SLAM. In IEEE International Conference on Robotics and Automation (pp. 4775–4782), Seattle, WA. URL https://ieeexplore.ieee.org/document/7139863.
Atanasov, N. A., Le Ny, J., & Pappas, G. J. (2015b). Distributed algorithms for stochastic source seeking with mobile robot networks. Journal of Dynamic Systems, Measurement, and Control, 137(3), 031004.
Bai, S., Chen, F., & Englot, B. (2017). Toward autonomous mapping and exploration for mobile robots through deep supervised learning. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 2379–2384).
Bennetts, V. M. H., Lilienthal, A. J., Khaliq, A. A., Sese, V. P., & Trincavelli, M. (2013). Towards real-world gas distribution mapping and leak localization using a mobile robot with 3d and remote gas sensing capabilities. In IEEE International Conference on Robotics and Automation (pp. 2335–2340), Karlsruhe, Germany. URL https://ieeexplore.ieee.org/document/6630893.
Bircher, A., Alexis, K., Schwesinger, U., Omari, S., Burri, M., & Siegwart, R. (2017). An incremental sampling-based approach to inspection planning: the rapidly exploring random tree of trees. Robotica, 35(6), 1327.
Bowman, S. L., Atanasov, N., Daniilidis, K., & Pappas, G. J. (2017). Probabilistic data association for semantic SLAM. In IEEE International Conference on Robotics and Automation (ICRA) (pp. 1722–1729).
Charrow, B., Kumar, V., & Michael, N. (2014). Approximate representations for multi-robot control policies that maximize mutual information. Autonomous Robots, 37(4), 383–400. URL https://link.springer.com/article/10.1007/s10514-014-9411-2.
Chen, F., Wang, J., Shan, T., & Englot, B. (2019). Autonomous exploration under uncertainty via graph convolutional networks. In Proceedings of the International Symposium on Robotics Research.
Corah, M., & Michael, N. (2018). Distributed submodular maximization on partition matroids for planning on large sensor networks. In IEEE Conference on Decision and Control (pp. 6792–6799), Miami Beach, FL, December 2018.
Dames, P., Schwager, M., Kumar, V., & Rus, D. (2012). A decentralized control policy for adaptive information gathering in hazardous environments. In IEEE 51st Annual Conference on Decision and Control (CDC) (pp. 2807–2813). IEEE.
Dames, P., Tokekar, P., & Kumar, V. (2017). Detecting, localizing, and tracking an unknown number of moving targets using a team of mobile robots. The International Journal of Robotics Research, 36(13–14), 1540–1553.
Freundlich, C., Lee, S., & Zavlanos, M. M. (2017). Distributed active state estimation with user-specified accuracy. IEEE Transactions on Automatic Control, 63(2), 418–433.
Furrer, F., Burri, M., Achtelik, M., & Siegwart, R. (2016). RotorS—A modular Gazebo MAV simulator framework. In Robot Operating System (ROS) (pp. 595–625). Springer, Cham. ISBN 978-3-319-26054-9. URL https://doi.org/10.1007/978-3-319-26054-9_23.
Graham, R., & Cortés, J. (2008). A cooperative deployment strategy for optimal sampling in spatiotemporal estimation. In 47th IEEE Conference on Decision and Control (pp. 2432–2437). URL https://ieeexplore.ieee.org/document/4739085.
Grimmett, G., & Stirzaker, D. (2001). Probability and random processes. Oxford: Oxford University Press.
Hollinger, G. A., & Sukhatme, G. S. (2014). Sampling-based robotic information gathering algorithms. The International Journal of Robotics Research, 33(9), 1271–1287. URL https://journals.sagepub.com/doi/abs/10.1177/0278364914533443.
Huang, G., Zhou, K., Trawny, N., & Roumeliotis, S. I. (2015). A bank of maximum a posteriori (MAP) estimators for target tracking. IEEE Transactions on Robotics, 31(1), 85–103.
Jadidi, M. G., Miro, J. V., & Dissanayake, G. (2018). Gaussian processes autonomous mapping and exploration for range-sensing mobile robots. Autonomous Robots, 42(2), 273–290.
Kantaros, Y., Schlotfeldt, B., Atanasov, N., & Pappas, G. J. (2019). Asymptotically optimal planning for non-myopic multi-robot information gathering. In Robotics: Science and Systems, Freiburg, Germany.
Karaman, S., & Frazzoli, E. (2011). Sampling-based algorithms for optimal motion planning. The International Journal of Robotics Research, 30(7), 846–894.
Khodayi-mehr, R., Kantaros, Y., & Zavlanos, M. M. (2018). Distributed state estimation using intermittently connected robot networks. arXiv preprint arXiv:1805.01574.
Kumar, V., Rus, D., & Singh, S. (2004). Robot and sensor networks for first responders. IEEE Pervasive computing, 3(4), 24–33.
Lan, X., & Schwager, M. (2016). Rapidly exploring random cycles: Persistent estimation of spatiotemporal fields with multiple sensing robots. IEEE Transactions on Robotics, 32(5), 1230–1244.
LaValle, S. M. (2006). Planning algorithms. Cambridge: Cambridge University Press.
Le Ny, J., & Pappas, G. J. (2009). On trajectory optimization for active sensing in Gaussian process models. In Proceedings of the 48th IEEE Conference on Decision and Control, held jointly with the 28th Chinese Control Conference (pp. 6286–6292), Shanghai, China. URL https://ieeexplore.ieee.org/document/5399526.
Leung, K., Barfoot, T. D., & Liu, H. (2012). Decentralized cooperative slam for sparsely-communicating robot networks: A centralized-equivalent approach. Journal of Intelligent and Robotic Systems, 66(3), 321–342.
Levine, D., Luders, B., & How, J. P. (2010). Information-rich path planning with general constraints using rapidly-exploring random trees. In AIAA Infotech at Aerospace Conference, Atlanta, GA. URL https://dspace.mit.edu/handle/1721.1/60027.
Li, B., Wang, Y., Zhang, Y., Zhao, W., Ruan, J., & Li, P. (2020). GP-SLAM: Laser-based SLAM approach based on regionalized Gaussian process map reconstruction. Autonomous Robots (pp. 1–21).
Lu, Q., & Han, Q. (2018). Mobile robot networks for environmental monitoring: A cooperative receding horizon temporal logic control approach. IEEE Transactions on Cybernetics. URL https://ieeexplore.ieee.org/document/8540323.
Martínez, S., & Bullo, F. (2006). Optimal sensor placement and motion coordination for target tracking. Automatica, 42(4), 661–668.
Meyer, F., Wymeersch, H., Fröhle, M., & Hlawatsch, F. (2015). Distributed estimation with information-seeking control in agent networks. IEEE Journal on Selected Areas in Communications, 33(11), 2439–2456.
Michael, N., Zavlanos, M. M., Kumar, V., & Pappas, G. J. (2008). Distributed multi-robot task assignment and formation control. In IEEE International Conference on Robotics and Automation, (pp. 128–133), Pasadena, CA. URL https://ieeexplore.ieee.org/document/4543197.
Obermeyer K. J., & Contributors. (2008). The VisiLibity library. http://www.VisiLibity.org. Release 1.
Reinhart, R., Dang, T., Hand, E., Papachristos, C., & Alexis, K. (2020). Learning-based path planning for autonomous exploration of subterranean environments. In IEEE International Conference on Robotics and Automation (ICRA) (pp. 1215–1221), Paris, France.
Schlotfeldt, B., Thakur, D., Atanasov, N., Kumar, V., & Pappas, G. J. (2018). Anytime planning for decentralized multirobot active information gathering. IEEE Robotics and Automation Letters, 3(2), 1025–1032.
Schlotfeldt, B., Atanasov, N., & Pappas, G. J. (2019). Maximum information bounds for planning active sensing trajectories. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 4913–4920), Macao, November 2019.
Singh, A., Krause, A., Guestrin, C., & Kaiser, W. J. (2009). Efficient informative sensing using multiple robots. Journal of Artificial Intelligence Research, 34, 707–755. URL https://www.jair.org/index.php/jair/article/view/10602
Solovey, K., Janson, L., Schmerling, E., Frazzoli, E., & Pavone, M. (2020). Revisiting the asymptotic optimality of RRT. In IEEE International Conference on Robotics and Automation (ICRA) (pp. 2189–2195), Paris, France, May 31 - August 31 2020.
Srinivasa, S. S., Barfoot, T. D., & Gammell, J. D. (2020). Batch Informed Trees (BIT*): Informed asymptotically optimal anytime search. The International Journal of Robotics Research, 39(5), 543–567.
Turpin, M., Michael, N., & Kumar, V. (2014). CAPT: Concurrent assignment and planning of trajectories for multiple robots. The International Journal of Robotics Research, 33(1), 98–112.
Yamauchi, B. (1997). A frontier-based approach for autonomous exploration. In IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA) (pp. 146–151).
Funding
This work was supported by the ARL Grant DCIST CRA W911NF-17-2-0181.
Appendix: proof of completeness, optimality, and convergence
In what follows, we denote by \({\mathcal {G}}^n=\{{\mathcal {V}}^{n}, {\mathcal {E}}^{n}, J\}\) the tree that has been built by Algorithm 1 at the n-th iteration. The same notation also extends to \(f_{{\mathcal V}}\), \(f_{{\mathcal U}}\), and \({\mathbf {u}}_{\text {new}}\). To prove Theorems 1 and 2, we first establish the following intermediate results.
Lemma 1
(Sampling \({\mathcal V}_{k_{\text {rand}}}^n\)) Consider any subset \({\mathcal V}_{k}^n\), for any fixed iteration index n and any fixed \(k\in \{1,\dots ,K_n\}\). Then, there exists an infinite number of subsequent iterations \(n+w\), where \(w\in {\mathcal {W}}\) and \({\mathcal {W}}\subseteq {\mathbb {N}}\) is a subsequence of \({\mathbb {N}}\), at which the subset \({\mathcal V}_{k}^n\) is selected to be the set \({\mathcal V}_{k_{\text {rand}}}^{n+w}\).
Proof
Let \(A^{\text {rand},n+w}(k)=\{{\mathcal V}_{k_{\text {rand}}}^{n+w}={\mathcal V}_{k}^n\}\), with \(w\in {\mathbb {N}}\), denote the event that at iteration \(n+w\) of Algorithm 1 the subset \({\mathcal V}_{k}^n\subseteq {\mathcal V}^n\) is selected by the sampling operation to be the set \({\mathcal V}_{k_{\text {rand}}}^{n+w}\) [line 2, Alg. 1]. Also, let \({\mathbb {P}}(A^{\text {rand},n+w}(k))\) denote the probability of this event, i.e., \({\mathbb {P}}(A^{\text {rand},n+w}(k))=f_{{\mathcal V}}^{n+w}(k|{\mathcal V}^{n+w})\).
Next, define the infinite sequence of events \(A^{\text {rand}}=\{A^{\text {rand},n+w}(k)\}_{w=0}^{\infty }\), for a given subset \({\mathcal V}_{k}^n\subseteq {\mathcal V}^n\). In what follows, we show that the series
\[ \sum _{w=0}^{\infty }{\mathbb {P}}(A^{\text {rand},n+w}(k)) \]
diverges and then complete the proof by applying the Borel-Cantelli lemma (Grimmett & Stirzaker, 2001).
Recall that by Assumption 2(i), given any subset \({\mathcal V}_{k}^n\subseteq {\mathcal V}^n\), the probability \(f_{{\mathcal V}}^n(k|{\mathcal V}^n)\) satisfies \(f_{{\mathcal V}}^n(k|{\mathcal V}^n)\ge \epsilon \) at every iteration n. Thus, we have that \({\mathbb {P}}(A^{\text {rand},n+w}(k))=f_{{\mathcal V}}^{n+w}(k|{\mathcal V}^{n+w})\ge \epsilon >0\) for all \(w\in {\mathbb {N}}\). Note that this result holds for any \(k\in \{1,\dots ,K_{n+w}\}\) due to Assumption 2(i). Therefore, \(\sum _{w=0}^{\infty }{\mathbb {P}}(A^{\text {rand},n+w}(k))\ge \sum _{w=0}^{\infty }\epsilon \). Since \(\epsilon \) is a strictly positive constant, \(\sum _{w=0}^{\infty }\epsilon \) diverges, and we conclude that \(\sum _{w=0}^{\infty }{\mathbb {P}}(A^{\text {rand},n+w}(k))=\infty \). Combining this result with the fact that the events \(A^{\text {rand},n+w}(k)\) are independent by Assumption 2(ii), the Borel-Cantelli lemma yields \({\mathbb {P}}(\limsup _{w\rightarrow \infty } A^{\text {rand},n+w}(k))=1\). In other words, the events \(A^{\text {rand},n+w}(k)\) occur infinitely often, for all \(k\in \{1,\dots ,K_n\}\). Equivalently, for every subset \({\mathcal V}_{k}^n\subseteq {\mathcal V}^n\) and all \(n\in {\mathbb {N}}_{+}\), there exists an infinite subsequence \({\mathcal {W}}\subseteq {\mathbb {N}}\) so that \({\mathcal V}_{k_{\text {rand}}}^{n+w}={\mathcal V}_{k}^{n}\) for all \(w\in {\mathcal {W}}\), completing the proof. \(\square \)
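The divergence argument in Lemma 1 has a simple empirical counterpart: if every subset is selected with probability bounded below by a constant \(\epsilon >0\) at every iteration, then every subset keeps being selected as the number of iterations grows. The following sketch (with hypothetical sampling weights, not the paper's implementation of \(f_{{\mathcal V}}\)) simulates such a biased-but-bounded sampling distribution and checks that each of the K subsets is selected a non-vanishing fraction of the time.

```python
import random

def sample_subsets(weights, n_iters, seed=0):
    """Draw subset indices from a fixed categorical distribution.

    Each index k is drawn with probability weights[k]; counts[k]
    records how often subset k was selected (the event A^{rand}(k))."""
    rng = random.Random(seed)
    counts = [0] * len(weights)
    for _ in range(n_iters):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        counts[k] += 1
    return counts

# Hypothetical biased sampling distribution: every subset has
# probability >= epsilon = 0.05, mirroring Assumption 2(i).
weights = [0.55, 0.20, 0.10, 0.10, 0.05]
N = 20000
counts = sample_subsets(weights, N)
print(counts)

# Even the least likely subset is selected on the order of
# epsilon * N times, consistent with the Borel-Cantelli argument.
assert all(c > 0.5 * w * N for c, w in zip(counts, weights))
```

The same simulation applies verbatim to the control-input sampling of Lemma 2, with \(\zeta\) in place of \(\epsilon\).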
Lemma 2
(Sampling \({\mathbf {u}}_{\text {new}}\)) Consider any subset \({\mathcal V}_{k_{\text {rand}}}^n\) selected by \(f_{{\mathcal V}}\) and any fixed iteration index n. Then, for any given control input \({\mathbf {u}}\in {\mathcal U}\), there exists an infinite number of subsequent iterations \(n+w\), where \(w\in {\mathcal {W}}'\) and \({\mathcal {W}}'\subseteq {\mathcal {W}}\) is a subsequence of the sequence \({\mathcal {W}}\) defined in Lemma 1, at which the control input \({\mathbf {u}}\in {\mathcal U}\) is selected to be \({\mathbf {u}}_{\text {new}}^{n+w}\).
Proof
Define the infinite sequence of events \(A^{\text {new}}=\{A^{\text {new},n+w}({\mathbf {u}})\}_{w=0}^{\infty }\), for \({\mathbf {u}}\in {\mathcal U}\), where \(A^{\text {new},n+w}({\mathbf {u}})=\{{\mathbf {u}}_{\text {new}}^{n+w}={\mathbf {u}}\}\), for \(w\in {\mathbb {N}}\), denotes the event that at iteration \(n+w\) of Algorithm 1 the control input \({\mathbf {u}}\in {\mathcal U}\) is selected by the sampling function to be the input \({\mathbf {u}}_{\text {new}}^{n+w}\), given the subset \({\mathcal V}_{k_{\text {rand}}}^{n+w}={\mathcal V}_{k_{\text {rand}}}^{n}\). Moreover, let \({\mathbb {P}}(A^{\text {new},n+w}({\mathbf {u}}))\) denote the probability of this event, i.e., \({\mathbb {P}}(A^{\text {new},n+w}({\mathbf {u}}))=f_{{\mathcal U}}^{n+w}({\mathbf {u}}| {\mathcal V}_{k_{\text {rand}}}^{n+w})\). Now, consider those iterations \(n+w\) with \(w\in {\mathcal {W}}\) such that \(k_{\text {rand}}^{n+w}=k_{\text {rand}}^n\) by Lemma 1. We will show that the series
\[ \sum _{w\in {\mathcal {W}}}{\mathbb {P}}(A^{\text {new},n+w}({\mathbf {u}})) \]
diverges and then use the Borel-Cantelli lemma to show that any given \({\mathbf {u}}\in {\mathcal U}\) will be selected infinitely often to be the control input \({\mathbf {u}}_{\text {new}}^{n+w}\). By Assumption 3(i), \({\mathbb {P}}(A^{\text {new},n+w}({\mathbf {u}}))=f_{{\mathcal U}}^{n+w}({\mathbf {u}}| {\mathcal V}_{k_{\text {rand}}}^{n+w})\) is bounded below by a strictly positive constant \(\zeta >0\) for all \(w\in {\mathcal W}\). Therefore, \(\sum _{w\in {\mathcal W}}{\mathbb {P}}(A^{\text {new},n+w}({\mathbf {u}}))\) diverges, since it is an infinite sum of a strictly positive constant term. Using this result along with the fact that the events \(A^{\text {new},n+w}({\mathbf {u}})\) are independent, by Assumption 3(ii), the Borel-Cantelli lemma yields \({\mathbb {P}}(\limsup _{w\rightarrow \infty } A^{\text {new},n+w}({\mathbf {u}}))=1\). In words, the events \(A^{\text {new},n+w}({\mathbf {u}})\), \(w\in {\mathcal {W}}\), occur infinitely often. Thus, given any subset \({\mathcal V}_{k_{\text {rand}}}^n\), for every control input \({\mathbf {u}}\) and for all \(n\in {\mathbb {N}}_{+}\), there exists an infinite subsequence \({\mathcal {W}}' \subseteq {\mathcal {W}}\) so that \({\mathbf {u}}_{\text {new}}^{n+w}={\mathbf {u}}\) for all \(w\in {\mathcal {W}}'\), completing the proof. \(\square \)
Before stating the next result, we define the reachable state space of a state \({\mathbf{q}}(t)=[{\mathbf{p}}(t),\Sigma (t)]\in {\mathcal V}_{k}^n\), denoted by \({\mathcal R}({\mathbf{q}}(t))\), which collects all states \({\mathbf{q}}(t+1)=[{\mathbf{p}}(t+1), \Sigma (t+1)]\) that can be reached within one time step from \({\mathbf{q}}(t)\).
Corollary 1
(Reachable set \({\mathcal R}({\mathbf{q}}(t))\)) Given any state \({\mathbf{q}}(t)=[{\mathbf{p}}(t),\Sigma (t)]\in {\mathcal V}_{k}^n\), for any \(k\in \{1,\dots ,K_n\}\), all states that belong to the reachable set \({\mathcal R}({\mathbf{q}}(t))\) will be added to \({\mathcal V}^{n+w}\) by Algorithm 1, with probability 1, as \(w\rightarrow \infty \), i.e., \(\lim _{w\rightarrow \infty } {\mathbb {P}}\left( \{{\mathcal R}({\mathbf{q}}(t))\subseteq {\mathcal {V}}^{n+w}\}\right) =1.\) Also, edges from \({\mathbf{q}}(t)\) to all reachable states \({\mathbf{q}}'(t+1)\in {\mathcal R}({\mathbf{q}}(t))\) will be added to \({\mathcal E}^{n+w}\), with probability 1, as \(w\rightarrow \infty \), i.e., \(\lim _{w\rightarrow \infty } {\mathbb {P}}\left( \{\cup _{{\mathbf{q}}'\in {\mathcal R}({\mathbf{q}})}({\mathbf{q}},{\mathbf{q}}')\subseteq {\mathcal {E}}^{n+w}\}\right) =1.\)
Proof
The proof straightforwardly follows from Lemmas 1-2 and is omitted. \(\square \)
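To make the reachable set \({\mathcal R}({\mathbf{q}}(t))\) concrete, the sketch below enumerates it for a toy single-robot instance with a scalar hidden state. All of the dynamics, process noise, control set, and range-dependent sensor model here are illustrative assumptions, not the paper's setup: the robot moves on a line in unit steps, and the covariance component of \({\mathbf{q}}=[{\mathbf{p}},\Sigma ]\) evolves by a scalar Kalman predict/update whose measurement noise grows with the distance to a target at the origin.

```python
def reachable_set(p, sigma, controls=(-1, 0, 1), q_proc=0.1):
    """One-step reachable set R(q) of the joint state q = [p, Sigma].

    For each admissible control u, the robot moves to p + u and the
    estimation covariance is propagated by a scalar Kalman predict
    (Sigma + q_proc) followed by an update with position-dependent
    measurement noise r(p') = 0.5 + p'**2 (an illustrative model)."""
    successors = []
    for u in controls:
        p_next = p + u
        prior = sigma + q_proc                      # predict step
        r = 0.5 + p_next ** 2                       # noise grows with range
        sigma_next = 1.0 / (1.0 / prior + 1.0 / r)  # scalar Kalman update
        successors.append((p_next, sigma_next))
    return successors

R = reachable_set(p=2, sigma=1.0)
for p_next, s_next in R:
    print(p_next, round(s_next, 4))
```

In this toy model, moving toward the target (smaller measurement noise) yields the smallest posterior covariance, which is exactly the kind of informative successor the biased sampling strategy favors.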
Proof of Theorem 1
By construction of the path \({\mathbf{q}}_{0:F}\), it holds that \({\mathbf{q}}(f)\in {\mathcal R}({\mathbf{q}}(f-1))\), for all \(f\in \{1,\dots ,F\}\). Since \({\mathbf{q}}(0)\in {\mathcal V}^1\), all states \({\mathbf{q}}\in {\mathcal R}({\mathbf{q}}(0))\), including the state \({\mathbf{q}}(1)\), will be added to \({\mathcal V}^n\) with probability 1, as \(n\rightarrow \infty \), due to Corollary 1. Once this happens, the edge \(({\mathbf{q}}(0),{\mathbf{q}}(1))\) will be added to the set of edges \({\mathcal E}^n\), also due to Corollary 1. Applying Corollary 1 inductively, we get that \(\lim _{n\rightarrow \infty } {\mathbb {P}}\left( \{{\mathbf{q}}(f)\in {\mathcal {V}}^{n}\}\right) =1\) and \(\lim _{n\rightarrow \infty } {\mathbb {P}}\left( \{({\mathbf{q}}(f-1), {\mathbf{q}}(f))\in {\mathcal {E}}^{n}\}\right) =1\), for all \(f\in \{1,\dots ,F\}\), meaning that the path \({\mathbf{q}}_{0:F}\) will be added to the tree \({\mathcal G}^n\) with probability 1 as \(n\rightarrow \infty \), completing the proof. \(\square \)
Proof of Theorem 2
The proof of this result straightforwardly follows from Theorem 1. Specifically, recall from Theorem 1 that Algorithm 1 can find any feasible path and, therefore, the optimal path as well, with probability 1, as \(n\rightarrow \infty \), completing the proof. \(\square \)
Proof of Theorem 3
To prove this result, we model the sampling strategy employed by Algorithm 1 as a Poisson binomial process. Specifically, we define Bernoulli random variables \(Y_n\) at every iteration n of Algorithm 1 so that \(Y_n=1\) only if the edge \(({\mathbf{q}}(f-1), {\mathbf{q}}(f))\) is added to the tree at iteration n, where f is the smallest element of the set \(\{1,\dots ,F\}\) that satisfies \({\mathbf{q}}(f-1)\in {\mathcal V}^{n-1}\) and \({\mathbf{q}}(f)\notin {\mathcal V}^{n-1}\). Then, using the random variables \(Y_n\), we define the random variable \(Y=\sum _{n=1}^{n_{\text {max}}}Y_n\), which captures the total number of successes of the random variables \(Y_n\), and we show that it follows a Poisson binomial distribution. Finally, we show that \( {\mathbb {P}}(A^{n_{\text {max}}}({\mathbf{q}}_{0:F}^*))={\mathbb {P}}(Y\ge F)\), which yields (8) by applying the Chernoff bound to Y. The details follow.
Let \(X_f^n\), for all \(f\in \{1,\dots ,F\}\), denote a Bernoulli random variable associated with iteration n of Algorithm 1 that is equal to 1 if the edge \(({\mathbf{q}}_{f-1},{\mathbf{q}}_f)\) in \({\mathbf{q}}_{0:F}^*\) is added to the tree at iteration n, or has already been added to the tree at a previous iteration \(m<n\), and is 0 otherwise. Observe that, since the tree is rooted at \({\mathbf{q}}_0\), it holds that \({\mathbf{q}}_0\in {\mathcal V}^n\) and, therefore, \({\mathbb {P}}(X_1^n=1)>0\), for all \(n\ge 1\). By construction of the sampling strategy, the probability that \(X_f^n=1\) is
\[ {\mathbb {P}}(X_f^n=1)=f_{{\mathcal V}}^n(k_{f-1}|{\mathcal V}^n)\,f_{{\mathcal U}}^n({\mathbf {u}}_{f-1\rightarrow f}|{\mathcal V}_{k_{f-1}}^{n}), \quad (17) \]
where \({\mathbf{q}}_{f-1}\in {\mathcal V}_{k_{f-1}}^n\) and, with slight abuse of notation, \({\mathbf {u}}_{f-1\rightarrow f}\in {\mathcal U}\) stands for the control input that steers the robots from \({\mathbf{q}}_{f-1}\) to \({\mathbf{q}}_{f}\). Note that such a control input exists since \({\mathbf{q}}_f\in {\mathcal R}({\mathbf{q}}_{f-1})\) by definition of the path \({\mathbf{q}}_{0:F}^*\), where \({\mathcal R}(\cdot )\) denotes the reachable set. Observe that if \({\mathbf{q}}_{f-1}\notin {\mathcal V}^n\), then \({\mathbb {P}}(X_f^n=1)=0\), since \(f_{{\mathcal V}}^n(k_{f-1}|{\mathcal V}^n)=0\). Moreover, if the edge \(({\mathbf{q}}_{f-1},{\mathbf{q}}_f)\) already belongs to \({\mathcal E}^n\) from a previous iteration \(m<n\) of Algorithm 1, then it trivially holds that \({\mathbb {P}}(X_f^n=1)=1\).
Given the random variables \(X_f^n\), we define the discrete random variable \(Y_n\), initialized as \(Y_1=X_1^1\) and, for every subsequent iteration \(n>1\), defined as
\[ Y_n = \begin{cases} X_f^n, & \text {if } Y_{n-1}=X_f^{n-1}=0,\\ X_{f+1}^n, & \text {if } Y_{n-1}=X_f^{n-1}=1 \text { and } f+1\le F,\\ X_F^n, & \text {if } Y_{n-1}=X_f^{n-1}=1 \text { and } f+1>F. \end{cases} \quad (18) \]
In words, \(Y_n\) is defined exactly as \(Y_{n-1}\), i.e., \(Y_n=X_f^n\), if \(Y_{n-1}=X_{f}^{n-1}=0\), i.e., if the edge \(({\mathbf{q}}_{f-1},{\mathbf{q}}_f)\) associated with the random variable \(Y_{n-1}=X_f^{n-1}\) does not yet exist in the tree at iteration \(n-1\); see the first case in (18). Also, \(Y_n=X_{f+1}^n\) if \(Y_{n-1}=X_{f}^{n-1}=1\), i.e., if the edge \(({\mathbf{q}}_{f-1},{\mathbf{q}}_f)\) was added to the tree at the previous iteration \(n-1\); see the second case in (18). If \(f+1>F\) and \(X_f^{n-1}=1\), then we define \(Y_n=X_F^n\); see the last case in (18). Note that in this case \(Y_n\) can be defined arbitrarily, i.e., \(Y_n=X_{{\bar{f}}}^n\) for any \({\bar{f}}\in \{1,\dots ,F\}\), since \(f+1>F\) and \(X_f^{n-1}=1\) mean that all edges that appear in \({\mathbf{q}}_{0:F}^*\) have already been added to \({\mathcal E}^n\); by convention, we define \(Y_n=X_F^n\). Since \(Y_n\) is equal to \(X_f^n\) for some \(f\in \{1,\dots ,F\}\), as per (18), for all \(n\ge 1\), we get that \(Y_n\) also follows a Bernoulli distribution with parameter (probability of success) \(p^{\text {suc}}_n\) equal to the probability of success of \(X_f^n\) defined in (17), i.e.,
\[ p^{\text {suc}}_n={\mathbb {P}}(X_f^n=1), \]
where the index f is determined as per (18).
Given the random variables \(Y_n\), \(n\in \{1,\dots ,n_{\text {max}}\}\), we define the discrete random variable Y as
\[ Y=\sum _{n=1}^{n_{\text {max}}}Y_n. \quad (19) \]
Observe that Y captures the total number of successes of the random variables \(Y_n\) after \(n_{\text {max}}>0\) iterations, i.e., if \(Y=y\), \(y\in \{1,\dots ,n_{\text {max}}\}\), then \(Y_n=1\) for exactly y random variables \(Y_n\). Therefore, if \(Y\ge F\), then all edges that appear in the path \({\mathbf{q}}_{0:F}^*\) have been added to the tree, by definition of the random variables \(Y_n\) and Y in (18) and (19), respectively. Therefore, we conclude that
\[ {\mathbb {P}}(A^{n_{\text {max}}}({\mathbf{q}}_{0:F}^*))={\mathbb {P}}(Y\ge F). \quad (20) \]
In what follows, our goal is to compute the probability \({\mathbb {P}}(Y\ge F)\). Observe that Y is defined as a sum of Bernoulli random variables \(Y_n\) that are not identically distributed, as their probabilities of success \(p^{\text {suc}}_n\) are not fixed across the iterations n, since the definition of \(Y_n\) changes at every iteration n as per (18). Therefore, Y follows a Poisson binomial distribution, whose probability mass function is complicated to evaluate and numerically unstable for large n. Therefore, instead of computing \({\mathbb {P}}(Y\ge F)\) exactly, we compute a lower bound on \({\mathbb {P}}(Y\ge F)\) by applying the Chernoff bound to Y.
Specifically, we have that
\[ {\mathbb {P}}(Y\ge F)=1-{\mathbb {P}}(Y<F)\ge 1-{\mathbb {P}}(Y\le (1-\delta )\mu )\ge 1-e^{-\frac{\mu \delta ^2}{2}}, \quad (21) \]
where \(\mu \) is the mean value of Y, defined as \(\mu =\sum _{n=1}^{n_{\text {max}}} p^{\text {suc}}_n\). Also, the last inequality in (21) is due to the Chernoff bound on the lower tail of Y and holds for any \(\delta \in (0,1)\). Observe that the Chernoff bound can be applied to Y, as it is defined as the sum of independent Bernoulli random variables \(Y_n\). Specifically, the random variables \(Y_n\) are independent since independent samples \({\mathbf{q}}\) can be generated by the proposed sampling process, specified by the density functions \(f_{{\mathcal V}}\) and \(f_{{\mathcal U}}\), due to Assumptions 2(ii) and 3(ii). Substituting \(\delta =1-\frac{F}{\mu }\) in (21), we get
\[ {\mathbb {P}}(Y\ge F)\ge 1-e^{-\frac{\mu }{2}\left( 1-\frac{F}{\mu }\right) ^2}=1-e^{-\frac{\mu -2F}{2}}e^{-\frac{F^2}{2\mu }}\ge 1-e^{-\frac{\mu -2F}{2}}, \quad (22) \]
where the last inequality is due to \(e^{-\frac{F^2}{2\mu }}\le 1\). Recall that (22) holds for any \(\delta =1-\frac{F}{\mu }\in (0,1)\), i.e., for any \(n_{\text {max}}\) that satisfies
\[ 0<F<\mu =\sum _{n=1}^{n_{\text {max}}} p^{\text {suc}}_n\le n_{\text {max}}, \quad (23) \]
where the last inequality in (23) is due to \(p^{\text {suc}}_n\le 1\). Therefore, (22) holds as long as \(n_{\text {max}}>F\).
Note also that the inequality \(0<F<\sum _{n=1}^{n_{\text {max}}} p^{\text {suc}}_n\) in (23) can indeed be satisfied, since \(p^{\text {suc}}_n={\mathbb {P}}(X_f^n=1)\) is strictly positive for all \(n\ge 1\) by definition of \(Y_n\). To see this, observe that if \(Y_n=X_f^n\), for some \(f\in \{1,\dots ,F\}\), then it holds that \(({\mathbf{q}}_{f-2},{\mathbf{q}}_{f-1})\in {\mathcal E}^n\), by definition of \(Y_n\) in (18), i.e., \({\mathbf{q}}_{f-1}\in {\mathcal V}^{n}\). Thus, we have that \(f_{{\mathcal V}}^n(k_{f-1}|{\mathcal V}^n)>0\) by Assumption 2(i). Also, \(f_{{\mathcal U}}^n({\mathbf {u}}_{f-1\rightarrow f}|{\mathcal V}_{k_{f-1}}^n)>0\) by Assumption 3(i). Therefore, we have that \(p^{\text {suc}}_n={\mathbb {P}}(X_f^n=1)>0\); see also (17).
Thus, we proved that there exist parameters \(\alpha _n({\mathbf{q}}_{0:F}^*)=p^{\text {suc}}_n\in (0,1]\), associated with every iteration n of Algorithm 1, such that the probability \({\mathbb {P}}(A^{n_{\text {max}}}({\mathbf{q}}_{0:F}^*))\) of finding the optimal path \({\mathbf{q}}_{0:F}^*\) within \(n_{\text {max}}>F\) iterations satisfies
\[ {\mathbb {P}}(A^{n_{\text {max}}}({\mathbf{q}}_{0:F}^*))\ge 1-e^{-\frac{\sum _{n=1}^{n_{\text {max}}}\alpha _n({\mathbf{q}}_{0:F}^*)-2F}{2}}, \]
completing the proof. \(\square \)
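The quantities in the proof of Theorem 3 can be checked numerically: the distribution of \(Y\) is Poisson binomial, whose tail \({\mathbb {P}}(Y\ge F)\) can be computed exactly by dynamic programming for moderate \(n_{\text {max}}\), and the Chernoff-based bound \(1-e^{-(\mu -2F)/2}\) must lie below it whenever \(\mu >F\). The sketch below uses arbitrary illustrative success probabilities \(p^{\text {suc}}_n\); it is a numerical sanity check of the bound, not part of the proof.

```python
import math

def poisson_binomial_tail(probs, F):
    """Exact P(Y >= F) for Y = sum of independent Bernoulli(probs[n]).

    pmf[k] holds P(Y = k) after processing a prefix of the trials;
    each new trial shifts one unit of probability mass on success."""
    pmf = [1.0]
    for p in probs:
        new = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            new[k] += mass * (1.0 - p)     # trial n fails
            new[k + 1] += mass * p         # trial n succeeds
        pmf = new
    return sum(pmf[F:])

def chernoff_lower_bound(probs, F):
    """Chernoff-based bound P(Y >= F) >= 1 - exp(-(mu - 2F)/2), mu > F."""
    mu = sum(probs)
    assert mu > F, "bound requires n_max large enough that mu > F"
    return 1.0 - math.exp(-(mu - 2.0 * F) / 2.0)

# Illustrative per-iteration success probabilities p_n^suc (assumed).
probs = [0.3] * 50
F = 5
exact = poisson_binomial_tail(probs, F)
bound = chernoff_lower_bound(probs, F)
print(round(exact, 4), round(bound, 4))
assert exact >= bound
```

The dynamic program runs in \(O(n_{\text {max}}^2)\) time, which is why the closed-form Chernoff bound, rather than the exact tail, is used in the statement of the theorem.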
Cite this article
Kantaros, Y., Schlotfeldt, B., Atanasov, N. et al. Sampling-based planning for non-myopic multi-robot information gathering. Auton Robot 45, 1029–1046 (2021). https://doi.org/10.1007/s10514-021-09995-4