
Competitive target search with multi-agent teams: symmetric and asymmetric communication constraints

Published in: Autonomous Robots

Abstract

We study a search game in which two multi-agent teams compete to find a stationary target at an unknown location. Each team plays a mixed strategy over the set of search sweep-patterns allowed from its respective random starting locations. Assuming that communication enables cooperation, we find closed-form expressions for the probability of winning the game as a function of team sizes and the existence or absence of communication within each team. Assuming the target is distributed uniformly at random, an optimal mixed strategy equalizes the expected first-visit time across all points within the search space. The benefits of communication-enabled cooperation increase with team size. Simulations and experiments agree well with analytical results.


Notes

  1. A discrete formulation can easily be obtained by replacing Lebesgue integrals over continuous spaces with summations over discrete sets, and reasoning about the probability of events directly instead of via probability density.

  2. Note that in \(\mathbb {T}^D\) this is technically the push-forward extension of the Lebesgue measure that one would naturally assume.

  3. Note that even for sensors with positive measure footprints, \({0< \mathscr {L}_{D-1}(B_r) < \infty }\) (e.g., a \(D\)-ball instead of a \((D\!-\!1)\)-ball) nearly all space is searched as the forward boundary of a sensor volume sweeps over it (in contrast to the space that is searched instantaneously at startup due to being within some agent’s sensor volume).

  4. This prevents “cheating” where an agent that continuously rotates through an uncountably infinite number of points is able to use its zero-measure sweep sensor as if it were a volumetric sensor of non-zero-measure (the measure of a countably infinite union of sweep footprints is still 0).

  5. This formulation is chosen because we are interested in scenarios where communication provides a significant coordination advantage. The topological constraints of \(\mathbb {S}^1\) significantly penalize an in-game redistribution from an initial i.i.d. space to an even spacing (that would otherwise be expected to facilitate maximum coordinated search). Although we do not explore it in this paper, an alternative game formulation in \({X= \mathbb {S}^1}\) would be to have all teams draw their start locations from \(\mathcal {D}_{X,\mathrm {fan}}\), regardless of their ability to communicate. In such a case, teams with communication can coordinate by having all members search in the directions expected to be most advantageous given their start locations.

  6. To guarantee that this number is finite, we require that the sensor footprint contains some convex subset of space. In other words, degenerate sensors of measure zero or fractal-like geometry might produce an infinite number of different sweep rates.

  7. That is, using the fact that \(G\) winning a game with n communicating agents versus m non-communicating adversaries is equivalent to \(G\) winning each of the m independent sub-games involving each different adversary (and vice versa). This is how the results from Sect. 4.1 were extended in Sects. 4.2 and 4.3.

  8. In other words, the scenario we consider is provably worse than a worst-case scenario. Thus, the bounds we derive are outside bounds on a worst-case scenario; and as a result, they are also outside bounds on the actual scenario. Note that we choose to use the worse than worst-case scenario because it is straightforward to analyze (unlike the worst-case scenario and the actual scenario).


Acknowledgements

We would like to thank Colin Ward, Corbin Wilhelmi, and Cyrus Vorwald for their help in facilitating the mixed platform experiments.

Author information

Corresponding author

Correspondence to Michael Otte.

Additional information

This work was performed at the Naval Research Laboratory and was funded by the Office of Naval Research under Grant Numbers N0001416WX01271 and N0001416WX01272. The views, positions and conclusions expressed herein reflect only the authors’ opinions and expressly do not reflect those of the Office of Naval Research, nor those of the Naval Research Laboratory.

Appendices

A Details of proofs leading up to Lemma 1

We now present technical details leading up to the proof of Lemma 1.

Given particular multipath strategies \(\psi _{G}\) and \(\psi _{A}\) for our team and the adversary, respectively, we can compute the ratio of space our team visits first by integrating the normalized rate new (to anybody) territory is swept by our team:

$$\begin{aligned} \frac{\mathscr {L}_{D}(X_{\mathrm {win}}(\psi _{G},\psi _{A}))}{\mathscr {L}_{D}(X)} = \int _{0}^{t_{\mathrm {final}}}\frac{d}{dt}\!\left[ \frac{\mathscr {L}_{D}(X_{\mathrm {new}}(t))}{\mathscr {L}_{D}(X)}\right] dt,\end{aligned}$$

where \({t_{\mathrm {final}}= \min (\frac{1}{n c^*} , \frac{1}{m c^*})}\). Thus, Eq. 1 can be reformulated for the Nash equilibrium of an ideal game with cooperation within both teams as:

where integrals are Lebesgue. Using the independence of the two teams' optimal mixed strategies, i.e., of \(\mathcal {D}_{\mathbf {g}_{0}}(\psi _{G})\) and \(\mathcal {D}_{\mathbf {a}_{0}}(\psi _{A})\) for all \(\mathbf {g}_{0}\) and \(\mathbf {a}_{0}\), yields:

We observe that the quantity inside the outermost integral describes the expected value of \({\frac{d}{dt}[\frac{\mathscr {L}_{D}(X_{\mathrm {new}}(t))}{\mathscr {L}_{D}(X)}]}\) over all \(\mathcal {S}_G\), \(\mathcal {S}_A\), \(\varPsi ^*_{G}\), and \(\varPsi ^*_{A}\). Recall that we denote the expected value of ‘\(\cdot \)’ over all \(\mathcal {S}_G\), \(\mathcal {S}_A\), \(\varPsi ^*_{G}\), and \(\varPsi ^*_{A}\) by \( {\mathbb {E^{*}}\!\left[ \cdot \right] \equiv \mathbb {E}_{\mathcal {S}_G, \mathcal {S}_A, \varPsi ^*_{G}, \varPsi ^*_{A}}\left[ \cdot \right] }. \) Thus, formally,

and

$$\begin{aligned} \mathbb {P}\left( \omega _{\mathrm {win}}^*\right) = \int _0^{t_{\mathrm {final}}} \mathbb {E^{*}}\!\left[ \frac{d}{dt} \frac{\mathscr {L}_{D}(X_{\mathrm {new}}(t))}{\mathscr {L}_{D}(X)} \right] dt\end{aligned}$$

Lemma 1

Assuming optimal ideal mixed strategies exist and both teams play an optimal ideal mixed strategy,

$$\begin{aligned} \mathbb {E^{*}}\!\left[ \frac{d}{dt} \frac{\mathscr {L}_{D}(X_{\mathrm {new}}(t))}{\mathscr {L}_{D}(X)} \right] = \left( 1 - tm c^*\right) n c^*. \end{aligned}$$

Proof

At time \(t\) the adversary (operating according to its own ideal optimal strategy) has swept a \(tm v\frac{\mathscr {L}_{D-1}(B_r)}{\mathscr {L}_{D}(X)}\) fraction of the entire search space. The interplay between the mixed ideal optimal strategies for each team forces the expected instantaneous overlap between teams to be uncorrelated. Thus, for all \(t\in [0, t_{\mathrm {final}}]\), the instantaneous expected rate at which team \(G\) sweeps \(\mathscr {L}_{D}(X_{\mathrm {new}})\) is discounted by a factor of \({1 - tm v\frac{\mathscr {L}_{D-1}(B_r)}{\mathscr {L}_{D}(X)}}\) versus \(\frac{d^*}{dt} \frac{\mathscr {L}_{D}(X_{\mathrm {new}}(t))}{\mathscr {L}_{D}(X)}\).

Substitution with Proposition 2 and Corollary 1 yields the desired result.\(\square \)
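As a sanity check (ours, not part of the original derivation), substituting Lemma 1 into the integral for \(\mathbb {P}(\omega _{\mathrm {win}}^*)\) in the symmetric case \({n = m}\), where \({t_{\mathrm {final}} = 1/(nc^*)}\), recovers the result that symmetry demands:

```latex
\mathbb{P}\left(\omega_{\mathrm{win}}^*\right)
  = \int_0^{1/(n c^*)} \left(1 - t\, n c^*\right) n c^* \, dt
  = \left[\, n c^* t - \tfrac{1}{2}\left(n c^*\right)^2 t^2 \,\right]_0^{1/(n c^*)}
  = 1 - \tfrac{1}{2}
  = \tfrac{1}{2}.
```

That is, two evenly matched communicating teams each win the ideal game with probability \(1/2\).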

B Details of analysis leading to Theorem 1

We now present details of our analysis leading up to and including the proof of Theorem 1.

Proposition 5

Given a uniform distribution of starting locations for all \(a\in A\) and target locations \(q\), and assuming an ideal mixed strategy is played by \(A\), then the distribution of times at which the target-sweeping adversary \(a_j\) sweeps the target is uniform on the interval \([0, 1/(m c^*)]\).

Let \(t_{A}\) be a realization of a uniformly random sample from \([0, 1/(m c^*)]\). Given our assumptions:

$$\begin{aligned} {\mathbb {P}\left( t_{A} < t\right) = \frac{t}{1/(m c^*)}}, \end{aligned}$$

where \({0 \le t\le 1/(m c^*)}\). Note that the probability density of \(t_{A}\) on \([0, 1/(m c^*)]\) is \(m c^*\), formally:

$$\begin{aligned} f_{t}(t) = m c^*, \quad t\in [0, 1/(m c^*)] \end{aligned}$$
(6)

Team \(G\) does not communicate and can be viewed as a confederation of n independent single-agent sub-teams \({\{g_1\}, \ldots \{g_n\}}\). Each subteam \(\{g_i\}\) plays a single agent ideal mixed strategy such that \(g_i\) sweeps (new to \(g_i\)) space at the rate \(c^*\) and so \(g_i\) requires \(1/c^*\) time to sweep the entire space by itself. This leads to the single-agent team counterpart to Proposition 5:

Proposition 6

Given a uniform distribution of starting locations for \(g_i\) and target locations \(q\), and assuming a single agent ideal mixed strategy is played by \(\{g_i\}\), then the distribution of times at which agent \(g_i\) sweeps the target is uniformly distributed on the interval \([0, 1/c^*]\).

Let \(t_{g,i}\) be a realization of a uniformly random sample from \([0, 1/ c^*]\). Given our assumptions on start locations, and assuming a single-agent ideal mixed strategy is played by \(\{g_i\}\), the probability \(g_i\) sweeps the target before time \(t\) is:

$$\begin{aligned} {\mathbb {P}\left( t_{g,i} < t\right) = \frac{t}{1/c^*}} \end{aligned}$$

where \({0 \le t\le 1/c^*}\).

The target is swept by team \(G\) as soon as it is swept by any agent \({g_i \in G}\), thus team \(G\) essentially gets to draw n i.i.d. uniformly random samples \({t_{g,1}, \ldots t_{g,n}}\) from \([0, 1/c^*]\) and play the best (smallest) of these versus the adversary team \(A\)’s single draw from \([0, 1/(m c^*)]\). Let \({t_{G} = \min _{i}(t_{g,i})}\) denote a realization of the smallest out of n values sampled uniformly at random and i.i.d. from \({[0, 1/c^*]}\).

The distribution of \(t_{G}\) can be determined directly using order statistics. The probability that at least one of the n (uncommunicating) subteams \({\{g_1\}, \ldots \{g_n\}}\) sweeps the target before time \(t\) is:

$$\begin{aligned} {\mathbb {P}\left( t_{G} < t\right) = 1 - \left( 1 - \frac{t}{1/c^*}\right) ^n} \end{aligned}$$
(7)
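Equation 7 is the standard CDF of the minimum of n i.i.d. uniform draws, and it is easy to verify empirically. The following sketch is our own illustration (function and variable names are ours), with \(c^*\) normalized to 1:

```python
import random

def min_sweep_cdf(t, n, c_star=1.0):
    """Eq. 7: probability that at least one of n independent single-agent
    sub-teams sweeps the target before time t, i.e., the CDF of the
    minimum of n i.i.d. uniform draws on [0, 1/c*]."""
    return 1 - (1 - t * c_star) ** n

# Empirical check: draw t_{g,i} i.i.d. uniform on [0, 1/c*] (c* = 1 here)
# and keep the minimum t_G for each trial.
rng = random.Random(1)
n, trials = 5, 100_000
samples = [min(rng.uniform(0, 1) for _ in range(n)) for _ in range(trials)]
```

The empirical CDF, `sum(s < t for s in samples) / trials`, tracks `min_sweep_cdf(t, n)` to within sampling error.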

Team \(G\) wins whenever \({t_{G} < t_{A}}\). We observe that \( \mathbb {P}\left( \omega _{\mathrm {tie}}^*\right) = \mathbb {P}\left( t_{G} =t_{A}\right) = 0 \) and given our assumptions:

$$\begin{aligned} \mathbb {P}\left( \omega _{\mathrm {win}}^*\right) = \mathbb {P}\left( t_{G} < t_{A}\right) = 1 - \mathbb {P}\left( t_{G} > t_{A}\right) = 1 - \mathbb {P}\left( \omega _{\mathrm {lose}}^*\right) . \end{aligned}$$

Theorem 1

The probability team \(G\) wins an ideal game, assuming team \(G\) cannot communicate but the adversary team \(A\) can communicate, and the adversary team \(A\) plays optimal ideal mixed strategies, while each \({\{g_i\}\subset G}\) plays an optimal ideal single agent mixed strategy, is:

$$\begin{aligned} \mathbb {P}\left( \omega _{\mathrm {win}}^*\right) = \frac{ \left( 1-\frac{1}{m}\right) ^n (m-1) - m + n + 1 }{n + 1} \end{aligned}$$

Proof

We can compute \(\mathbb {P}\left( \omega _{\mathrm {win}}^*\right) \) using the Law of Total Probability:

$$\begin{aligned} \mathbb {P}\left( \omega _{\mathrm {win}}^*\right) = \int _{-\infty }^{\infty } \mathbb {P}\left( t_{G} < t_{A} | t_{A} = t\right) f_{t}(t) d t. \end{aligned}$$

We note that \(f_{t}(t) = 0\) for all \({t< 0}\) and for all \({t> 1/(m c^*)}\) because the game has not started yet and the adversary will have already won, respectively. Substituting Eqs. 6 and 7 we get:

$$\begin{aligned} \mathbb {P}\left( \omega _{\mathrm {win}}^*\right) = \int _{0}^{1/(m c^*)}\left( 1 - \left( 1 - \frac{t}{1/c^*}\right) ^n \right) mc^* \, d t\end{aligned}$$

and performing the integration yields:

$$\begin{aligned} \mathbb {P}\left( \omega _{\mathrm {win}}^*\right) = 1 + \frac{ m \left( \frac{m-1}{m}\right) ^{n+1}}{(n+1)} - \frac{m}{n + 1}. \end{aligned}$$

The final result is obtained using algebra. \(\square \)
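Theorem 1's closed form can also be checked numerically. The sketch below is our own illustration (function names are ours, not from the paper): it samples \(t_{G}\) and \(t_{A}\) exactly as described above, with \(c^*\) normalized to 1 since it cancels from both intervals.

```python
import random

def win_probability_closed_form(n, m):
    """Theorem 1: non-communicating team G of size n versus a
    communicating adversary team A of size m."""
    return ((1 - 1 / m) ** n * (m - 1) - m + n + 1) / (n + 1)

def win_probability_monte_carlo(n, m, trials=200_000, seed=0):
    """Estimate P(t_G < t_A) by direct sampling; c* is normalized to 1,
    so t_G is the best of n draws on [0, 1] and t_A a draw on [0, 1/m]."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        t_g = min(rng.uniform(0, 1) for _ in range(n))  # team G's best draw
        t_a = rng.uniform(0, 1 / m)                     # adversary's single draw
        if t_g < t_a:
            wins += 1
    return wins / trials
```

For example, `win_probability_closed_form(4, 2)` evaluates to exactly 0.6125, and the Monte Carlo estimate agrees to within sampling error.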

C Details of proofs pertaining to non-ideal cases

In Sects. C.1, C.2 and C.3 we bound the individual non-ideal effects of types I, II, and III, respectively, that were discussed in Sect. 6. Each effect is analyzed assuming the others can be ignored. The general case where I, II, and III occur simultaneously can be found by combining these results. Here we focus on the case where both teams can communicate. Bounds on cases where one or the other team cannot communicate can be found by extending these results as was done for the ideal case (see Note 7) in Sect. 4.

Boundary effects caused by non-ideal startup locations and environmental geometry both cause the team to perform more than the ideal amount of sweeping (see Fig. 5). We use the same basic proof technique for their analysis in Sects. C.1 and C.2.

The analytical technique used in Sects. C.1 and C.2 relies on the fact that search-hindering non-ideal effects are expected to be increasingly detrimental to \(G\)'s probability of winning the game the earlier in the game they occur. This happens because the ratio \(\frac{d \mathscr {L}_{D}(X_{\mathrm {swept}})}{d \mathscr {L}_{D}(X_{\mathrm {new}})}\) decreases versus time. It is possible to construct a scenario that is even worse than a worst-case non-ideal search, in terms of team \(G\)'s probability of winning the game, by: (first) assuming that all negative ramifications of a non-ideal search happen at the beginning of the search for team \(G\), instead of whenever they actually occur; (second) assuming the adversary team \(A\) sweeps at the ideal-game rate for the entire game. The relative length of the non-ideal startup phase in this modified scenario is guaranteed to be worse than the worst-case (see Note 8), and is bounded by a dimensionally dependent constant length of time.

A road-map of our analytical technique, including the construction of the worse than worst-case scenario we use, is now presented:

  1. We break the space \(X\) into two non-overlapping sets, \(X_{ideal}\) and \(X_{startup}\), depending on whether the sweep search through it is “ideal” (it is swept at a time when the team is sweeping new space at the maximum rate).

     (a) Let all search space that is not reswept be combined in set \(X_{ideal}\).

     (b) Let all search space that is ever reswept (plus, when relevant, all non-search space that is swept) be combined in set \(X_{startup}\). Multiple copies of each reswept portion of space are included in \(X_{startup}\): if a subset of space is swept i different times, then i different copies of that subset are included. For the following discussion each copy is considered distinct such that each contributes its own volume to \(\mathscr {L}_{D}(X_{startup})\). Thus, by construction, \({\mathscr {L}_{D}(X_{startup}) + \mathscr {L}_{D}(X_{ideal}) \ge \mathscr {L}_{D}(X)}\).

  2. We consider a worse than worst-case scenario where the time costs, but not the target detection benefits, of sweeping \(X_{startup}\) are incurred prior to performing an ideal search through \(X_{ideal}\).

     (a) We design a multipath that is guaranteed to sweep \(X_{startup}\), including all duplicate copies of reswept space, and derive an upper bound \(t_{\mathrm {startup}}\) on the time required for the team to travel this multipath.

     (b) We assume team \(G\) begins the game by moving along a path sufficient to sweep \(X_{startup}\), but not actually performing search as it moves along this path (e.g., with its detection sensor turned off). For each duration of non-ideal behavior that occurs in a normal scenario, this essentially shifts an equivalent duration of non-ideal behavior to the beginning of the search without providing any target detection benefits.

     (c) However, after accounting for the penalty we receive for not searching before \(t_{\mathrm {startup}}\), we assume that search through \(X_{ideal}\) happens at the ideal rate. In other words, for the purposes of deriving performance bounds, we essentially pay an up-front performance penalty “with interest” to move each piece of non-ideal search such that it happens before \(t_{\mathrm {startup}}\).

     (d) Finally, we account for actually searching \(X_{startup}\) (since we moved through it without searching before \(t_{\mathrm {startup}}\)).

        i. Our original search would have already completed by this point; thus, we can assume any rate of sweep for the second pass over unique elements of \(X_{startup}\) and our scenario will still be worse than the original (and provide a valid performance bound).

        ii. Thus, it is permissible to assume the ideal search rate in this phase (for convenience) without destroying the worse than worst-case bound.

We now apply this technique in Sects. C.1 and C.2.

1.1 C.1 Non-ideal starting locations

If two robots on the same team start closer than \(2r\), then either some nonzero-measure subset of space will be swept by both of them (see Fig. 5left), or some nonzero-measure subset of space will be swept by them and again at some other point in the search (see Fig. 5center). Such an event occurs with nonzero probability given an assumption of uniform random i.i.d. start locations (Fig. 12).

Fig. 12

Intuition for the worse than worst-case bound: an example in which \({|G| = 5}\) and \({|A| = 5}\). a Ideal sweep rates. b Sweep rates in a non-ideal case in which \(G\) has three periods of non-ideal sweep (\(\tau _{1}\), \(\tau _{2}\), \(\tau _{3}\)) and \(A\) has two (\(\tau _{4}\), \(\tau _{5}\)). c The corresponding worse than worst-case bound, in which periods of non-ideal sweep have been moved to the beginning of search for \(G\) and eliminated for \(A\). d Territory capture rate for both teams in all three cases. e Total area swept in all three cases; by construction \(G\) sweeps less space in the bounding case than in the non-ideal case and \(A\) sweeps more (shaded regions and arrows show difference). f Total area captured in all three scenarios; by construction \(G\) captures less space in the bounding case than in the non-ideal case and \(A\) captures more (shaded regions and arrows show difference). Games end once all territory is captured (green) (Color figure online)

Fig. 13

An ideal strategy could be played if all robots start at \({\mathbf {g}_{0}^* = (x_{1,0}^*, \ldots , x_{n,0}^*)}\). Different colors represent area swept by different robots. In the worst case, all robots start at the same position, \({\mathbf {g}_{0}= (x_{1,0}, \ldots , x_{n,0}) = (x_{1,0}, \ldots , x_{1,0})}\), and movement from \(\mathbf {g}_{0}\) to \(\mathbf {g}_{0}^*\) requires each robot to move no further than \(2 r(n-1)\), which can be accomplished in time \(t_{\mathrm {startup}}= 2 r(n-1)/v\) (Color figure online)

The ill-effects of non-ideal start locations can be bounded by considering the following worse than worst-case scenario: all n team members start at exactly the same point \({\mathbf {g}_{0}= (x_{1,0}, \ldots , x_{n,0})}\), and then begin the game by moving (without actually searching) to the closest configuration \(\mathbf {g}_{0}^*\) at which type-I effects would not have occurred if the robots had started at \(\mathbf {g}_{0}^*\) in the first place (see Fig. 13). We assume that the entire team waits to start searching until all robots have reached their coordinate of \(\mathbf {g}_{0}^*\), and that this requires \(t_{\mathrm {startup}}\) time.

The probability the adversary wins before \(t_{\mathrm {startup}}\) is \(\frac{t_{\mathrm {startup}}m v \mathscr {L}_{D-1}(B_r) }{ \mathscr {L}_{D}(X)}\). After \(t_{\mathrm {startup}}\), our expected search rate is that of the ideal case, such that \(\frac{d\mathscr {L}_{D}(X_{\mathrm {new}}(t))}{dt}\) decreases versus \(\frac{d\mathscr {L}_{D}(X_{\mathrm {swept}}(t))}{dt}\) by the usual factor of \({1 - \frac{tm v{\mathscr {L}_{D-1}(B_r)}}{{\mathscr {L}_{D}(X)}}}\).

Recalling that \({c^*=v\frac{\mathscr {L}_{D-1}(B_r)}{\mathscr {L}_{D}(X)}}\), the resulting worse than worst-case bound on the probability we win the game is:

$$\begin{aligned} {\mathbb {P}\left( \omega _{\mathrm {win}}\right) \ge \left[ \int _{t_{\mathrm {startup}}}^{\hat{t}_{\mathrm {final}}} \left( 1 - t m c^*\right) n c^* \,\, dt\right] - t_{\mathrm {startup}}m c^*} \end{aligned}$$
(8)

where

$$\begin{aligned} \hat{t}_{\mathrm {final}}= \min \left( t_{\mathrm {startup}}+ \frac{1}{nc^*} , \frac{1}{mc^*}\right) . \end{aligned}$$

As discussed above, this bound accounts only for inefficiencies in search caused by non-ideal search locations.

In toroid \(\mathbb {T}^D\) environments the furthest distance that any individual member of the team must move during the startup phase is \(2 r(n - 1)\) (see Fig. 13) and so \({t_{\mathrm {startup}}< 2 r(n-1) / v}\).
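The bound in Eq. 8 can be evaluated in closed form. The following sketch is our own illustration (function names are ours, not from the paper); it computes the worse than worst-case lower bound together with the toroidal startup-time bound \({t_{\mathrm {startup}}< 2 r(n-1) / v}\):

```python
def win_probability_lower_bound(n, m, c_star, t_startup):
    """Worse than worst-case lower bound of Eq. 8: team G waits t_startup
    before searching at the ideal rate, while the adversary (m communicating
    agents) sweeps at the ideal rate for the whole game."""
    t_final_hat = min(t_startup + 1 / (n * c_star), 1 / (m * c_star))

    def antiderivative(t):
        # Antiderivative of the integrand (1 - t m c*) n c*.
        return n * c_star * (t - m * c_star * t ** 2 / 2)

    integral = antiderivative(t_final_hat) - antiderivative(t_startup)
    return integral - t_startup * m * c_star

def t_startup_bound(n, r, v):
    """Toroidal-environment startup-time bound: 2 r (n - 1) / v."""
    return 2 * r * (n - 1) / v
```

With `t_startup = 0` and \(n = m\) the bound reduces to \(1/2\), matching the symmetric ideal game; any positive startup time strictly lowers it.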

A similar bound exists for convex subsets of \(\mathbb {R}^D\) as long as other boundary effects can be ignored, and \({\mathcal {W}> 2rn}\), where \(\mathcal {W}\) is the maximum distance between any two points in \(X\) along a geodesic. In other words, \(\mathcal {W}\) is the maximum width of the environment. We assume other boundary effects can be ignored, but address them directly in Sects. C.2 and C.3. We now limit our consideration to \({\mathcal {W}> 2rn}\), i.e., wide environments. In thin environments, such that \({\mathcal {W}\ngtr 2rn}\), moving to an ideal start location may require time on the same order as sweeping the environment.

An upper bound on the probability we win is found by swapping the roles played by team \(G\) and the adversary. We let \({\tilde{t}_{\mathrm {startup}}< 2 rm / v}\) denote the startup time required for an adversary that must compete with an ideal version of team \(G\), and define

$$\begin{aligned} \tilde{t}_{\mathrm {final}}= \min \left( \tilde{t}_{\mathrm {startup}}+ \frac{1}{mc^*} , \frac{1}{nc^*}\right) . \end{aligned}$$

The preceding discussion leads to the following theorem:

Theorem 3

Assuming that both teams can communicate, and an optimal mixed strategy exists for both teams, and that both teams play an optimal mixed strategy, and that the game is ideal in every sense except for starting locations, the probability we win is bounded as follows:

and

1.2 C.2 Turns and other non-ideal boundary effects

Re-sweeping of previously swept space occurs near the boundary of the search space due to the necessity of turning (Fig. 5right). Moreover, sweeping all of the space within the search space sometimes requires sweeping some portion of space outside the search space (Fig. 5right). Finally, a third boundary effect occurs due to the fact that, in general, the search space cannot be covered by an integer number of sweep passes (this also happens in \(\mathbb {T}^D\), see Fig. 14). All of these effects can be analyzed simultaneously.

Fig. 14

Two robots perform a nearly ideal search over a \(\mathbb {T}^2\) space. Different colors represent the different subspaces swept by either robot. The space cannot be divided into an integer number of non-overlapping wraps around the space and so some space is swept more than once (hashed). The manifold at the center of the reswept space is marked with a dot-dash line

We assume that the team starts at locations that do not suffer from the startup effects discussed in Sect. C.1. We also assume that the projection of the sensor footprint along the direction of movement permits a tiling over \(D-1\) space, see Fig. 6 (we investigate the case where this is not true in Sect. C.3).

Given these assumptions, the space swept in any \(\mathbb {T}^D\) search space or subset of \(\mathbb {R}^D\) can be decomposed into two different non-overlapping subsets: the first (\(X_{startup}\)) contains all space involved in turning induced resweep as well as all non-search area that is swept, while the second (\(X_{ideal}\)) contains the remaining search space (that involved an ideal search). An example appears in Fig. 15.

Fig. 15

A nearly ideal two robot search strategy to cover the convex space. Red and blue represent different robots. Space boundary is black, red and blue dashed lines depict the multipath, shades of gray depict search sweeps, light blue and light red depict sweeps that re-sweep some area. The strategy can be broken into an non-ideal part over \(X_{startup}\) and an ideal part over \(X_{ideal}\) (Color figure online)

Similar to the analysis in Sect. C.1, we derive \(t_{\mathrm {startup}}\), an upper bound on the time required to sweep \(X_{startup}\) (including any necessary re-sweeps). As in Sect. C.1, the worse than worst-case scenario we use for our analysis requires team \(G\) to wait for \(t_{\mathrm {startup}}\) time before beginning an ideal search (during which time the adversary searches at the ideal rate).

Fig. 16
figure 16

It is possible to design a strategy for sweeping \(X_{startup}\), the non-ideal part of a search. a The non-ideal part of a search (red, blue, gray) and the search space boundary (black). b The search space boundary is covered with grid cells that have side lengths twice the minimum sensor radius. c Adding additional grid cells within \(r\) of the original grid cells is sufficient to cover \(X_{startup}\). d We can design a strategy that sweeps the latter gridded space (there are \(N_{cover}\) grid cells, where \(N_{cover}< c_2 r \mathscr {L}_{D-1}(\partial X)\) and \(c_2\) depends on the dimensionality of the search space) in time \(< 6 N_{cover} r_c / v\). e, f As the radius of the sensor decreases relative to the size of the search space, the portion of time spent sweeping \(X_{startup}\) decreases toward 0 (Color figure online)

The first step to find \(t_{\mathrm {startup}}\) is to bound the length of a path that is guaranteed to cover \(X_{startup}\) (see Fig. 16). Consider the largest hypercube that is contained within the robot's sensor footprint and that is aligned with the direction of forward movement. We call this hypercube \(\hat{C}_r\). Let the boundary of the search space be denoted \(\partial X\). Note that in toroidal spaces the only boundary effects are related to the last pass, and so for \(\mathbb {T}^D\), \(\partial X\) represents the manifold located between the paths taken during the first and last sweeps (see Fig. 14).

Since \(\partial X\) is essentially a lower-dimensional manifold embedded in our \(D\)-dimensional search space, it is possible to cover \(\partial X\) with \(N_{cover}\) hypercubes (each of dimension \(D\)), where

$$\begin{aligned} N_{cover}< c_2 r \mathscr {L}_{D-1}(\partial X) \end{aligned}$$

where \(c_2\) is a constant that counts the maximum number of tiled hypercubes required to cover a non-axis-aligned line segment of length \(\sqrt{D}\), and where \(\sqrt{D}\) is the maximum distance between any two points in the same unit grid cell. Figure 16a–c depicts such a covering.

It is important to note that \(c_2\) is a dimensionally dependent constant. It is possible to construct a tour of length at most \(\hat{\ell }\) that covers all grid cells within \(r\) of \(\partial X\) (and thus covers \(X_{startup}\)). We note that \(\hat{\ell }\) is also, by design, longer than the cumulative length traveled along the subset of the original multipath involved in the boundary effects we are investigating in this section. We calculate \(\hat{\ell }\) by considering a naive tour of the hypercube covering of the search space boundary (see Fig. 16, bottom). Each hypercube can be swept by traveling a distance of at most three times its side length (i.e., \(6 r_c\), where \(2 r_c\) is the side length of \(\hat{C}_r\)): \(2r_c\) to reach the center of the nearest face, \(2r_c\) to reach the opposite face and thus sweep the entire cube, and \(2r_c\) to exit through the center of any other face (in most cases it will be much less than this, since multiple cubes can be swept without changing direction). Thus, \({\hat{\ell }< 6N_{cover}r_c}\) and so
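As a sanity check on how this bound scales, the sketch below (our own illustration; the rectangular geometry, the center-distance cell test, and the choice \(r_c = r\) are assumptions, not the paper's construction) counts the side-\(2r\) grid cells lying within \(r\) of the boundary of a \(W \times H\) rectangle and forms the naive tour bound \(6 N_{cover} r_c\):

```python
import math

def dist_to_boundary(px, py, w, h):
    """Distance from a point to the boundary of the rectangle [0,w] x [0,h]."""
    if 0.0 <= px <= w and 0.0 <= py <= h:
        return min(px, w - px, py, h - py)
    dx = max(-px, px - w, 0.0)
    dy = max(-py, py - h, 0.0)
    return math.hypot(dx, dy)

def boundary_cover(w, h, r):
    """Count side-2r grid cells within r of the rectangle boundary (a cell
    is kept when its center lies within r plus the cell half-diagonal of
    the boundary) and return (N_cover, naive tour bound 6 * N_cover * r)."""
    side = 2.0 * r
    reach = r + r * math.sqrt(2.0)   # r plus half the cell diagonal
    n_cover = 0
    for i in range(-1, math.ceil(w / side) + 1):   # pad one layer outside
        for j in range(-1, math.ceil(h / side) + 1):
            cx, cy = (i + 0.5) * side, (j + 0.5) * side
            if dist_to_boundary(cx, cy, w, h) <= reach:
                n_cover += 1
    return n_cover, 6.0 * n_cover * r
```

Doubling the environment doubles \(N_{cover}\), since it scales with the perimeter \(\mathscr {L}_{1}(\partial X)\), while the total area grows quadratically; this is the mechanism behind the vanishing boundary-effect fraction noted after Theorem 4.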

$$\begin{aligned} \hat{\ell }< 6 c_2 r_c r \mathscr {L}_{D-1}(\partial X). \end{aligned}$$

The time required to perform the startup phases is upper bounded by

$$\begin{aligned} t_{\mathrm {startup}}< \hat{\ell }/ v. \end{aligned}$$

This bound implicitly assumes a worst-case situation in which all boundary effects must be dealt with by a single agent. Thus, this bound on \(t_{\mathrm {startup}}\) may be up to \(n\) times too large (i.e., in the best case boundary problems are divided evenly between agents). We can now proceed as in the previous subsection (using our new definition of \(t_{\mathrm {startup}}\) in Eq. 8). This discussion leads to the following theorem:

Theorem 4

Assume that both teams can communicate, that an optimal mixed strategy exists for both teams and is played by both teams, and that the game is ideal in every sense except for the boundary effects of re-sweeping and of sweeping outside the search space. Then the probability we win is bounded as follows:

We note that \({\frac{N_{cover}\mathscr {L}_{D}(\hat{C}_r)}{\mathscr {L}_{D}(X)}\rightarrow 0}\), in the limit, as the size of the environment increases relative to the sensor radius.

The non-ideal effects from this section can be combined with those from the previous subsection by simply combining the startup phases used for analysis into a single startup phase of combined duration.

C.3 Sensor model when \(D > 2\)

A sensor sweep footprint that forms a space tiling in \(D-1\) dimensions is required for ideal search; see Fig. 6. Non-tiling footprints require neighboring sweep passes to overlap in order to sweep the full space. When \({D> 2}\), the vast majority of sensor models will not permit a sweep footprint that forms a space tiling in \({D-1}\) dimensions. While an appropriately chosen sensor footprint, such as the \((D-1)\)-dimensional \(L_{\infty }\)-ball, does permit such a covering, other common symmetrical footprints such as the \((D-1)\)-dimensional \(L_{2}\)-ball do not. We now investigate what happens when a non-tiling sensor is used.

Fig. 17
figure 17

Two different sweep patterns that use the same sensor footprint, projected along the direction of travel to remove depth (as in Fig. 6, bottom). This particular (projected) patch could be tessellated arbitrarily many times in the horizontal and/or vertical directions. Left: Sensor footprints of the robot appear (gray, dashed red) for a two-phase search strategy that covers the space using an ideal search with 12 passes per patch (6 gray, 6 red). Right: The same space can be swept using only 8 passes per patch (blue); although this sweeps the entire space more quickly, it requires searching below the ideal search rate after the first 1/3 of the search (Color figure online)

Assume that, other than the tiling of the search sensor, the search is otherwise ideal (i.e., we are ignoring the startup and boundary effects that were addressed in Sects. C.1 and C.2). Search necessarily happens in a number of separate phases that are characterized by different \(\frac{d \mathscr {L}_{D}(X_{\mathrm {swept}})}{d t}\) (sweep rates of space we have not yet swept) and thus different \(\frac{d \mathscr {L}_{D}(X_{\mathrm {new}})}{d t}\) (sweep rates of space that has not been swept by either team).

For example, we could perform search in two meta-phases represented by the gray discs and red circles in Fig. 17. During the first phase, search happens at the ideal rate due to the fact that each pass (gray disc) covers terrain that we have never visited before. However, after some time, sweeping any new space will necessarily require re-sweeping some previously swept terrain (red circles). Thus, in the second phase, \(\frac{d \mathscr {L}_{D}(X_{\mathrm {swept}})}{d t}\) and \(\frac{d \mathscr {L}_{D}(X_{\mathrm {new}})}{d t}\) are substantially reduced.

We could alternatively sweep the entire space much more quickly using a multipath defined by sensor discs that circumscribe the blue hexagons in Fig. 17. The price we pay for a quicker overall search turns out to be an earlier decrease in \(\frac{d \mathscr {L}_{D}(X_{\mathrm {swept}})}{d t}\) and \(\frac{d \mathscr {L}_{D}(X_{\mathrm {new}})}{d t}\).
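The overlap forced by a non-tiling footprint can be quantified with a short geometric calculation of ours (illustrative only, not taken from the paper): when passes are spaced so that a regular polygon inscribed in the disc footprint tiles the plane, the polygon-to-disc area ratio is the fraction of each footprint that sweeps new space.

```python
import math

def tiling_efficiency(sides):
    """Area of a regular polygon with `sides` vertices inscribed in a unit
    disc, divided by the disc area pi.  When sweep passes are spaced so
    that these polygons tile the plane, this is the fraction of each
    sensor footprint covering new (never-swept) space; the remainder is
    forced overlap with neighboring passes."""
    polygon_area = 0.5 * sides * math.sin(2.0 * math.pi / sides)
    return polygon_area / math.pi
```

Only the 3-, 4-, and 6-sided regular polygons tile the plane, and the hexagon wastes the least footprint (roughly 17% overlap versus roughly 36% for the square), which is one reason hexagonal sweep lattices such as the one in Fig. 17 are attractive when \(D = 3\).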

The fact that we can control \(\frac{d \mathscr {L}_{D}(X_{\mathrm {swept}})}{d t}\) by choosing how different sweep passes overlap adds a significant amount of complexity to our analysis. Each time either team's search rate changes, we must use a slightly different version of Eq. 2. This can be accomplished by breaking our analysis into \(K\) different intervals depending on the number of times that either team changes its sweep rate.

Let the \(k\)-th time interval start at time \(t_k\) and end at time \(t_{k+1}\). By the beginning of the \(k\)-th time interval the adversary has already swept \(F_k\) proportion of the total search space. During the \(k\)-th interval the rate team \(G\) sweeps new (for us) territory is determined by \(nc_{G,k}\) and the rate the adversary sweeps new (for them) territory is determined by \(mc_{A,k}\).

The fraction of the search space that remains unswept by the adversary prior to the start of the \(k\)-th interval is given by

$$\begin{aligned} F_k= 1 - \sum _{j= 1}^{k-1} \int _{t_j}^{t_{j+1}} m c_{A,j} \,\, dt.\end{aligned}$$
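This bookkeeping can be sketched in a few lines (a minimal illustration of ours; the interval encoding as duration/rate pairs and the assumption that rates are constant within each interval are hypothetical, not the paper's notation):

```python
def unswept_fractions(intervals, m):
    """Given intervals as (duration, c_A_k) pairs, with the per-agent
    sweep rate c_A_k held constant on each interval, return
    [F_1, ..., F_K]: the fraction of the search space the m-agent
    adversary has NOT yet swept at the start of each interval (F_1 = 1).
    Each interval removes m * c_A_k * duration, clamped at zero."""
    fracs = [1.0]
    for duration, c_a in intervals[:-1]:
        fracs.append(max(0.0, fracs[-1] - m * c_a * duration))
    return fracs
```

For example, a single adversary (\(m = 1\)) sweeping at rate 1 for a quarter time unit and then at rate 2 for another quarter leaves fractions \([1.0, 0.75, 0.25]\) at the starts of the three intervals.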

As a corollary of Lemma 1 we get:

Corollary 5

Assume that an optimal mixed strategy exists for both teams and is played by both teams, that both teams communicate, and that the game is ideal except for the sensor footprint. Then the probability we win is

$$\begin{aligned} \mathbb {P}\left( \omega _{\mathrm {win}}\right) = \sum _{k= 1}^{K-1} \int _{t_k}^{t_{k+1}} \left( F_k- (t- t_k) m c_{A,k}\right) nc_{G,k} \,\, dt. \end{aligned}$$

We note that these effects do not vanish as the size of the environment increases relative to the sensor footprint.
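The corollary's piecewise integral can be checked with a small forward-Euler integration. The sketch below is our own (the triple encoding of intervals, the step size, and the rule that integration stops once either team has swept everything are assumptions):

```python
def win_probability(intervals, m, n, dt=1e-5):
    """Integrate P(win) = sum over intervals of
    (fraction unswept by the adversary) * (our new-space rate n * c_G_k) dt,
    for intervals given as (duration, c_A_k, c_G_k) triples of
    piecewise-constant per-agent sweep rates (fractions of the search
    space per unit time).  Stops once either team has swept everything."""
    p_win, swept_a, swept_g = 0.0, 0.0, 0.0
    for duration, c_a, c_g in intervals:
        for _ in range(int(round(duration / dt))):
            if swept_a >= 1.0 or swept_g >= 1.0:
                return p_win
            p_win += (1.0 - swept_a) * n * c_g * dt
            swept_a = min(1.0, swept_a + m * c_a * dt)
            swept_g = min(1.0, swept_g + n * c_g * dt)
    return p_win
```

As sanity checks: with a single interval and identical teams the integral gives \(1/2\), as symmetry demands; if our team sweeps twice as fast as the adversary, it rises to \(3/4\).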

Corollary 5 works for the case that both teams communicate. In general, extending these results to cases where one team cannot communicate is more involved than how the ideal results from Sect. 4.1 were extended in Sects. 4.2 and 4.3. Each of the sub-games versus a single (i.e., non-communicating) agent must be analyzed separately to account for the fact that each agent will independently choose when and how much its own sweeps will overlap between different passes through the environment.


Cite this article

Otte, M., Kuhlman, M. & Sofge, D. Competitive target search with multi-agent teams: symmetric and asymmetric communication constraints. Auton Robot 42, 1207–1230 (2018). https://doi.org/10.1007/s10514-017-9687-0
