Approximate representations for multi-robot control policies that maximize mutual information

Abstract

We address the problem of controlling a small team of robots to estimate the location of a mobile target using non-linear range-only sensors. Our control law maximizes the mutual information between the team’s estimate and future measurements over a finite time horizon. Because the computations associated with such policies scale poorly with the number of robots, the time horizon associated with the policy, and typical non-parametric representations of the belief, we design approximate representations that enable real-time operation. The main contributions of this paper include the control policy, an algorithm for approximating the belief state, and an extensive study of the performance of these algorithms using simulations and real-world experiments in complex, indoor environments.


References

  • Binney, J., Krause, A., & Sukhatme, G. S. (2013). Optimizing waypoints for monitoring spatiotemporal phenomena. International Journal of Robotics Research, 32(8), 873–888.

  • Charrow, B., Kumar, V., & Michael, N. (2013). Approximate representations for multi-robot control policies that maximize mutual information. In Proceedings of Robotics: Science and Systems, Berlin, Germany.

  • Charrow, B., Michael, N., & Kumar, V. (2014). Cooperative multi-robot estimation and control for radio source localization. International Journal of Robotics Research, 33, 569–580.

  • Chung, T., Hollinger, G., & Isler, V. (2011). Search and pursuit-evasion in mobile robotics. Autonomous Robots, 31(4), 299–316.

  • Cover, T. M., & Thomas, J. A. (2004). Elements of information theory. New York: Wiley.

  • Dame, A., & Marchand, E. (2011). Mutual information-based visual servoing. IEEE Transactions on Robotics, 27(5), 958–969.

  • Djugash, J., & Singh, S. (2012). Motion-aided network slam with range. International Journal of Robotics Research, 31(5), 604–625.

  • Fannes, M. (1973). A continuity property of the entropy density for spin lattice systems. Communications in Mathematical Physics, 31(4), 291–294.

  • Fox, D. (2003). Adapting the sample size in particle filters through KLD-sampling. International Journal of Robotics Research, 22(12), 985–1003.

  • Golovin, D., & Krause, A. (2011). Adaptive submodularity: Theory and applications in active learning and stochastic optimization. The Journal of Artificial Intelligence Research, 42(1), 427–486.

  • Grocholsky, B. (2002). Information-theoretic control of multiple sensor platforms. PhD thesis, Australian Centre for Field Robotics.

  • Hahn, T. (2013, January). Cuba. http://www.feynarts.de/cuba/.

  • Hoffmann, G., & Tomlin, C. (2010). Mobile sensor network control using mutual information methods and particle filters. IEEE Transactions on Automatic Control, 55(1), 32–47.

  • Hollinger, G., & Sukhatme, G. (2013). Sampling-based motion planning for robotic information gathering. In Proceedings of Robotics: Science and Systems, Berlin, Germany.

  • Hollinger, G., Djugash, J., & Singh, S. (2011). Target tracking without line of sight using range from radio. Autonomous Robots, 32(1), 1–14.

  • Huber, M., & Hanebeck, U. (2008). Progressive Gaussian mixture reduction. In International Conference on Information Fusion.

  • Huber, M., Bailey, T., Durrant-Whyte, H., & Hanebeck, U. (2008, August). On entropy approximation for Gaussian mixture random vectors. In Conference on Multisensor Fusion and Integration for Intelligent Systems, Seoul, Korea (pp. 181–188).

  • Julian, B. J., Angermann, M., Schwager, M., & Rus, D. (2011, September). A scalable information theoretic approach to distributed robot coordination. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, USA (pp. 5187–5194).

  • Kassir, A., Fitch, R., & Sukkarieh, S. (2012, May). Decentralised information gathering with communication costs. In Proceedings of the IEEE International Conference on Robotics and Automation, Saint Paul, USA (pp. 2427–2432).

  • Krause, A., & Guestrin, C. (2005). Near-optimal nonmyopic value of information in graphical models. In Conference on Uncertainty in Artificial Intelligence (pp. 324–331).

  • nanoPAN 5375 Development Kit. (2013, September). http://nanotron.com/EN/pdf/Factsheet_nanoPAN_5375_Dev_Kit.pdf.

  • Owen, D. (1980). A table of normal integrals. Communications in Statistics-Simulation and Computation, 9(4), 389–419.

  • Park, J. G., Charrow, B., Battat, J., Curtis, D., Minkov, E., Hicks, J., Teller, S., & Ledlie, J. (2010). Growing an organic indoor location system. In Proceedings of the International Conference on Mobile Systems, Applications, and Services, San Francisco, CA.

  • ROS. (2013, January). http://www.ros.org/wiki/.

  • Runnalls, A. (2007). Kullback–Leibler approach to Gaussian mixture reduction. IEEE Transactions on Aerospace and Electronic Systems, 43(3), 989–999.

  • Ryan, A., & Hedrick, J. (2010). Particle filter based information-theoretic active sensing. Robotics and Autonomous Systems, 58(5), 574–584.

  • Silva, J. F., & Parada, P. (2011). Sufficient conditions for the convergence of the Shannon differential entropy. In IEEE Information Theory Workshop, Paraty, Brazil (pp. 608–612).

  • Singh, A., Krause, A., Guestrin, C., & Kaiser, W. J. (2009). Efficient informative sensing using multiple robots. The Journal of Artificial Intelligence Research, 34(1), 707–755.

  • Stump, E., Kumar, V., Grocholsky, B., & Shiroma, P. (2009). Control for localization of targets using range-only sensors. International Journal of Robotics Research, 28(6), 743.

  • Thrun, S., Burgard, W., & Fox, D. (2008). Probabilistic robotics. Cambridge: MIT Press.

  • Vidal, R., Shakernia, O., Jin Kim, H., Shim, D., & Sastry, S. (2002). Probabilistic pursuit-evasion games: Theory, implementation, and experimental evaluation. IEEE Transactions on Robotics and Automation, 18(5), 662–669.

  • Whaite, P., & Ferrie, F. P. (1997). Autonomous exploration: Driven by uncertainty. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(3), 193–205.

Acknowledgments

We gratefully acknowledge the support of ONR Grant N00014-07-1-0829, ARL Grant W911NF-08-2-0004, and AFOSR Grant FA9550-10-1-0567. Benjamin Charrow was supported by an NDSEG fellowship from the Department of Defense.

Author information

Correspondence to Benjamin Charrow.

Appendix: Proofs

1.1 Integrating Gaussians over a half-space

Proving Lemmas 1 and 2 requires integrating Gaussian functions over a half-space. The following identities will be used several times. Owen (1980) gives these identities for 1-dimensional Gaussians, but we have been unable to find a reference for the multivariate case. For completeness, we prove them here.

Lemma 6

If \(f(x)=\mathcal {N}(x; \mu , \Sigma )\) is a \(k\)-dimensional Gaussian and \(A=\{x : a^Tx + b > 0\}\) is a half-space, then

$$\begin{aligned}&\int _{A} f(x)\,dx = \Phi \left( p\right) \end{aligned}$$
(9)
$$\begin{aligned}&\int _{A} xf(x)\,dx = \Phi \left( p\right) \mu + \phi \left( p\right) \frac{\Sigma a}{\sqrt{a^T\Sigma a}}\end{aligned}$$
(10)
$$\begin{aligned}&\int _{A} xx^Tf(x)\,dx = \Phi \left( p\right) (\Sigma + \mu \mu ^T) \nonumber \\&\quad +\phi \left( p\right) \left( \frac{\Sigma a\mu ^T + \mu a^T\Sigma }{\sqrt{a^T\Sigma a}}-p\frac{\Sigma aa^T \Sigma }{a^T\Sigma a}\right) \end{aligned}$$
(11)

where \(p=(a^T\mu +b)/\sqrt{a^T\Sigma a}\) is a scalar, \(\phi \left( x\right) =\mathcal {N}(x;0,1)\) is the PDF of the standard 1-dimensional Gaussian and \(\Phi \left( x\right) \) is its CDF.

Proof of (9)

All of these integrals are evaluated by making the substitution \(x=Ry\), where \(R\) is a rotation matrix that makes the half-space \(A\) axis-aligned. Specifically, define \(R\) such that \(a^TR=\Vert a\Vert e_1^T\), where \(e_1\) is the \(k\)-dimensional vector whose first entry is 1 and all others are 0. This substitution is advantageous because it makes the limits of integration \((-\infty ,\infty )\) for every component of \(y\) except \(y_1\).

Because \(R^Tx=y\), \(y\) is a \(k\)-dimensional Gaussian with density \(q(y)=\mathcal {N}(y; R^T\mu , R^T\Sigma R)\). The determinant of the Jacobian of \(y\) is the determinant of a rotation matrix: \(|\partial y / \partial x|=|R^T|=1\). Substituting:

$$\begin{aligned} \int _{a^Tx+b>0}f(x)\,dx&= \int _{(a^TR)y+b>0} |R^T|q(y)\,dy\nonumber \\&=\int _{\Vert a\Vert y_1+b>0} q(y)\,dy \nonumber \\&=\int _{-b/\Vert a\Vert }^\infty q_1(y_1)\,dy_1 \end{aligned}$$

where \(q_1(y_1)=\mathcal {N}(y_1; e_1^T (R^T\mu ), e_1^T(R^T\Sigma R)e_1)\) is the density for the first component of \(y\). The final step follows as the limits of integration marginalize \(q(y)\). To simplify the remaining integral, apply the definition of \(R\), \(q_1(y_1) =\mathcal {N}(y_1;\mu ^Ta/\Vert a\Vert , a^T\Sigma a/ \Vert a\Vert ^2)\), and use \(1-\Phi \left( -x\right) =\Phi \left( x\right) \):

$$\begin{aligned} \int _{-b/\Vert a\Vert }^\infty q_1(y_1)\,dy_1 = 1-\Phi \left( \frac{-b-\mu ^Ta}{\sqrt{a^T\Sigma a}}\right) = \Phi (p) \end{aligned}$$

\(\square \)

Proof of (10)

First, we perform a change of variables so that the integral is evaluated over the standard multivariate Gaussian \(g(x)=\mathcal {N}(x;0, I)\). The substitution is \(x=\Sigma ^{1/2}y+\mu \), which is valid because (1) \(|\partial y / \partial x|=|\Sigma |^{-1/2}\) and (2) \({f(x)=|\Sigma |^{-1/2}g(\Sigma ^{-1/2}(x-\mu ))}\):

$$\begin{aligned} \int _{A} x f(x)\,dx&= \int _{B} (\Sigma ^{1/2}y+\mu )g(y)\,dy \nonumber \\&=\Sigma ^{1/2}\int _{B}y g(y)\,dy + \mu \int _B g(y)\,dy \end{aligned}$$
(12)

where \(B=\{y : (a^T\Sigma ^{1/2})y +(a^T\mu +b) > 0\}\) is the transformed half-space.

To evaluate the first term in (12), we calculate \(\int _C yg(y)\,dy\), where \(C=\{y: c^Ty + d > 0\}\) is a new generic half-space. This integral is easier than the original problem as \(g(y)\) is the density of a zero-mean Gaussian with identity covariance. To proceed, substitute \(y=Rz\) where \(R\) is a rotation matrix that makes \(C\) axis-aligned (i.e., \(c^TR=\Vert c\Vert e_1^T\)) and observe that \(g(Rz)=g(z)\):

$$\begin{aligned} \int _{C}y g(y)\,dy&= \int _{c^TRz + d > 0}Rz g(Rz)|R^T|\,dz \nonumber \\&=Re_1\int _{-d/\Vert c\Vert }^\infty z_1 \phi (z_1)\,dz_1 \nonumber \\&=Re_1\phi \left( \frac{-d}{\Vert c\Vert }\right) = \frac{c}{\Vert c\Vert }\phi \left( \frac{d}{\Vert c\Vert }\right) \end{aligned}$$
(13)

The factor \(e_1\) appears because \(g\) is zero-mean; the only non-zero component of the integral comes from \(z_1\). The 1-dimensional integral follows as \(d\phi (x)/dx=-x\phi (x)\).
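Written out, using that \(-\phi (z)\) is an antiderivative of \(z\,\phi (z)\) and that \(\phi \) is even:

$$\begin{aligned} \int _{-d/\Vert c\Vert }^\infty z_1 \phi (z_1)\,dz_1 = \Big [-\phi (z_1)\Big ]_{-d/\Vert c\Vert }^{\infty } = \phi \left( \frac{-d}{\Vert c\Vert }\right) = \phi \left( \frac{d}{\Vert c\Vert }\right) \end{aligned}$$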

To finish, (12) can be evaluated using the formula in (9) and (13):

$$\begin{aligned} \int _{A} x f(x)\,dx&= \Sigma ^{1/2}\left( \frac{\Sigma ^{1/2}a}{\sqrt{a^T\Sigma a}}\right) \phi (p)+\mu \Phi (p) \end{aligned}$$

\(\square \)

Proof of (11)

Similar to the last proof, make the substitution \({x=\Sigma ^{1/2}y+\mu }\) with \(g(y)\) as the standard multivariate Gaussian and expand terms.

$$\begin{aligned}&\int _{A} xx^Tf(x)\,dx =\Sigma ^{1/2}\int _B yy^Tg(y)\,dy \Sigma ^{1/2}\nonumber \\&\quad + \Sigma ^{1/2}\int _Byg(y)\,dy\mu ^T \nonumber \\&\quad +\mu \int _By^Tg(y)\,dy\Sigma ^{1/2}+\mu \mu ^T\int _Bg(y)\,dy \end{aligned}$$
(14)

where \(B=\{y : (a^T\Sigma ^{1/2})y +(a^T\mu +b) > 0\}\) is the transformed half-space.

To evaluate (14), we only need a formula for the first integral; the previous proofs have expressions for the other three integrals. To evaluate the first integral, let \(C=\{y: c^Ty + d > 0\}\) be a new half-space and use the same rotation substitution \(y=Rz\) as in the last proof.

$$\begin{aligned} \int _C yy^Tg(y)\,dy&=R\left( \int _{z_1\Vert c\Vert +d>0}zz^Tg(z)\,dz\right) R^T\nonumber \\&=R\left( \Phi \left( \frac{d}{\Vert c\Vert }\right) I {-} \frac{d}{\Vert c\Vert }\phi \left( \frac{d}{\Vert c\Vert }\right) e_1e_1^T\right) R^T\nonumber \\&=\Phi \left( \frac{d}{\Vert c\Vert }\right) I-\frac{d}{\Vert c\Vert }\phi \left( \frac{d}{\Vert c\Vert }\right) \frac{cc^T}{\Vert c\Vert ^2} \end{aligned}$$
(15)

The previous integral can be evaluated by analyzing each scalar component. \(g\) is the standard multivariate Gaussian, so \(g(z)=\prod _{i=1}^k\phi (z_i)\). There are three types of terms; a numerical check of the corresponding one-dimensional integrals appears after the list:

  • \(z_1^2\): The limits of integration marginalize \(g\) and the resulting integral can be solved using integration by parts:

    $$\begin{aligned} \int _{z_1\Vert c\Vert +d>0} z_1^2 g(z)\,dz&=\int _{-d/\Vert c\Vert }^\infty z_1^2\phi (z_1)\,dz_1 \nonumber \\&=\Phi \left( \frac{d}{\Vert c\Vert }\right) -\frac{d}{\Vert c\Vert }\phi \left( \frac{d}{\Vert c\Vert }\right) \end{aligned}$$
  • \(z_i^2\; (i > 1)\):

    $$\begin{aligned}&\int _{z_1\Vert c\Vert +d>0} z_i^2 g(z)\,dz \nonumber \\ {}&=\int _{-d/\Vert c\Vert }^\infty \phi (z_1)\,dz_1 \int _{-\infty }^\infty z_i^2 \phi (z_i)\,dz_i = \Phi \left( \frac{d}{\Vert c\Vert }\right) \end{aligned}$$

    The integral over \(z_i\) is 1 because it’s the variance of the standard normal.

  • \(z_iz_j\; (i\ne j,i\ne 1)\): These indices cover all non-diagonal elements.

    $$\begin{aligned}&\int _{z_1\Vert c\Vert +d>0} z_i z_j g(z)\,dz \nonumber \\&\quad ={\int _{\alpha }^\beta z_j\phi (z_j)\,dz_j}{\int _{-\infty }^\infty z_i \phi (z_i)\,dz_i}=0 \end{aligned}$$

    Since \(i\ne 1\), the limits of integration for \(z_i\) are the whole real line. Because \(g\) is zero-mean, that integral is 0.
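As noted above, each of the three one-dimensional integrals can be checked numerically. Below is a minimal sketch using SciPy; the value chosen for \(d/\Vert c\Vert \) is an arbitrary illustration, not a quantity from the paper.

```python
# Numerical check of the three scalar integrals; t stands in for d/||c||.
import numpy as np
from scipy import integrate
from scipy.stats import norm

t = 0.8  # arbitrary illustrative value of d/||c||
phi, Phi = norm.pdf, norm.cdf

# z_1^2 term: integrate z^2 phi(z) over [-t, infinity)
val, _ = integrate.quad(lambda z: z**2 * phi(z), -t, np.inf)
print(val, Phi(t) - t * phi(t))  # the two numbers should agree

# z_i^2 (i > 1) term: the marginal over z_1 times the unit variance of z_i
val, _ = integrate.quad(phi, -t, np.inf)
print(val, Phi(t))

# z_i z_j (i != j) term: the z_i factor is a zero-mean integral over the real line
val, _ = integrate.quad(lambda z: z * phi(z), -np.inf, np.inf)
print(val)  # ~0, so the off-diagonal terms vanish
```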

We now have formulas for all of the terms in (14). Recall that \(B=\{y : (\Sigma ^{1/2} a)^Ty + (a^T\mu + b) > 0\}\). Using \(p=(a^T\mu +b)/\sqrt{a^T\Sigma a}\), (9), (10), and (15):

$$\begin{aligned}&\int _{a^Tx+b>0}xx^Tf(x)\,dx \nonumber \\&\quad =\Sigma ^{1/2}\left( \Phi \left( p\right) I-p\phi \left( p\right) \frac{\Sigma ^{1/2} aa^T\Sigma ^{1/2}}{a^T\Sigma a}\right) \Sigma ^{1/2}\nonumber \\&\qquad +\Sigma ^{1/2}\left( \phi \left( p\right) \frac{\Sigma ^{1/2}a}{\sqrt{a^T\Sigma a}}\right) \mu ^T\nonumber \\&\quad + \mu \left( \phi \left( p\right) \frac{\Sigma ^{1/2}a}{\sqrt{a^T\Sigma a}}\right) ^T\Sigma ^{1/2}+\Phi \left( p\right) \mu \mu ^T \nonumber \\&\quad = \Phi \left( p\right) \left( \Sigma +\mu \mu ^T\right) \nonumber \\&\qquad +\phi \left( p\right) \left( \frac{\Sigma a \mu ^T + \mu a^T\Sigma }{\sqrt{a^T\Sigma a}}-p \frac{\Sigma aa^T \Sigma }{a^T\Sigma a}\right) \end{aligned}$$

which is the formula we sought. \(\square \)
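All three identities are straightforward to sanity-check by Monte Carlo. The sketch below compares the closed forms of (9)–(11) against sample averages of the corresponding indicator-weighted integrals; the dimension, covariance, and half-space parameters are arbitrary illustrative choices, not values from the paper.

```python
# Monte Carlo check of the half-space identities (9)-(11).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
k = 3
mu = rng.normal(size=k)
L = rng.normal(size=(k, k))
Sigma = L @ L.T + k * np.eye(k)      # a random SPD covariance
a, b = rng.normal(size=k), 0.7       # half-space {x : a^T x + b > 0}

s = np.sqrt(a @ Sigma @ a)
p = (a @ mu + b) / s
Sa = Sigma @ a

# Closed forms from (9)-(11)
I0 = norm.cdf(p)
I1 = norm.cdf(p) * mu + norm.pdf(p) * Sa / s
I2 = norm.cdf(p) * (Sigma + np.outer(mu, mu)) + norm.pdf(p) * (
    (np.outer(Sa, mu) + np.outer(mu, Sa)) / s - p * np.outer(Sa, Sa) / s**2)

# Sample averages of the same integrals
x = rng.multivariate_normal(mu, Sigma, size=1_000_000)
inA = x @ a + b > 0
print(I0, inA.mean())                   # (9)
print(I1, x[inA].sum(axis=0) / len(x))  # (10)
print(I2, x[inA].T @ x[inA] / len(x))   # (11)
```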

1.2 Proof of Lemma 1

The norm can be evaluated by splitting the integral up into regions where the absolute value disappears. Let \(A\) be the set where \(f(x) > g(x)\) and \(A^c\) be the complement of \(A\), where \(f(x) \le g(x)\). Because \(f\) and \(g\) both integrate to 1, \(\int _{A^c} (g(x) - f(x))\,dx = \int _{A} (f(x) - g(x))\,dx\), so:

$$\begin{aligned}&\int | f(x) - g(x)|\,dx =2\int _{A} (f(x) - g(x))\,dx \end{aligned}$$
(16)

\(f\) and \(g\) have the same covariance, so \(f\) is bigger than \(g\) on a half-space \(A=\{x : a^Tx + b > 0\}\) where \(a=\Sigma ^{-1}(\mu _1-\mu _2)\) and \(b=(\mu _1+\mu _2)^T\Sigma ^{-1}(\mu _2-\mu _1)/2\). This means (16) can be evaluated using Lemma 6. To do so, we need the value of \(p\) for \(\int _A f(x)\,dx\) and \(\int _A g(x)\,dx\). There are three relevant terms:

$$\begin{aligned}&\mu _1^Ta+ b = \frac{\Vert \mu _1-\mu _2\Vert ^2_\Sigma }{2} \end{aligned}$$
(17)
$$\begin{aligned}&\mu _2^Ta+ b = -\frac{\Vert \mu _1-\mu _2\Vert ^2_\Sigma }{2} \end{aligned}$$
(18)
$$\begin{aligned}&\sqrt{a^T\Sigma a} = \Vert \mu _1-\mu _2\Vert _{\Sigma } \end{aligned}$$
(19)

Using \(\delta =\Vert \mu _1-\mu _2\Vert _\Sigma /2\) we get \((a^T\mu _1+b)/\sqrt{a^T\Sigma a}=\delta \) and \((a^T\mu _2+b)/\sqrt{a^T\Sigma a}=-\delta \). Plugging these values into Lemma 6 completes the proof.
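As a sanity check, the resulting closed form \(\Vert f-g\Vert _1 = 2(\Phi \left( \delta \right) -\Phi \left( -\delta \right) )\) can be compared against a direct Monte Carlo estimate of the \(L_1\) norm. A minimal sketch with arbitrary illustrative parameters, using the mixture \(h=(f+g)/2\) as the sampling distribution:

```python
# Monte Carlo check of Lemma 1's L1-norm formula on a random 2-D example.
import numpy as np
from scipy.stats import norm, multivariate_normal

rng = np.random.default_rng(1)
mu1, mu2 = rng.normal(size=2), rng.normal(size=2)
L = rng.normal(size=(2, 2))
Sigma = L @ L.T + 2 * np.eye(2)      # shared SPD covariance

d = mu1 - mu2
delta = 0.5 * np.sqrt(d @ np.linalg.solve(Sigma, d))   # ||mu1 - mu2||_Sigma / 2

f = multivariate_normal(mu1, Sigma)
g = multivariate_normal(mu2, Sigma)
n = 1_000_000
x = np.vstack([f.rvs(n // 2, random_state=2), g.rvs(n // 2, random_state=3)])
fx, gx = f.pdf(x), g.pdf(x)
l1_mc = np.mean(np.abs(fx - gx) / (0.5 * (fx + gx)))   # E_h[|f-g| / h]

print(l1_mc, 2 * (norm.cdf(delta) - norm.cdf(-delta)))  # should agree
```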

1.3 Proof of Lemma 2

Let \(X\) be a random variable whose density is \(|f(x)-g(x)|/\Vert f-g\Vert _1\). Calculating the entropy of \(X\) is difficult as the expression involves the log of the absolute value of the difference of exponentials. Fortunately, the covariance of \(X\) can be calculated. This is useful because the maximum entropy of any distribution with covariance \(\Sigma \) is \((1/2)\log {((2\pi e)^k |\Sigma |)}\), the entropy of the multivariate Gaussian (Cover and Thomas 2004, Thm. 8.6.5). By explicitly calculating the determinant of the covariance of \(X\), we prove the desired bound.

To calculate \(X\)’s covariance, use the formula \(\hbox {cov}{X}=\mathbb {E}_{}\left[ XX^T\right] -\mathbb {E}_{}\left[ X\right] \mathbb {E}_{}\left[ X\right] ^T\). Similar to the proof of Lemma 1, we evaluate the mean by breaking the integral up into a region \(A\) where \(f(x)>g(x)\) and its complement \(A^c\) where \(g(x)\ge f(x)\):

$$\begin{aligned} \nonumber \mathbb {E}_{}\left[ X\right] = \frac{1}{\Vert f-g\Vert _1}&\left( \int _A x(f(x)-g(x))\,dx \right. \\&\left. + \int _{A^c}x(g(x)-f(x))\,dx\right) \end{aligned}$$
(20)

As before, \(f\) and \(g\) have the same covariance, so \(A = \{x : a^Tx + b >0\}\) and \(A^c=\{x : (-a)^Tx + (-b) \ge 0\}\) are half-spaces with \(a=\Sigma ^{-1}(\mu _1-\mu _2)\) and \(b=(\mu _1+\mu _2)^T\Sigma ^{-1}(\mu _2-\mu _1)/2\).

Each of the terms in (20) are Gaussian functions integrated over a half-space, which can be evaluated using Lemma 6. To simplify the algebra, we evaluate the integrals involving \(f\) and \(g\) separately. This involves calculating a few terms, three of which are repeats: (17), (18), and (19). The other term is \(\Sigma a = \mu _1-\mu _2\). As the difference of the means will arise repeatedly, define \(\Delta =\mu _1-\mu _2\). Noting \(2\delta =\Vert \Delta \Vert _\Sigma \) and \(\phi (x)=\phi (-x)\), the integrals involving \(f\) are:

$$\begin{aligned}&\int _A xf(x)\,dx-\int _{A^c}xf(x)\,dx \nonumber \\&\quad = \Phi \left( \delta \right) \mu _1 + \frac{\phi \left( \delta \right) }{2\delta }\Delta -\left( \Phi \left( -\delta \right) \mu _1 + \frac{\phi \left( -\delta \right) }{2\delta }(-\Delta )\right) \nonumber \\&\quad = (\Phi \left( \delta \right) -\Phi \left( -\delta \right) )\mu _1 + \frac{\phi \left( \delta \right) }{\delta }\Delta \end{aligned}$$
(21)

Next, evaluate the integrals in (20) involving \(g\). This can be done by pattern matching from (21). The main change is that \(\mu _1\) becomes \(\mu _2\) and the sign of \(p\) in Lemma 6 flips, meaning the sign of the arguments to \(\phi \left( \cdot \right) \) and \(\Phi \left( \cdot \right) \) flip (see (17) and (18)).

$$\begin{aligned}&\int _{A}xg(x)\,dx -\int _{A^c} xg(x)\,dx\nonumber \\&\quad = (\Phi \left( -\delta \right) -\Phi \left( \delta \right) )\mu _2 + \frac{\phi \left( -\delta \right) }{\delta }\Delta \end{aligned}$$
(22)

Subtracting (22) from (21), recognizing \(\phi (x)=\phi (-x)\), and dividing by \(\Vert f-g\Vert _1=2(\Phi \left( \delta \right) -\Phi \left( -\delta \right) )\) simplifies (20):

$$\begin{aligned} \mathbb {E}_{}\left[ X\right] =\frac{\mu _1+\mu _2}{2} \end{aligned}$$
(23)
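This mean is also easy to verify numerically via self-normalized importance sampling with the proposal \(h=(f+g)/2\); a brief sketch with arbitrary illustrative parameters:

```python
# Monte Carlo check that the density |f-g| / ||f-g||_1 has mean (mu1+mu2)/2.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(5)
mu1, mu2 = rng.normal(size=2), rng.normal(size=2)
L = rng.normal(size=(2, 2))
Sigma = L @ L.T + 2 * np.eye(2)

f = multivariate_normal(mu1, Sigma)
g = multivariate_normal(mu2, Sigma)
n = 1_000_000
x = np.vstack([f.rvs(n // 2, random_state=6), g.rvs(n // 2, random_state=7)])
fx, gx = f.pdf(x), g.pdf(x)

w = np.abs(fx - gx) / (0.5 * (fx + gx))      # importance weights under h
mean_X = (w[:, None] * x).sum(axis=0) / w.sum()
print(mean_X, (mu1 + mu2) / 2)               # should agree
```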

We now evaluate \(X\)’s second moment. Split the integral up over \(A\) and \(A^c\):

$$\begin{aligned}&\mathbb {E}_{}\left[ XX^T\right] = \frac{1}{\Vert f-g\Vert _1}\nonumber \\&\quad \left( \int _A xx^T(f(x)-g(x))\,dx + \int _{A^c}xx^T(g(x)-f(x))\,dx\right) \end{aligned}$$
(24)

Once again, we evaluate this expression by separately evaluating the integrals involving \(f\) and \(g\).

Starting with \(f\) and using Lemma 6 with \(\Delta \) and \(\delta \):

$$\begin{aligned} \int _Axx^Tf(x)\,dx&=\Phi \left( \delta \right) (\Sigma +\mu _1\mu _1^T)\nonumber \\&\quad +\phi \left( \delta \right) \left( \frac{\Delta \mu _1^T+\mu _1\Delta ^T}{2\delta }- \delta \frac{\Delta \Delta ^T}{4\delta ^2}\right) \end{aligned}$$
(25)
$$\begin{aligned} \int _{A^c}xx^Tf(x)\,dx&=\Phi \left( -\delta \right) (\Sigma +\mu _1\mu _1^T)\nonumber \\&\quad -\phi \left( \delta \right) \left( \frac{\Delta \mu _1^T+\mu _1\Delta ^T}{2\delta }- \delta \frac{\Delta \Delta ^T}{4\delta ^2}\right) \end{aligned}$$
(26)

Taking the difference of (25) and (26):

$$\begin{aligned}&\int _Axx^Tf(x)\,dx - \int _{A^c}xx^Tf(x)\,dx \nonumber \\&\quad =(\Phi \left( \delta \right) -\Phi \left( -\delta \right) )(\Sigma +\mu _1\mu _1^T)\nonumber \\&\qquad +\frac{\phi \left( \delta \right) }{\delta }\left( \Delta \mu _1^T+\mu _1 \Delta ^T-\frac{\Delta \Delta ^T}{2}\right) \end{aligned}$$
(27)

To evaluate the integrals in (24) involving \(g\), we can pattern match using (25) and (26). As in the derivation of (22), this involves negating the \(p\) terms and replacing \(\mu _1\) with \(\mu _2\).

$$\begin{aligned}&\int _Axx^Tg(x)\,dx - \int _{A^c}xx^Tg(x)\,dx \nonumber \\&\quad =(\Phi \left( -\delta \right) -\Phi \left( \delta \right) )(\Sigma +\mu _2\mu _2^T)\nonumber \\&\qquad +\frac{\phi \left( \delta \right) }{\delta }\left( \Delta \mu _2^T+\mu _2 \Delta ^T+\frac{\Delta \Delta ^T}{2}\right) \end{aligned}$$
(28)

Note the sign difference of \(\Delta \Delta ^T\) compared to (27).

To finish calculating the second moment, subtract (28) from (27) and divide by \(\Vert f-g\Vert _1\), simplifying (24):

$$\begin{aligned} \mathbb {E}_{}\left[ XX^T\right] =\frac{1}{2}(2\Sigma +\mu _1\mu _1^T +\mu _2\mu _2^T)+\frac{\phi \left( \delta \right) }{\delta \Vert f-g\Vert _1}\Delta \Delta ^T \end{aligned}$$

We can now express the covariance of \(X\).

$$\begin{aligned} \hbox {cov}{X}&= \Sigma + \frac{\Delta \Delta ^T}{4} + \frac{\phi (\delta )}{\delta \Vert f-g\Vert _1}\Delta \Delta ^T \nonumber \\&= \Sigma + \left( \delta ^2 + \frac{2\phi (\delta )\delta }{2\Phi (\delta )-1}\right) \frac{\Delta \Delta ^T}{4\delta ^2} \end{aligned}$$
(29)

The first line follows as \((\mu _1\mu _1^T+\mu _2\mu _2^T)/2-(\mu _1+\mu _2)(\mu _1+\mu _2)^T/4=\Delta \Delta ^T/4\), and the second uses \(\Vert f-g\Vert _1=2(2\Phi \left( \delta \right) -1)\) from Lemma 1.

To calculate the determinant of the covariance, factor \(\Sigma \) out of (29) and define \(\alpha =\delta ^2 + \frac{2\phi \left( \delta \right) \delta }{2\Phi \left( \delta \right) -1}\):

$$\begin{aligned} |\hbox {cov}{X}|&=\left| \Sigma \left( I+\alpha \frac{\Sigma ^{-1}\Delta \Delta ^T}{4\delta ^2}\right) \right| =|\Sigma |\left( 1+\alpha \right) \end{aligned}$$
(30)

The last step follows from the eigenvalues. \(\Sigma ^{-1}\Delta \Delta ^T\) only has one non-zero eigenvalue; it is a rank one matrix as \(\Sigma ^{-1}\) is full rank and \(\Delta \Delta ^T\) is rank one. The trace of a matrix is the sum of its eigenvalues, so \(tr(\Sigma ^{-1}\Delta \Delta ^T) =tr(\Delta ^T\Sigma ^{-1}\Delta )=\Vert \Delta \Vert _\Sigma ^2=4\delta ^2\) is the non-zero eigenvalue. Consequently, the only non-zero eigenvalue of \(\alpha \Sigma ^{-1}\Delta \Delta ^T/4\delta ^2\) is \(\alpha \). Adding \(I\) to a matrix increases all its eigenvalues by 1 so the only eigenvalue of \(I+\alpha \Sigma ^{-1}\Delta \Delta ^T/4\delta ^2\) that is not 1 has a value of \(1 + \alpha \). The determinant of a matrix is the product of its eigenvalues, so \(|I+\alpha \Sigma ^{-1}\Delta \Delta ^T/4\delta ^2|=1+\alpha \). As discussed at the beginning of the proof, substituting (30) into the expression for the entropy of a multivariate Gaussian achieves the desired upper bound.
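The determinant identity underlying (30), \(|I+\alpha \Sigma ^{-1}\Delta \Delta ^T/4\delta ^2|=1+\alpha \), holds for any SPD \(\Sigma \), any \(\Delta \ne 0\), and any scalar \(\alpha \); a quick numerical confirmation with arbitrary illustrative values:

```python
# Numerical check of the rank-one determinant identity used in (30).
import numpy as np

rng = np.random.default_rng(4)
k = 4
L = rng.normal(size=(k, k))
Sigma = L @ L.T + k * np.eye(k)      # random SPD matrix
Delta = rng.normal(size=k)
alpha = 0.37                         # any scalar works

four_delta_sq = Delta @ np.linalg.solve(Sigma, Delta)   # ||Delta||_Sigma^2 = 4 delta^2
M = np.eye(k) + alpha * np.linalg.solve(Sigma, np.outer(Delta, Delta)) / four_delta_sq
print(np.linalg.det(M), 1 + alpha)   # should agree
```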

1.4 Proof of Theorem 2

To prove the theorem, we build on Theorem 1. Unfortunately, it cannot be directly applied because it is difficult to evaluate 1) the \(L_1\) norm between GMMs and 2) the entropy of the normalized difference of mixture models. However, Lemmas 1 and 2 provide a way to evaluate these quantities when the densities are individual Gaussians. To exploit this fact, we split the problem up and analyze the difference in entropies between GMMs that only differ by a single component.

To start, define \(d_j(x)=\sum _{i=1}^j w_i g_i(x) + \sum _{i=j+1}^n w_i f_i(x)\), so that \(d_j\) is a GMM whose first \(j\) components are the first \(j\) components of \(g\) and whose last \(n-j\) components are the last \(n-j\) components of \(f\). Note that \(d_0(x) = f(x)\) and \(d_n(x) = g(x)\). Using the triangle inequality:

$$\begin{aligned} |\mathrm{H }\left[ {f}\right] -\mathrm{H }\left[ {g}\right] |&= |\mathrm{H }\left[ {d_0}\right] -\mathrm{H }\left[ {d_1}\right] + \mathrm{H }\left[ {d_1}\right] -\mathrm{H }\left[ {d_n}\right] |\nonumber \\&\le |\mathrm{H }\left[ {d_0}\right] -\mathrm{H }\left[ {d_1}\right] | + |\mathrm{H }\left[ {d_1}\right] -\mathrm{H }\left[ {d_n}\right] |\nonumber \\&\le \sum _{j=1}^n|\mathrm{H }\left[ {d_{j-1}}\right] -\mathrm{H }\left[ {d_j}\right] | \end{aligned}$$

where the last step applied the same trick \(n-2\) more times. Because \(d_{j-1} (x) - d_{j} (x) = w_j (f_j (x) - g_j (x))\) each term in the summand can be bounded using Theorem 1:

$$\begin{aligned}&\Big |\mathrm{H }\left[ {d_{j-1}}\right] -\mathrm{H }\left[ {d_{j}}\right] \Big | \nonumber \\&\quad \le -\Vert w_j(f_j-g_j)\Vert _1 \log \Vert w_j(f_j-g_j)\Vert _1 \nonumber \\&\quad \quad + \Vert w_j(f_j-g_j)\Vert _1 \mathrm{H }\left[ {\frac{|f_j(x)-g_j(x)|}{\Vert f_j-g_j\Vert _1}}\right] \end{aligned}$$
(31)

Because \(f_j(x)\) and \(g_j(x)\) are Gaussians with the same covariance, we can apply Lemmas 1 and 2 to complete the proof.
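To make the assembled bound concrete, the sketch below sums the (31)-style terms over components, using Lemma 1 for each \(\Vert w_j(f_j-g_j)\Vert _1\) and Lemma 2's Gaussian max-entropy bound for the entropy term. The function name `theorem2_bound` and all parameter values are illustrative, not from the paper.

```python
# Sketch of the summed Theorem 2 bound for two GMMs whose components share
# covariances and differ only in their means.
import numpy as np
from scipy.stats import norm

def theorem2_bound(weights, mus_f, mus_g, Sigmas):
    total = 0.0
    for w, m1, m2, S in zip(weights, mus_f, mus_g, Sigmas):
        k = len(m1)
        diff = m1 - m2
        delta = 0.5 * np.sqrt(diff @ np.linalg.solve(S, diff))
        if delta == 0.0:
            continue                                   # identical components add 0
        l1 = w * 2 * (2 * norm.cdf(delta) - 1)         # Lemma 1: ||w_j(f_j - g_j)||_1
        alpha = delta**2 + 2 * norm.pdf(delta) * delta / (2 * norm.cdf(delta) - 1)
        h_max = 0.5 * np.log((2 * np.pi * np.e)**k     # Lemma 2: max-entropy bound
                             * np.linalg.det(S) * (1 + alpha))
        total += -l1 * np.log(l1) + l1 * h_max         # term from (31)
    return total

# Illustrative two-component, 2-D mixtures
weights = [0.5, 0.5]
mus_f = [np.array([0.0, 0.0]), np.array([3.0, 1.0])]
mus_g = [np.array([0.2, -0.1]), np.array([3.1, 1.2])]
Sigmas = [np.eye(2), np.eye(2)]
print(theorem2_bound(weights, mus_f, mus_g, Sigmas))
```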

Cite this article

Charrow, B., Kumar, V. & Michael, N. Approximate representations for multi-robot control policies that maximize mutual information. Auton Robot 37, 383–400 (2014). https://doi.org/10.1007/s10514-014-9411-2
