Approximate representations for multi-robot control policies that maximize mutual information

Abstract

We address the problem of controlling a small team of robots to estimate the location of a mobile target using non-linear range-only sensors. Our control law maximizes the mutual information between the team’s estimate and future measurements over a finite time horizon. Because the computations associated with such policies scale poorly with the number of robots, the time horizon associated with the policy, and typical non-parametric representations of the belief, we design approximate representations that enable real-time operation. The main contributions of this paper include the control policy, an algorithm for approximating the belief state, and an extensive study of the performance of these algorithms using simulations and real-world experiments in complex, indoor environments.


References

  • Binney, J., Krause, A., & Sukhatme, G. S. (2013). Optimizing waypoints for monitoring spatiotemporal phenomena. International Journal of Robotics Research, 32(8), 873–888.

  • Charrow, B., Kumar, V., & Michael, N. (2013). Approximate representations for multi-robot control policies that maximize mutual information. In Proceedings of Robotics: Science and Systems, Berlin, Germany.

  • Charrow, B., Michael, N., & Kumar, V. (2014). Cooperative multi-robot estimation and control for radio source localization. International Journal of Robotics Research, 33, 569–580.

  • Chung, T., Hollinger, G., & Isler, V. (2011). Search and pursuit-evasion in mobile robotics. Autonomous Robots, 31(4), 299–316.

  • Cover, T. M., & Thomas, J. A. (2004). Elements of information theory. New York: Wiley.

  • Dame, A., & Marchand, E. (2011). Mutual information-based visual servoing. IEEE Transactions on Robotics, 27(5), 958–969.

  • Djugash, J., & Singh, S. (2012). Motion-aided network slam with range. International Journal of Robotics Research, 31(5), 604–625.

  • Fannes, M. (1973). A continuity property of the entropy density for spin lattice systems. Communications in Mathematical Physics, 31(4), 291–294.

  • Fox, D. (2003). Adapting the sample size in particle filters through KLD-sampling. International Journal of Robotics Research, 22(12), 985–1003.

  • Golovin, D., & Krause, A. (2011). Adaptive submodularity: Theory and applications in active learning and stochastic optimization. The Journal of Artificial Intelligence Research, 42(1), 427–486.

  • Grocholsky, B. (2002). Information-theoretic control of multiple sensor platforms. PhD thesis, Australian Centre for Field Robotics.

  • Hahn, T. (2013, January). Cuba. http://www.feynarts.de/cuba/.

  • Hoffmann, G., & Tomlin, C. (2010). Mobile sensor network control using mutual information methods and particle filters. IEEE Transactions on Automatic Control, 55(1), 32–47.

  • Hollinger, G., & Sukhatme, G. (2013). Sampling-based motion planning for robotic information gathering. In Proceedings of Robotics: Science and Systems, Berlin, Germany.

  • Hollinger, G., Djugash, J., & Singh, S. (2011). Target tracking without line of sight using range from radio. Autonomous Robots, 32(1), 1–14.

  • Huber, M., & Hanebeck, U. (2008). Progressive Gaussian mixture reduction. In International Conference on Information Fusion.

  • Huber, M., Bailey, T., Durrant-Whyte, H., & Hanebeck, U. (2008, August). On entropy approximation for Gaussian mixture random vectors. In Conference on Multisensor Fusion and Integration for Intelligent Systems, Seoul, Korea (pp. 181–188).

  • Julian, B. J., Angermann, M., Schwager, M., & Rus, D. (2011, September). A scalable information theoretic approach to distributed robot coordination. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, USA (pp. 5187–5194).

  • Kassir, A., Fitch, R., & Sukkarieh, S. (2012, May). Decentralised information gathering with communication costs. In Proceedings of the IEEE International Conference on Robotics and Automation, Saint Paul, USA (pp. 2427–2432).

  • Krause, A., & Guestrin, C. (2005). Near-optimal nonmyopic value of information in graphical models. In Conference on Uncertainty in Artificial Intelligence (pp. 324–331).

  • nanoPAN 5375 Development Kit. (2013, September). http://nanotron.com/EN/pdf/Factsheet_nanoPAN_5375_Dev_Kit.pdf.

  • Owen, D. (1980). A table of normal integrals. Communications in Statistics-Simulation and Computation, 9(4), 389–419.

  • Park, J. G., Charrow, B., Battat, J., Curtis, D., Minkov, E., Hicks, J., Teller, S., & Ledlie, J. (2010). Growing an organic indoor location system. In Proceedings of the International Conference on Mobile Systems, Applications, and Services, San Francisco, CA.

  • ROS. (2013, January). http://www.ros.org/wiki/.

  • Runnalls, A. (2007). Kullback–Leibler approach to Gaussian mixture reduction. IEEE Transactions on Aerospace and Electronic Systems, 43(3), 989–999.

  • Ryan, A., & Hedrick, J. (2010). Particle filter based information-theoretic active sensing. Robotics and Autonomous Systems, 58(5), 574–584.

  • Silva, J. F., & Parada, P. (2011). Sufficient conditions for the convergence of the Shannon differential entropy. In IEEE Information Theory Workshop, Paraty, Brazil (pp. 608–612).

  • Singh, A., Krause, A., Guestrin, C., & Kaiser, W. J. (2009). Efficient informative sensing using multiple robots. The Journal of Artificial Intelligence Research, 34(1), 707–755.

  • Stump, E., Kumar, V., Grocholsky, B., & Shiroma, P. (2009). Control for localization of targets using range-only sensors. International Journal of Robotics Research, 28(6), 743.

  • Thrun, S., Burgard, W., & Fox, D. (2008). Probabilistic robotics. Cambridge: MIT Press.

  • Vidal, R., Shakernia, O., Jin Kim, H., Shim, D., & Sastry, S. (2002). Probabilistic pursuit-evasion games: Theory, implementation, and experimental evaluation. IEEE Transactions on Robotics and Automation, 18(5), 662–669.

  • Whaite, P., & Ferrie, F. P. (1997). Autonomous exploration: Driven by uncertainty. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(3), 193–205.

Acknowledgments

We gratefully acknowledge the support of ONR Grant N00014-07-1-0829, ARL Grant W911NF-08-2-0004, and AFOSR Grant FA9550-10-1-0567. Benjamin Charrow was supported by an NDSEG fellowship from the Department of Defense.

Author information

Correspondence to Benjamin Charrow.

Appendix: Proofs

1.1 Integrating Gaussians over a half-space

Proving Lemmas 1 and 2 requires integrating Gaussian functions over a half-space. The following identities will be used several times. Owen (1980) gives these identities for 1-dimensional Gaussians, but we have been unable to find a reference for the multivariate case. For completeness, we prove them here.

Lemma 6

If \(f(x)=\mathcal {N}(x; \mu , \Sigma )\) is a \(k\)-dimensional Gaussian and \(A=\{x : a^Tx + b > 0\}\) is a half-space, then

$$\begin{aligned}&\int _{A} f(x)\,dx = \Phi \left( p\right) \end{aligned}$$
(9)
$$\begin{aligned}&\int _{A} xf(x)\,dx = \Phi \left( p\right) \mu + \phi \left( p\right) \frac{\Sigma a}{\sqrt{a^T\Sigma a}}\end{aligned}$$
(10)
$$\begin{aligned}&\int _{A} xx^Tf(x)\,dx = \Phi \left( p\right) (\Sigma + \mu \mu ^T) \nonumber \\&\quad +\phi \left( p\right) \left( \frac{\Sigma a\mu ^T + \mu a^T\Sigma }{\sqrt{a^T\Sigma a}}-p\frac{\Sigma aa^T \Sigma }{a^T\Sigma a}\right) \end{aligned}$$
(11)

where \(p=(a^T\mu +b)/\sqrt{a^T\Sigma a}\) is a scalar, \(\phi \left( x\right) =\mathcal {N}(x;0,1)\) is the PDF of the standard 1-dimensional Gaussian and \(\Phi \left( x\right) \) is its CDF.

Proof of (9)

All of these integrals are evaluated by making the substitution \(x=Ry\), where \(R\) is a rotation matrix that makes the half-space \(A\) axis-aligned. Specifically, define \(R\) such that \(a^TR=\Vert a\Vert e_1^T\), where \(e_1\) is the \(k\)-dimensional vector whose first entry is 1 and all others are 0. This substitution is advantageous because it makes the limits of integration \((-\infty ,\infty )\) for every component of \(y\) except \(y_1\).

Because \(R^Tx=y\), \(y\) is a \(k\)-dimensional Gaussian with density \(q(y)=\mathcal {N}(y; R^T\mu , R^T\Sigma R)\). The determinant of the Jacobian of \(y\) is the determinant of a rotation matrix: \(|\partial y / \partial x|=|R^T|=1\). Substituting:

$$\begin{aligned} \int _{a^Tx+b>0}f(x)\,dx&= \int _{(a^TR)y+b>0} |R^T|q(y)\,dy\nonumber \\&=\int _{\Vert a\Vert y_1+b>0} q(y)\,dy \nonumber \\&=\int _{-b/\Vert a\Vert }^\infty q_1(y_1)\,dy_1 \end{aligned}$$

where \(q_1(y_1)=\mathcal {N}(y_1; e_1^T (R^T\mu ), e_1^T(R^T\Sigma R)e_1)\) is the density for the first component of \(y\). The final step follows as the limits of integration marginalize \(q(y)\). To simplify the remaining integral, apply the definition of \(R\), \(q_1(y_1) =\mathcal {N}(y_1;\mu ^Ta/\Vert a\Vert , a^T\Sigma a/ \Vert a\Vert ^2)\), and use \(1-\Phi \left( -x\right) =\Phi \left( x\right) \):

$$\begin{aligned} \int _{-b/\Vert a\Vert }^\infty q_1(y_1)\,dy_1 = 1-\Phi \left( \frac{-b-\mu ^Ta}{\sqrt{a^T\Sigma a}}\right) = \Phi (p) \end{aligned}$$

\(\square \)

Proof of (10)

First, we perform a change of variables so that the integral is evaluated over the standard multivariate Gaussian \(g(x)=\mathcal {N}(x;0, I)\). The substitution is \(x=\Sigma ^{1/2}y+\mu \), which is valid because (1) \(|\partial y / \partial x|=|\Sigma |^{-1/2}\) and (2) \({f(x)=|\Sigma |^{-1/2}g(\Sigma ^{-1/2}(x-\mu ))}\):

$$\begin{aligned} \int _{A} x f(x)\,dx&= \int _{B} (\Sigma ^{1/2}y+\mu )g(y)\,dy \nonumber \\&=\Sigma ^{1/2}\int _{B}y g(y)\,dy + \mu \int _B g(y)\,dy \end{aligned}$$
(12)

where \(B=\{y : (a^T\Sigma ^{1/2})y +(a^T\mu +b) > 0\}\) is the transformed half-space.

To evaluate the first term in (12), we calculate \(\int _C yg(y)\,dy\), where \(C=\{y: c^Ty + d > 0\}\) is a new generic half-space. This integral is easier than the original problem as \(g(y)\) is the density of a zero-mean Gaussian with identity covariance. To proceed, substitute \(y=Rz\) where \(R\) is a rotation matrix that makes \(C\) axis-aligned (i.e., \(c^TR=\Vert c\Vert e_1^T\)) and observe that \(g(Rz)=g(z)\):

$$\begin{aligned} \int _{C}y g(y)\,dy&= \int _{c^TRz + d > 0}Rz g(Rz)|R^T|\,dz \nonumber \\&=Re_1\int _{-d/\Vert c\Vert }^\infty z_1 \phi (z_1)\,dz_1 \nonumber \\&=Re_1\phi \left( \frac{-d}{\Vert c\Vert }\right) = \frac{c}{\Vert c\Vert }\phi \left( \frac{d}{\Vert c\Vert }\right) \end{aligned}$$
(13)

The factor \(e_1\) appears because \(g\) is zero-mean; the only non-zero component of the integral comes from \(z_1\). The 1-dimensional integral follows as \(d\phi (x)/dx=-x\phi (x)\).
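Written out, using that \(-\phi (z)\) is an antiderivative of \(z\,\phi (z)\) and that \(\phi \) is even:

$$\begin{aligned} \int _{-d/\Vert c\Vert }^\infty z_1 \phi (z_1)\,dz_1 = \Big [-\phi (z_1)\Big ]_{-d/\Vert c\Vert }^{\infty } = \phi \left( \frac{-d}{\Vert c\Vert }\right) = \phi \left( \frac{d}{\Vert c\Vert }\right) \end{aligned}$$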

To finish, (12) can be evaluated using the formula in (9) and (13):

$$\begin{aligned} \int _{A} x f(x)\,dx&= \Sigma ^{1/2}\left( \frac{\Sigma ^{1/2}a}{\sqrt{a^T\Sigma a}}\right) \phi (p)+\mu \Phi (p) \end{aligned}$$

\(\square \)

Proof of (11)

Similar to the last proof, make the substitution \({x=\Sigma ^{1/2}y+\mu }\) with \(g(y)\) as the standard multivariate Gaussian and expand terms.

$$\begin{aligned}&\int _{A} xx^Tf(x)\,dx =\Sigma ^{1/2}\int _B yy^Tg(y)\,dy \Sigma ^{1/2}\nonumber \\&\quad + \Sigma ^{1/2}\int _Byg(y)\,dy\mu ^T \nonumber \\&\quad +\mu \int _By^Tg(y)\,dy\Sigma ^{1/2}+\mu \mu ^T\int _Bg(y)\,dy \end{aligned}$$
(14)

where \(B=\{y : (a^T\Sigma ^{1/2})y +(a^T\mu +b) > 0\}\) is the transformed half-space.

To evaluate (14), we only need a formula for the first integral; the previous proofs have expressions for the other three integrals. To evaluate the first integral, let \(C=\{y: c^Ty + d > 0\}\) be a new half-space and use the same rotation substitution \(y=Rz\) as in the last proof.

$$\begin{aligned} \int _C yy^Tg(y)\,dy&=R\left( \int _{z_1\Vert c\Vert +d>0}zz^Tg(z)\,dz\right) R^T\nonumber \\&=R\left( \Phi \left( \frac{d}{\Vert c\Vert }\right) I {-} \frac{d}{\Vert c\Vert }\phi \left( \frac{d}{\Vert c\Vert }\right) e_1e_1^T\right) R^T\nonumber \\&=\Phi \left( \frac{d}{\Vert c\Vert }\right) I-\frac{d}{\Vert c\Vert }\phi \left( \frac{d}{\Vert c\Vert }\right) \frac{cc^T}{\Vert c\Vert ^2} \end{aligned}$$
(15)

The previous integral can be evaluated by analyzing each scalar component. \(g\) is the standard multivariate Gaussian, so \(g(z)=\prod _{i=1}^k\phi (z_i)\). There are three types of terms; a numerical check of the corresponding one-dimensional integrals appears after the list:

  • \(z_1^2\): The limits of integration marginalize \(g\) and the resulting integral can be solved using integration by parts:

    $$\begin{aligned} \int _{z_1\Vert c\Vert +d>0} z_1^2 g(z)\,dz&=\int _{-d/\Vert c\Vert }^\infty z_1^2\phi (z_1)\,dz_1 \nonumber \\&=\Phi \left( \frac{d}{\Vert c\Vert }\right) -\frac{d}{\Vert c\Vert }\phi \left( \frac{d}{\Vert c\Vert }\right) \end{aligned}$$
  • \(z_i^2\; (i > 1)\):

    $$\begin{aligned}&\int _{z_1\Vert c\Vert +d>0} z_i^2 g(z)\,dz \nonumber \\ {}&=\int _{-d/\Vert c\Vert }^\infty \phi (z_1)\,dz_1 \int _{-\infty }^\infty z_i^2 \phi (z_i)\,dz_i = \Phi \left( \frac{d}{\Vert c\Vert }\right) \end{aligned}$$

    The integral over \(z_i\) is 1 because it’s the variance of the standard normal.

  • \(z_iz_j\; (i\ne j,i\ne 1)\): These indices cover all non-diagonal elements.

    $$\begin{aligned}&\int _{z_1\Vert c\Vert +d>0} z_i z_j g(z)\,dz \nonumber \\&\quad ={\int _{\alpha }^\beta z_j\phi (z_j)\,dz_j}{\int _{-\infty }^\infty z_i \phi (z_i)\,dz_i}=0 \end{aligned}$$

    Since \(i\ne 1\), the limits of integration for \(z_i\) are the whole real line. Because \(g\) is zero-mean, that integral is 0.
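As noted above, each of the three one-dimensional integrals can be checked numerically. Below is a minimal sketch using SciPy; the value chosen for \(d/\Vert c\Vert \) is an arbitrary illustration, not a quantity from the paper.

```python
# Numerical check of the three scalar integrals; t stands in for d/||c||.
import numpy as np
from scipy import integrate
from scipy.stats import norm

t = 0.8  # arbitrary illustrative value of d/||c||
phi, Phi = norm.pdf, norm.cdf

# z_1^2 term: integrate z^2 phi(z) over [-t, infinity)
val, _ = integrate.quad(lambda z: z**2 * phi(z), -t, np.inf)
print(val, Phi(t) - t * phi(t))  # the two numbers should agree

# z_i^2 (i > 1) term: the marginal over z_1 times the unit variance of z_i
val, _ = integrate.quad(phi, -t, np.inf)
print(val, Phi(t))

# z_i z_j (i != j) term: the z_i factor is a zero-mean integral over the real line
val, _ = integrate.quad(lambda z: z * phi(z), -np.inf, np.inf)
print(val)  # ~0, so the off-diagonal terms vanish
```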

We now have formulas for all of the terms in (14). Recall that \(B=\{y : (\Sigma ^{1/2} a)^Ty + (a^T\mu + b) > 0\}\). Using \(p=(a^T\mu +b)/\sqrt{a^T\Sigma a}\), (9), (10), and (15):

$$\begin{aligned}&\int _{a^Tx+b>0}xx^Tf(x)\,dx \nonumber \\&\quad =\Sigma ^{1/2}\left( \Phi \left( p\right) I-p\phi \left( p\right) \frac{\Sigma ^{1/2} aa^T\Sigma ^{1/2}}{a^T\Sigma a}\right) \Sigma ^{1/2}\nonumber \\&\qquad +\Sigma ^{1/2}\left( \phi \left( p\right) \frac{\Sigma ^{1/2}a}{\sqrt{a^T\Sigma a}}\right) \mu ^T\nonumber \\&\quad + \mu \left( \phi \left( p\right) \frac{\Sigma ^{1/2}a}{\sqrt{a^T\Sigma a}}\right) ^T\Sigma ^{1/2}+\Phi \left( p\right) \mu \mu ^T \nonumber \\&\quad = \Phi \left( p\right) \left( \Sigma +\mu \mu ^T\right) \nonumber \\&\qquad +\phi \left( p\right) \left( \frac{\Sigma a \mu ^T + \mu a^T\Sigma }{\sqrt{a^T\Sigma a}}-p \frac{\Sigma aa^T \Sigma }{a^T\Sigma a}\right) \end{aligned}$$

which is the formula we sought. \(\square \)
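All three identities are straightforward to sanity-check by Monte Carlo. The sketch below compares the closed forms of (9)–(11) against sample averages of the corresponding indicator-weighted integrals; the dimension, covariance, and half-space parameters are arbitrary illustrative choices, not values from the paper.

```python
# Monte Carlo check of the half-space identities (9)-(11).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
k = 3
mu = rng.normal(size=k)
L = rng.normal(size=(k, k))
Sigma = L @ L.T + k * np.eye(k)      # a random SPD covariance
a, b = rng.normal(size=k), 0.7       # half-space {x : a^T x + b > 0}

s = np.sqrt(a @ Sigma @ a)
p = (a @ mu + b) / s
Sa = Sigma @ a

# Closed forms from (9)-(11)
I0 = norm.cdf(p)
I1 = norm.cdf(p) * mu + norm.pdf(p) * Sa / s
I2 = norm.cdf(p) * (Sigma + np.outer(mu, mu)) + norm.pdf(p) * (
    (np.outer(Sa, mu) + np.outer(mu, Sa)) / s - p * np.outer(Sa, Sa) / s**2)

# Sample averages of the same integrals
x = rng.multivariate_normal(mu, Sigma, size=1_000_000)
inA = x @ a + b > 0
print(I0, inA.mean())                   # (9)
print(I1, x[inA].sum(axis=0) / len(x))  # (10)
print(I2, x[inA].T @ x[inA] / len(x))   # (11)
```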

1.2 Proof of Lemma 1

The norm can be evaluated by splitting the integral up into regions where the absolute value disappears. Let \(A\) be the set where \(f(x) > g(x)\) and \(A^c\) be the complement of \(A\), where \(f(x) \le g(x)\). Because \(f\) and \(g\) both integrate to 1, \(\int _{A^c} (g(x) - f(x))\,dx = \int _{A} (f(x) - g(x))\,dx\), so:

$$\begin{aligned}&\int | f(x) - g(x)|\,dx =2\int _{A} (f(x) - g(x))\,dx \end{aligned}$$
(16)

\(f\) and \(g\) have the same covariance, so \(f\) is bigger than \(g\) on a half-space \(A=\{x : a^Tx + b > 0\}\) where \(a=\Sigma ^{-1}(\mu _1-\mu _2)\) and \(b=(\mu _1+\mu _2)^T\Sigma ^{-1}(\mu _2-\mu _1)/2\). This means (16) can be evaluated using Lemma 6. To do so, we need the value of \(p\) for \(\int _A f(x)\,dx\) and \(\int _A g(x)\,dx\). There are three relevant terms:

$$\begin{aligned}&\mu _1^Ta+ b = \frac{\Vert \mu _1-\mu _2\Vert ^2_\Sigma }{2} \end{aligned}$$
(17)
$$\begin{aligned}&\mu _2^Ta+ b = -\frac{\Vert \mu _1-\mu _2\Vert ^2_\Sigma }{2} \end{aligned}$$
(18)
$$\begin{aligned}&\sqrt{a^T\Sigma a} = \Vert \mu _1-\mu _2\Vert _{\Sigma } \end{aligned}$$
(19)

Using \(\delta =\Vert \mu _1-\mu _2\Vert _\Sigma /2\) we get \((a^T\mu _1+b)/\sqrt{a^T\Sigma a}=\delta \) and \((a^T\mu _2+b)/\sqrt{a^T\Sigma a}=-\delta \). Plugging these values into Lemma 6 completes the proof.
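As a sanity check, the resulting closed form \(\Vert f-g\Vert _1 = 2(\Phi \left( \delta \right) -\Phi \left( -\delta \right) )\) can be compared against a direct Monte Carlo estimate of the \(L_1\) norm. A minimal sketch with arbitrary illustrative parameters, using the mixture \(h=(f+g)/2\) as the sampling distribution:

```python
# Monte Carlo check of Lemma 1's L1-norm formula on a random 2-D example.
import numpy as np
from scipy.stats import norm, multivariate_normal

rng = np.random.default_rng(1)
mu1, mu2 = rng.normal(size=2), rng.normal(size=2)
L = rng.normal(size=(2, 2))
Sigma = L @ L.T + 2 * np.eye(2)      # shared SPD covariance

d = mu1 - mu2
delta = 0.5 * np.sqrt(d @ np.linalg.solve(Sigma, d))   # ||mu1 - mu2||_Sigma / 2

f = multivariate_normal(mu1, Sigma)
g = multivariate_normal(mu2, Sigma)
n = 1_000_000
x = np.vstack([f.rvs(n // 2, random_state=2), g.rvs(n // 2, random_state=3)])
fx, gx = f.pdf(x), g.pdf(x)
l1_mc = np.mean(np.abs(fx - gx) / (0.5 * (fx + gx)))   # E_h[|f-g| / h]

print(l1_mc, 2 * (norm.cdf(delta) - norm.cdf(-delta)))  # should agree
```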

1.3 Proof of Lemma 2

Let \(X\) be a random variable whose density is \(|f(x)-g(x)|/\Vert f-g\Vert _1\). Calculating the entropy of \(X\) is difficult as the expression involves the log of the absolute value of the difference of exponentials. Fortunately, the covariance of \(X\) can be calculated. This is useful because the maximum entropy of any distribution with covariance \(\Sigma \) is \((1/2)\log {((2\pi e)^k |\Sigma |)}\), the entropy of the multivariate Gaussian (Cover and Thomas 2004, Thm. 8.6.5). By explicitly calculating the determinant of the covariance of \(X\), we prove the desired bound.

To calculate \(X\)’s covariance, use the formula \(\hbox {cov}{X}=\mathbb {E}_{}\left[ XX^T\right] -\mathbb {E}_{}\left[ X\right] \mathbb {E}_{}\left[ X\right] ^T\). Similar to the proof of Lemma 1, we evaluate the mean by breaking the integral up into a region \(A\) where \(f(x)>g(x)\) and its complement \(A^c\) where \(g(x)\ge f(x)\):

$$\begin{aligned} \nonumber \mathbb {E}_{}\left[ X\right] = \frac{1}{\Vert f-g\Vert _1}&\left( \int _A x(f(x)-g(x))\,dx \right. \\&\left. + \int _{A^c}x(g(x)-f(x))\,dx\right) \end{aligned}$$
(20)

As before, \(f\) and \(g\) have the same covariance, so \(A = \{x : a^Tx + b >0\}\) and \(A^c=\{x : (-a)^Tx + (-b) \ge 0\}\) are half-spaces with \(a=\Sigma ^{-1}(\mu _1-\mu _2)\) and \(b=(\mu _1+\mu _2)^T\Sigma ^{-1}(\mu _2-\mu _1)/2\).

Each of the terms in (20) are Gaussian functions integrated over a half-space, which can be evaluated using Lemma 6. To simplify the algebra, we evaluate the integrals involving \(f\) and \(g\) separately. This involves calculating a few terms, three of which are repeats: (17), (18), and (19). The other term is \(\Sigma a = \mu _1-\mu _2\). As the difference of the means will arise repeatedly, define \(\Delta =\mu _1-\mu _2\). Noting \(2\delta =\Vert \Delta \Vert _\Sigma \) and \(\phi (x)=\phi (-x)\), the integrals involving \(f\) are:

$$\begin{aligned}&\int _A xf(x)\,dx-\int _{A^c}xf(x)\,dx \nonumber \\&\quad = \Phi \left( \delta \right) \mu _1 + \frac{\phi \left( \delta \right) }{2\delta }\Delta -\left( \Phi \left( -\delta \right) \mu _1 + \frac{\phi \left( -\delta \right) }{2\delta }(-\Delta )\right) \nonumber \\&\quad = (\Phi \left( \delta \right) -\Phi \left( -\delta \right) )\mu _1 + \frac{\phi \left( \delta \right) }{\delta }\Delta \end{aligned}$$
(21)

Next, evaluate the integrals in (20) involving \(g\). This can be done by pattern matching from (21). The main change is that \(\mu _1\) becomes \(\mu _2\) and the sign of \(p\) in Lemma 6 flips, meaning the sign of the arguments to \(\phi \left( \cdot \right) \) and \(\Phi \left( \cdot \right) \) flip (see (17) and (18)).

$$\begin{aligned}&\int _{A}xg(x)\,dx -\int _{A^c} xg(x)\,dx\nonumber \\&\quad = (\Phi \left( -\delta \right) -\Phi \left( \delta \right) )\mu _2 + \frac{\phi \left( -\delta \right) }{\delta }\Delta \end{aligned}$$
(22)

Subtracting (22) from (21), recognizing \(\phi (x)=\phi (-x)\), and dividing by \(\Vert f-g\Vert _1=2(\Phi \left( \delta \right) -\Phi \left( -\delta \right) )\) simplifies (20):

$$\begin{aligned} \mathbb {E}_{}\left[ X\right] =\frac{\mu _1+\mu _2}{2} \end{aligned}$$
(23)
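This mean is also easy to verify numerically via self-normalized importance sampling with the proposal \(h=(f+g)/2\); a brief sketch with arbitrary illustrative parameters:

```python
# Monte Carlo check that the density |f-g| / ||f-g||_1 has mean (mu1+mu2)/2.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(5)
mu1, mu2 = rng.normal(size=2), rng.normal(size=2)
L = rng.normal(size=(2, 2))
Sigma = L @ L.T + 2 * np.eye(2)

f = multivariate_normal(mu1, Sigma)
g = multivariate_normal(mu2, Sigma)
n = 1_000_000
x = np.vstack([f.rvs(n // 2, random_state=6), g.rvs(n // 2, random_state=7)])
fx, gx = f.pdf(x), g.pdf(x)

w = np.abs(fx - gx) / (0.5 * (fx + gx))      # importance weights under h
mean_X = (w[:, None] * x).sum(axis=0) / w.sum()
print(mean_X, (mu1 + mu2) / 2)               # should agree
```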

We now evaluate \(X\)’s second moment. Split the integral up over \(A\) and \(A^c\):

$$\begin{aligned}&\mathbb {E}_{}\left[ XX^T\right] = \frac{1}{\Vert f-g\Vert _1}\nonumber \\&\quad \left( \int _A xx^T(f(x)-g(x))\,dx + \int _{A^c}xx^T(g(x)-f(x))\,dx\right) \end{aligned}$$
(24)

Once again, we evaluate this expression by separately evaluating the integrals involving \(f\) and \(g\).

Starting with \(f\) and using Lemma 6 with \(\Delta \) and \(\delta \):

$$\begin{aligned} \int _Axx^Tf(x)\,dx&=\Phi \left( \delta \right) (\Sigma +\mu _1\mu _1^T)\nonumber \\&\quad +\phi \left( \delta \right) \left( \frac{\Delta \mu _1^T+\mu _1\Delta ^T}{2\delta }- \delta \frac{\Delta \Delta ^T}{4\delta ^2}\right) \end{aligned}$$
(25)
$$\begin{aligned} \int _{A^c}xx^Tf(x)\,dx&=\Phi \left( -\delta \right) (\Sigma +\mu _1\mu _1^T)\nonumber \\&\quad -\phi \left( \delta \right) \left( \frac{\Delta \mu _1^T+\mu _1\Delta ^T}{2\delta }- \delta \frac{\Delta \Delta ^T}{4\delta ^2}\right) \end{aligned}$$
(26)

Taking the difference of (25) and (26):

$$\begin{aligned}&\int _Axx^Tf(x)\,dx - \int _{A^c}xx^Tf(x)\,dx \nonumber \\&\quad =(\Phi \left( \delta \right) -\Phi \left( -\delta \right) )(\Sigma +\mu _1\mu _1^T)\nonumber \\&\qquad +\frac{\phi \left( \delta \right) }{\delta }\left( \Delta \mu _1^T+\mu _1 \Delta ^T-\frac{\Delta \Delta ^T}{2}\right) \end{aligned}$$
(27)

To evaluate the integrals in (24) involving \(g\), we can pattern match using (25) and (26). As in the derivation of (22), this involves negating the \(p\) terms and replacing \(\mu _1\) with \(\mu _2\).

$$\begin{aligned}&\int _Axx^Tg(x)\,dx - \int _{A^c}xx^Tg(x)\,dx \nonumber \\&\quad =(\Phi \left( -\delta \right) -\Phi \left( \delta \right) )(\Sigma +\mu _2\mu _2^T)\nonumber \\&\qquad +\frac{\phi \left( \delta \right) }{\delta }\left( \Delta \mu _2^T+\mu _2 \Delta ^T+\frac{\Delta \Delta ^T}{2}\right) \end{aligned}$$
(28)

Note the sign difference of \(\Delta \Delta ^T\) compared to (27).

To finish calculating the second moment, subtract (28) from (27) and divide by \(\Vert f-g\Vert _1\), simplifying (24):

$$\begin{aligned} \mathbb {E}_{}\left[ XX^T\right] =\frac{1}{2}(2\Sigma +\mu _1\mu _1^T +\mu _2\mu _2^T)+\frac{\phi \left( \delta \right) }{\delta \Vert f-g\Vert _1}\Delta \Delta ^T \end{aligned}$$

We can now express the covariance of \(X\).

$$\begin{aligned} \hbox {cov}{X}&= \Sigma + \frac{\Delta \Delta ^T}{4} + \frac{\phi (\delta )}{\delta \Vert f-g\Vert _1}\Delta \Delta ^T \nonumber \\&= \Sigma + \left( \delta ^2 + \frac{2\phi (\delta )\delta }{2\Phi (\delta )-1}\right) \frac{\Delta \Delta ^T}{4\delta ^2} \end{aligned}$$
(29)

The first line follows as \((\mu _1\mu _1^T+\mu _2\mu _2^T)/2-(\mu _1+\mu _2)(\mu _1+\mu _2)^T/4=\Delta \Delta ^T/4\), and the second uses \(\Vert f-g\Vert _1=2(2\Phi \left( \delta \right) -1)\) from Lemma 1.

To calculate the determinant of the covariance, factor \(\Sigma \) out of (29) and define \(\alpha =\delta ^2 + \frac{2\phi \left( \delta \right) \delta }{2\Phi \left( \delta \right) -1}\):

$$\begin{aligned} |\hbox {cov}{X}|&=\left| \Sigma \left( I+\alpha \frac{\Sigma ^{-1}\Delta \Delta ^T}{4\delta ^2}\right) \right| =|\Sigma |\left( 1+\alpha \right) \end{aligned}$$
(30)

The last step follows from the eigenvalues. \(\Sigma ^{-1}\Delta \Delta ^T\) only has one non-zero eigenvalue; it is a rank one matrix as \(\Sigma ^{-1}\) is full rank and \(\Delta \Delta ^T\) is rank one. The trace of a matrix is the sum of its eigenvalues, so \(tr(\Sigma ^{-1}\Delta \Delta ^T) =tr(\Delta ^T\Sigma ^{-1}\Delta )=\Vert \Delta \Vert _\Sigma ^2=4\delta ^2\) is the non-zero eigenvalue. Consequently, the only non-zero eigenvalue of \(\alpha \Sigma ^{-1}\Delta \Delta ^T/4\delta ^2\) is \(\alpha \). Adding \(I\) to a matrix increases all its eigenvalues by 1 so the only eigenvalue of \(I+\alpha \Sigma ^{-1}\Delta \Delta ^T/4\delta ^2\) that is not 1 has a value of \(1 + \alpha \). The determinant of a matrix is the product of its eigenvalues, so \(|I+\alpha \Sigma ^{-1}\Delta \Delta ^T/4\delta ^2|=1+\alpha \). As discussed at the beginning of the proof, substituting (30) into the expression for the entropy of a multivariate Gaussian achieves the desired upper bound.
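The determinant identity underlying (30), \(|I+\alpha \Sigma ^{-1}\Delta \Delta ^T/4\delta ^2|=1+\alpha \), holds for any SPD \(\Sigma \), any \(\Delta \ne 0\), and any scalar \(\alpha \); a quick numerical confirmation with arbitrary illustrative values:

```python
# Numerical check of the rank-one determinant identity used in (30).
import numpy as np

rng = np.random.default_rng(4)
k = 4
L = rng.normal(size=(k, k))
Sigma = L @ L.T + k * np.eye(k)      # random SPD matrix
Delta = rng.normal(size=k)
alpha = 0.37                         # any scalar works

four_delta_sq = Delta @ np.linalg.solve(Sigma, Delta)   # ||Delta||_Sigma^2 = 4 delta^2
M = np.eye(k) + alpha * np.linalg.solve(Sigma, np.outer(Delta, Delta)) / four_delta_sq
print(np.linalg.det(M), 1 + alpha)   # should agree
```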

1.4 Proof of Theorem 2

To prove the theorem, we build on Theorem 1. Unfortunately, it cannot be directly applied because it is difficult to evaluate 1) the \(L_1\) norm between GMMs and 2) the entropy of the normalized difference of mixture models. However, Lemmas 1 and 2 provide a way to evaluate these quantities when the densities are individual Gaussians. To exploit this fact, we split the problem up and analyze the difference in entropies between GMMs that only differ by a single component.

To start, define \(d_j(x)=\sum _{i=1}^j w_i g_i(x) + \sum _{i=j+1}^n w_i f_i(x)\), so that \(d_j\) is a GMM whose first \(j\) components are the first \(j\) components of \(g\) and whose last \(n-j\) components are the last \(n-j\) components of \(f\). Note that \(d_0(x) = f(x)\) and \(d_n(x) = g(x)\). Using the triangle inequality:

$$\begin{aligned} |\mathrm{H }\left[ {f}\right] -\mathrm{H }\left[ {g}\right] |&= |\mathrm{H }\left[ {d_0}\right] -\mathrm{H }\left[ {d_1}\right] + \mathrm{H }\left[ {d_1}\right] -\mathrm{H }\left[ {d_n}\right] |\nonumber \\&\le |\mathrm{H }\left[ {d_0}\right] -\mathrm{H }\left[ {d_1}\right] | + |\mathrm{H }\left[ {d_1}\right] -\mathrm{H }\left[ {d_n}\right] |\nonumber \\&\le \sum _{j=1}^n|\mathrm{H }\left[ {d_{j-1}}\right] -\mathrm{H }\left[ {d_j}\right] | \end{aligned}$$

where the last step applied the same trick \(n-2\) more times. Because \(d_{j-1} (x) - d_{j} (x) = w_j (f_j (x) - g_j (x))\) each term in the summand can be bounded using Theorem 1:

$$\begin{aligned}&\Big |\mathrm{H }\left[ {d_{j-1}}\right] -\mathrm{H }\left[ {d_{j}}\right] \Big | \nonumber \\&\quad \le -\Vert w_j(f_j-g_j)\Vert _1 \log \Vert w_j(f_j-g_j)\Vert _1 \nonumber \\&\quad \quad + \Vert w_j(f_j-g_j)\Vert _1 \mathrm{H }\left[ {\frac{|f_j(x)-g_j(x)|}{\Vert f_j-g_j\Vert _1}}\right] \end{aligned}$$
(31)

Because \(f_j(x)\) and \(g_j(x)\) are Gaussians with the same covariance, we can apply Lemmas 1 and 2 to complete the proof.
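To make the assembled bound concrete, the sketch below sums the (31)-style terms over components, using Lemma 1 for each \(\Vert w_j(f_j-g_j)\Vert _1\) and Lemma 2's Gaussian max-entropy bound for the entropy term. The function name `theorem2_bound` and all parameter values are illustrative, not from the paper.

```python
# Sketch of the summed Theorem 2 bound for two GMMs whose components share
# covariances and differ only in their means.
import numpy as np
from scipy.stats import norm

def theorem2_bound(weights, mus_f, mus_g, Sigmas):
    total = 0.0
    for w, m1, m2, S in zip(weights, mus_f, mus_g, Sigmas):
        k = len(m1)
        diff = m1 - m2
        delta = 0.5 * np.sqrt(diff @ np.linalg.solve(S, diff))
        if delta == 0.0:
            continue                                   # identical components add 0
        l1 = w * 2 * (2 * norm.cdf(delta) - 1)         # Lemma 1: ||w_j(f_j - g_j)||_1
        alpha = delta**2 + 2 * norm.pdf(delta) * delta / (2 * norm.cdf(delta) - 1)
        h_max = 0.5 * np.log((2 * np.pi * np.e)**k     # Lemma 2: max-entropy bound
                             * np.linalg.det(S) * (1 + alpha))
        total += -l1 * np.log(l1) + l1 * h_max         # term from (31)
    return total

# Illustrative two-component, 2-D mixtures
weights = [0.5, 0.5]
mus_f = [np.array([0.0, 0.0]), np.array([3.0, 1.0])]
mus_g = [np.array([0.2, -0.1]), np.array([3.1, 1.2])]
Sigmas = [np.eye(2), np.eye(2)]
print(theorem2_bound(weights, mus_f, mus_g, Sigmas))
```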

Cite this article

Charrow, B., Kumar, V. & Michael, N. Approximate representations for multi-robot control policies that maximize mutual information. Auton Robot 37, 383–400 (2014). https://doi.org/10.1007/s10514-014-9411-2
