Relative entropy optimization and its applications

  • Full Length Paper
  • Series A
  • Mathematical Programming

Abstract

In this expository article, we study optimization problems specified via linear and relative entropy inequalities. Such relative entropy programs (REPs) are convex optimization problems as the relative entropy function is jointly convex with respect to both its arguments. Prominent families of convex programs such as geometric programs (GPs), second-order cone programs, and entropy maximization problems are special cases of REPs, although REPs are more general than these classes of problems. We provide solutions based on REPs to a range of problems such as permanent maximization, robust optimization formulations of GPs, and hitting-time estimation in dynamical systems. We survey previous approaches to some of these problems and the limitations of those methods, and we highlight the more powerful generalizations afforded by REPs. We conclude with a discussion of quantum analogs of the relative entropy function, including a review of the similarities and distinctions with respect to the classical case. We also describe a stylized application of quantum relative entropy optimization that exploits the joint convexity of the quantum relative entropy function.
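The joint convexity claim can be probed numerically. The sketch below assumes the usual definition of relative entropy on strictly positive vectors, \(D(\nu , \lambda ) = \sum _i \nu _i \log (\nu _i / \lambda _i)\); the function names are hypothetical and chosen only for illustration. It checks the midpoint inequality on random pairs of points, which is necessary (not sufficient) evidence of joint convexity.

```python
import numpy as np


def relative_entropy(nu, lam):
    """D(nu, lam) = sum_i nu_i * log(nu_i / lam_i) for strictly positive vectors (assumed definition)."""
    nu = np.asarray(nu, dtype=float)
    lam = np.asarray(lam, dtype=float)
    return float(np.sum(nu * np.log(nu / lam)))


def midpoint_gap(nu1, lam1, nu2, lam2):
    """Average of the endpoint values minus the value at the midpoint.

    Joint convexity in (nu, lam) implies that this gap is nonnegative.
    """
    mid = relative_entropy((nu1 + nu2) / 2, (lam1 + lam2) / 2)
    avg = 0.5 * (relative_entropy(nu1, lam1) + relative_entropy(nu2, lam2))
    return avg - mid


rng = np.random.default_rng(0)
# Each sample draws four positive vectors of length 6: nu1, lam1, nu2, lam2.
gaps = [midpoint_gap(*rng.uniform(0.1, 5.0, size=(4, 6))) for _ in range(1000)]
print("smallest midpoint gap:", min(gaps))
```

A negative gap beyond rounding error would contradict joint convexity; none should appear.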

Notes

  1. Such a logarithmic transformation of the variables is also employed in converting a GP specified in terms of non-convex posynomial functions to a GP in convex form; see [11, 20] for more details.

  2. In full generality, density matrices are trace-one, positive semidefinite Hermitian matrices, and \(\mathbf {A}^{(j)} \in {\mathbb {C}}^{k \times n}\). As with SDPs, the von Neumann entropy optimization framework can handle linear matrix inequalities on Hermitian matrices, but we stick with the real case for simplicity.
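
As an illustration of the logarithmic transformation mentioned in Note 1, consider a generic posynomial. The symbols \(c_k\), \(\alpha ^{(k)}\), \(K\) and \(f\) are notation introduced here for illustration and are not taken from the article:

$$\begin{aligned} f(\mathbf {z}) = \sum _{k=1}^{K} c_k \prod _{i=1}^{n} \mathbf {z}_i^{\alpha ^{(k)}_i}, ~~ c_k > 0, ~ \mathbf {z}\in \mathbb {R}^n_{++} \quad \longrightarrow \quad f(\exp \{\mathbf {w}\}) = \sum _{k=1}^{K} c_k \exp \left\{ \sum _{i=1}^{n} \alpha ^{(k)}_i \mathbf {w}_i \right\} , ~~ \mathbf {w}_i = \log (\mathbf {z}_i). \end{aligned}$$

The right-hand side is a sum of exponentials of linear functions of \(\mathbf {w}\) and is therefore convex in \(\mathbf {w}\), even though \(f\) need not be convex in \(\mathbf {z}\).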

References

  1. Bapat, R.B., Beg, M.I.: Order statistics for nonidentically distributed variables and permanents. Sankhya Indian J. Stat. A 51, 79–93 (1989)

  2. Barvinok, A.I.: Computing mixed discriminants, mixed volumes, and permanents. Discrete Comput. Geom. 18, 205–237 (1997)

  3. Ben-Tal, A., El Ghaoui, L., Nemirovski, A.: Robust Optimization. Princeton University Press, Princeton (2009)

  4. Ben-Tal, A., Nemirovski, A.: Optimal design of engineering structures. Optima 47, 4–8 (1995)

  5. Ben-Tal, A., Nemirovski, A.: Robust convex optimization. Math. Oper. Res. 23, 769–805 (1998)

  6. Ben-Tal, A., Nemirovski, A.: On polyhedral approximations of the second-order cone. Math. Oper. Res. 26, 193–205 (2001)

  7. Ben-Tal, A., Nemirovskii, A.: Lectures on Modern Convex Optimization. Society for Industrial and Applied Mathematics, Philadelphia (2001)

  8. Betke, U.: Mixed volumes of polytopes. Arch. Math. 58, 388–391 (1992)

  9. Blekherman, G., Parrilo, P., Thomas, R.: Semidefinite Optimization and Convex Algebraic Geometry. Society for Industrial and Applied Mathematics, Philadelphia (2013)

  10. Boyd, S., Kim, S.J., Patil, D., Horowitz, M.: Digital circuit optimization via geometric programming. Oper. Res. 53, 899–932 (2005)

  11. Boyd, S., Kim, S.J., Vandenberghe, L., Hassibi, A.: A tutorial on geometric programming. Optim. Eng. 8, 67–127 (2007)

  12. Chandrasekaran, V., Shah, P.: Conic geometric programming. In: Proceedings of the Conference on Information Sciences and Systems (2014)

  13. Chandrasekaran, V., Shah, P.: Relative entropy relaxations for signomial optimization. SIAM J. Optim. (2014)

  14. Chiang, M.: Geometric programming for communication systems. Found. Trends Commun. Inf. Theory 2, 1–154 (2005)

  15. Chiang, M., Boyd, S.: Geometric programming duals of channel capacity and rate distortion. IEEE Trans. Inf. Theory 50, 245–258 (2004)

  16. Cover, T., Thomas, J.: Elements of Information Theory. Wiley, New York (2006)

  17. Cox, D.R.: The regression analysis of binary sequences. J. R. Stat. Soc. 20, 215–242 (1958)

  18. Dinkel, J.J., Kochenberger, G.A., Wong, S.N.: Entropy maximization and geometric programming. Environ. Plan. A 9, 419–427 (1977)

  19. Drew, J.H., Johnson, C.R.: The maximum permanent of a 3-by-3 positive semidefinite matrix, given the eigenvalues. Linear Multilinear Algebra 25, 243–251 (1989)

  20. Duffin, R.J., Peterson, E.L., Zener, C.M.: Geometric Programming: Theory and Application. Wiley, New York (1967)

  21. Egorychev, G.P.: Proof of the Van der Waerden conjecture for permanents (English translation; original in Russian). Sib. Math. J. 22, 854–859 (1981)

  22. El Ghaoui, L., Lebret, H.: Robust solutions to least-squares problems with uncertain data. SIAM J. Matrix Anal. Appl. 18, 1035–1064 (1997)

  23. Falikman, D.I.: Proof of the Van der Waerden conjecture regarding the permanent of a doubly stochastic matrix (English translation; original in Russian). Math. Notes 29, 475–479 (1981)

  24. Glineur, F.: An extended conic formulation for geometric optimization. Found. Comput. Decis. Sci. 25, 161–174 (2000)

  25. Golden, S.: Lower bounds for the Helmholtz function. Phys. Rev. Ser. II 137, B1127–B1128 (1965)

  26. Gonçalves, D.S., Lavor, C., Gomes-Ruggiero, M.A., Cesário, A.T., Vianna, R.O., Maciel, T.O.: Quantum state tomography with incomplete data: maximum entropy and variational quantum tomography. Phys. Rev. A 87 (2013)

  27. Gouveia, J., Parrilo, P., Thomas, R.: Lifts of convex sets and cone factorizations. Math. Oper. Res. 38, 248–264 (2013)

  28. Grone, R., Johnson, C.R., Eduardo, S.A., Wolkowicz, H.: A note on maximizing the permanent of a positive definite hermitian matrix, given the eigenvalues. Linear Multilinear Algebra 19, 389–393 (1986)

  29. Gurvits, L.: Van der Waerden/Schrijver-Valiant like conjectures and stable (aka hyperbolic) homogeneous polynomials: one theorem for all. Electron. J. Comb. 15 (2008)

  30. Gurvits, L., Samorodnitsky, A.: A deterministic algorithm for approximating the mixed discriminant and mixed volume, and a combinatorial corollary. Discrete Comput. Geom. 27, 531–550 (2002)

  31. Han, S., Preciado, V.M., Nowzari, C., Pappas, G.J.: Data-driven network resource allocation for controlling spreading processes. IEEE Trans. Netw. Sci. Eng. 2(4), 127–138 (2015)

  32. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. Springer, Berlin (2008)

  33. Hellwig, K., Kraus, K.: Operations and measurements II. Commun. Math. Phys. 16, 142–147 (1970)

  34. Helton, J.W., Nie, J.: Sufficient and necessary conditions for semidefinite representability of convex hulls and sets. SIAM J. Optim. 20, 759–791 (2009)

  35. Holevo, A.S.: The capacity of the quantum channel with general signal states. IEEE Trans. Inf. Theory 44, 269–273 (1998)

  36. Hsiung, K.L., Kim, S.J., Boyd, S.: Tractable approximate robust geometric programming. Optim. Eng. 9, 95–118 (2008)

  37. Jaynes, E.T.: Information theory and statistical mechanics. Phys. Rev. Ser. II 106, 620–630 (1957)

  38. Jerrum, M., Sinclair, A., Vigoda, E.: A polynomial-time approximation algorithm for the permanent of a matrix with non-negative entries. J. ACM 51, 671–697 (2004)

  39. Kulis, B., Sustik, M., Dhillon, I.: Low-rank kernel learning with Bregman matrix divergences. J. Mach. Learn. Res. 10, 341–376 (2009)

  40. Lieb, E.: Convex trace functions and the Wigner–Yanase–Dyson conjecture. Adv. Math. 11, 267–288 (1973)

  41. Linial, N., Samorodnitsky, A., Wigderson, A.: A deterministic strongly polynomial algorithm for matrix scaling and approximate permanents. Combinatorica 20, 545–568 (2000)

  42. Lobo, M., Vandenberghe, L., Boyd, S., Lebret, H.: Applications of second-order cone programming. Linear Algebra Appl. 284, 193–228 (1998)

  43. Minc, H.: Permanents. Cambridge University Press, Cambridge (1984)

  44. Nesterov, Y., Nemirovski, A.: Interior-Point Polynomial Algorithms in Convex Programming. Society for Industrial and Applied Mathematics, Philadelphia (1994)

  45. Nielsen, M., Chuang, I.: Quantum Computation and Quantum Information. Cambridge University Press, Cambridge (2011)

  46. Potchinkov, A.W., Reemtsen, R.M.: The design of FIR filters in the complex plane by convex optimization. Signal Process. 46, 127–146 (1995)

  47. Prajna, S., Jadbabaie, A.: Safety verification of hybrid systems using barrier certificates. In: Hybrid Systems: Computation and Control, pp. 477–492. Springer (2004)

  48. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)

  49. Schumacher, B., Westmoreland, M.D.: Sending classical information via noisy quantum channels. Phys. Rev. A 56, 131–138 (1997)

  50. Scott, C.H., Jefferson, T.R.: Trace optimization problems and generalized geometric programming. J. Math. Anal. Appl. 58, 373–377 (1977)

  51. Shor, P.W.: Capacities of quantum channels and how to find them. Math. Program. B 97, 311–335 (2003)

  52. Thompson, C.J.: Inequality with applications in statistical mechanics. J. Math. Phys. 6, 1812–1813 (1965)

  53. Valiant, L.: The complexity of computing the permanent. Theor. Comput. Sci. 8, 189–201 (1979)

  54. Vandenberghe, L., Boyd, S., Wu, S.: Determinant maximization with linear matrix inequality constraints. SIAM J. Matrix Anal. Appl. 19, 499–533 (1998)

  55. Wall, T., Greening, D., Woolsey, R.: Solving complex chemical equilibria using a geometric programming based technique. Oper. Res. 34, 345–355 (1986)

  56. Yazarel, H., Pappas, G.: Geometric programming relaxations for linear system reachability. In: Proceedings of the American Control Conference (2004)

Acknowledgments

The authors would like to thank Pablo Parrilo and Yong-Sheng Soh for helpful conversations, and Leonard Schulman for pointers to the literature on von Neumann entropy. Venkat Chandrasekaran was supported in part by National Science Foundation Career award CCF-1350590 and Air Force Office of Scientific Research grant FA9550-14-1-0098.

Author information

Corresponding author

Correspondence to Venkat Chandrasekaran.

Appendix

The second-order cone \(L_2 \subset \mathbb {R}^3\) from (5) can be written as:

$$\begin{aligned} L_2 = \left\{ (\mathbf {x},y) \in \mathbb {R}^2 \times \mathbb {R}~|~ \begin{pmatrix} y-\mathbf {x}_1 &{} \mathbf {x}_2 \\ \mathbf {x}_2 &{} y+\mathbf {x}_1 \end{pmatrix} \succeq \mathbf {0}\right\} . \end{aligned}$$

Combining this reformulation with the next result gives us the description (6).
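
The matrix reformulation above can be sanity-checked numerically. The sketch below assumes that \(L_2\) in (5) is the standard second-order cone \(\{(\mathbf {x},y) ~|~ \Vert \mathbf {x}\Vert _2 \le y\}\), a definition not reproduced in this excerpt; the function names are hypothetical. It compares the norm test with the positive semidefiniteness test on random points.

```python
import numpy as np


def in_cone_norm(x1, x2, y, tol=1e-9):
    """Membership via the norm inequality ||x||_2 <= y (assumed definition of L_2)."""
    return bool(np.hypot(x1, x2) <= y + tol)


def in_cone_lmi(x1, x2, y, tol=1e-9):
    """Membership via positive semidefiniteness of [[y - x1, x2], [x2, y + x1]]."""
    M = np.array([[y - x1, x2], [x2, y + x1]])
    return bool(np.linalg.eigvalsh(M).min() >= -tol)


rng = np.random.default_rng(1)
points = rng.uniform(-2.0, 2.0, size=(2000, 3))
agree = all(in_cone_norm(*p) == in_cone_lmi(*p) for p in points)
print("norm test and matrix test agree on all samples:", agree)
```

The agreement is expected because the eigenvalues of the 2-by-2 matrix are \(y \pm \Vert \mathbf {x}\Vert _2\), so its smallest eigenvalue is nonnegative exactly when \(\Vert \mathbf {x}\Vert _2 \le y\).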

Proposition 6

We have that \(\begin{pmatrix} a &{} c \\ c &{} b\end{pmatrix} \in \mathbb {S}^2_+\) if and only if there exists \(\nu \in \mathbb {R}_+\) such that:

$$\begin{aligned} \begin{aligned} \nu \log \left( \frac{\nu }{a}\right) + \nu \log \left( \frac{\nu }{b}\right) - 2 \nu&\le 2 c \\ \nu \log \left( \frac{\nu }{a}\right) + \nu \log \left( \frac{\nu }{b}\right) - 2 \nu&\le - 2 c \\ a,b \in \mathbb {R}_+. \end{aligned} \end{aligned}$$

Proof

We have that \(\begin{pmatrix} a &{} c \\ c &{} b\end{pmatrix} \in \mathbb {S}^2_+\) if and only if \(a \bar{\mathbf {z}}_1^2 + b \bar{\mathbf {z}}_2^2 + 2c \bar{\mathbf {z}}_1 \bar{\mathbf {z}}_2 \ge 0, ~ \forall \bar{\mathbf {z}} \in \mathbb {R}^2\). This latter condition can in turn be rewritten to obtain the following reformulation:

$$\begin{aligned} \begin{pmatrix} a &{} c \\ c &{} b\end{pmatrix} \in \mathbb {S}^2_+ \Leftrightarrow a \mathbf {z}_1^2 + b \mathbf {z}_2^2 + 2c \mathbf {z}_1 \mathbf {z}_2 \ge 0 \,\,{\mathrm {and}}\,\, a \mathbf {z}_1^2 + b \mathbf {z}_2^2 - 2c \mathbf {z}_1 \mathbf {z}_2 \ge 0 ~ \forall \mathbf {z}\in \mathbb {R}^2_+. \end{aligned} \tag{44}$$

Each of these inequalities with universal quantifiers can be reformulated by appealing to GP duality. Specifically, dividing each inequality in (44) by \(\mathbf {z}_1 \mathbf {z}_2 > 0\) and applying the change of variables \(\mathbf {w}_i \leftarrow \log (\mathbf {z}_i), i=1,2\), which is commonly employed in the GP literature [11, 20], we have from (44) that \(\begin{pmatrix} a &{} c \\ c &{} b\end{pmatrix} \in \mathbb {S}^2_+\) if and only if:

$$\begin{aligned} \inf _{\mathbf {w}\in \mathbb {R}^2} ~ a \exp \{\mathbf {w}_1 - \mathbf {w}_2\} + b \exp \{\mathbf {w}_2 - \mathbf {w}_1\} ~~ \ge ~~ \max \{2c,-2c\}. \end{aligned} \tag{45}$$

As the optimization problem on the left-hand side is a GP for \(a,b \in \mathbb {R}_+\), we can appeal to convex duality to conclude that

$$\begin{aligned} \inf _{\mathbf {w}\in \mathbb {R}^2} ~ a \exp \{\mathbf {w}_1 - \mathbf {w}_2\} + b \exp \{\mathbf {w}_2 - \mathbf {w}_1\} = \sup _{\nu \in \mathbb {R}_+} ~ -\nu \log \left( \frac{\nu }{a}\right) - \nu \log \left( \frac{\nu }{b}\right) + 2\nu . \end{aligned}$$
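
For readers who wish to verify this identity by hand, the following sketch evaluates both sides in closed form under the assumption \(a, b > 0\). The substitution \(t = \exp \{\mathbf {w}_1 - \mathbf {w}_2\}\) and the function \(g\) are notation introduced here for illustration.

$$\begin{aligned} \inf _{\mathbf {w}\in \mathbb {R}^2} ~ a \exp \{\mathbf {w}_1 - \mathbf {w}_2\} + b \exp \{\mathbf {w}_2 - \mathbf {w}_1\}&= \inf _{t > 0} ~ a t + \frac{b}{t} = 2\sqrt{ab} \quad \left( {\mathrm {attained~at}}~ t = \sqrt{b/a}\right) , \\ g(\nu ) := -\nu \log \left( \frac{\nu }{a}\right) - \nu \log \left( \frac{\nu }{b}\right) + 2\nu , \quad g'(\nu )&= -\log \left( \frac{\nu ^2}{ab}\right) = 0 ~ \Leftrightarrow ~ \nu = \sqrt{ab}, \quad g(\sqrt{ab}) = 2\sqrt{ab}. \end{aligned}$$

Since \(g\) is concave on \(\mathbb {R}_+\), its stationary point is its maximizer, so both sides equal \(2\sqrt{ab}\).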

Combining this result with (45), we have that \(\begin{pmatrix} a &{} c \\ c &{} b\end{pmatrix} \in \mathbb {S}^2_+\) if and only if:

$$\begin{aligned} a,b \in \mathbb {R}_+ ~{\mathrm {and}}~ \exists \nu \in \mathbb {R}_+ ~{\mathrm {s.t.}}~ -\nu \log \left( \frac{\nu }{a}\right) - \nu \log \left( \frac{\nu }{b}\right) + 2\nu \ge \max \{2c,-2c\}. \end{aligned} \tag{46}$$

This concludes the proof. \(\square \)
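
Proposition 6 can also be checked numerically. The sketch below is an illustration under stated assumptions, not part of the article: it tests the two inequalities at the single candidate \(\nu = \sqrt{ab}\), which minimizes their common left-hand side, and adopts the conventions \(0 \log (0/a) = 0\) and \(\nu \log (\nu /0) = +\infty \) for \(\nu > 0\). All function names are hypothetical.

```python
import math

import numpy as np


def nu_log_ratio(nu, a):
    """nu * log(nu / a), with 0 * log(0 / a) = 0 and nu * log(nu / 0) = +inf for nu > 0."""
    if nu == 0.0:
        return 0.0
    if a == 0.0:
        return math.inf
    return nu * math.log(nu / a)


def certificate_holds(a, b, c, tol=1e-9):
    """Check both inequalities of Proposition 6 at the candidate nu = sqrt(a * b)."""
    if a < 0 or b < 0:
        return False
    nu = math.sqrt(a * b)
    lhs = nu_log_ratio(nu, a) + nu_log_ratio(nu, b) - 2.0 * nu
    return lhs <= 2.0 * c + tol and lhs <= -2.0 * c + tol


def is_psd(a, b, c, tol=1e-9):
    """Direct eigenvalue test for [[a, c], [c, b]]."""
    return bool(np.linalg.eigvalsh(np.array([[a, c], [c, b]])).min() >= -tol)


rng = np.random.default_rng(2)
samples = rng.uniform(-0.5, 2.0, size=(2000, 3))
agree = all(certificate_holds(a, b, c) == is_psd(a, b, c) for a, b, c in samples)
print("certificate matches eigenvalue test on all samples:", agree)
```

Testing a single value of \(\nu \) suffices here because the left-hand side of both inequalities is the same convex function of \(\nu \), so the inequalities are satisfiable for some \(\nu \) exactly when they hold at its minimizer.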

About this article

Cite this article

Chandrasekaran, V., Shah, P. Relative entropy optimization and its applications. Math. Program. 161, 1–32 (2017). https://doi.org/10.1007/s10107-016-0998-2

  • DOI: https://doi.org/10.1007/s10107-016-0998-2
