
A semi-definite programming approach for robust tracking

  • Full Length Paper
  • Series A
Mathematical Programming

Abstract

Tracking problems are prevalent in present-day GPS and video systems. The problem of target tracking is a specific case of dynamic linear system estimation with additive noise. The most widely used filter for these systems is the Kalman filter (KF). The optimality of the KF and similar Bayesian filters is guaranteed under particular probabilistic assumptions. However, in practice, and specifically in applications such as tracking, these probabilistic assumptions are not realistic; indeed, the system noise is typically bounded and in fact might be adversarial. For such cases, robust estimation approaches, such as \({\mathcal {H}}_\infty \) filtering and set-value estimation, were introduced with the aim of providing filters with guaranteed worst-case performance. In this paper we present an innovative approximated set-value estimator (SVE), obtained by solving a semi-definite programming problem. We demonstrate that our problem is practically tractable even for long time horizons. The framework is extended to the case of partially statistical noise, thus combining the KF and SVE frameworks. A variation of this filter which applies a rolling window approach is developed, achieving a fixed computational cost per iteration and coinciding with the classical SVE when the window size is one. Finally, we present numerical results that show the advantages of this filter when dealing with adversarial noise, and compare the performance of the various robust filters with the KF.


References

  1. Bar-Shalom, Y., Li, X.R.: Multitarget-Multisensor Tracking: Principles and Techniques. YBS Publishing, Storrs (1995)

  2. Bar-Shalom, Y., Li, X.R., Kirubarajan, T.: Estimation with Applications to Tracking and Navigation. Wiley, New York (2001)

  3. Başar, T., Bernhard, P.: H-Infinity Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach. Modern Birkhäuser Classics, 2nd edn. Birkhäuser, Boston (1995)

  4. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)

  5. Ben-Tal, A., El Ghaoui, L., Nemirovski, A.: Robust Optimization. Princeton University Press, Princeton (2009)

  6. Bertsekas, D., Rhodes, I.: Recursive state estimation for a set-membership description of uncertainty. IEEE Trans. Autom. Control 16(2), 117–128 (1971)

  7. Dong, Z., You, Z.: Finite-horizon robust Kalman filtering for uncertain discrete time-varying systems with uncertain-covariance white noises. IEEE Signal Process. Lett. 13(8), 493–496 (2006)

  8. El Ghaoui, L., Calafiore, G.: Robust filtering for discrete-time systems with bounded noise and parametric uncertainty. IEEE Trans. Autom. Control 46(7), 1084–1089 (2001)

  9. Fu, M., de Souza, C.E., Luo, Z.Q.: Finite-horizon robust Kalman filter design. IEEE Trans. Signal Process. 49(9), 2103–2112 (2001)

  10. Garulli, A., Vicino, A., Zappa, G.: Conditional central algorithms for worst case set-membership identification and filtering. IEEE Trans. Autom. Control 45(1), 14–23 (2000)

  11. Hazan, E.: Sparse approximate solutions to semidefinite programs. In: Laber, E., Bornstein, C., Nogueira, L., Faria, L. (eds.) LATIN 2008: Theoretical Informatics. Lecture Notes in Computer Science, vol. 4957, pp. 306–316. Springer, Berlin (2008)

  12. Heffes, H.: The effect of erroneous models on the Kalman filter response. IEEE Trans. Autom. Control 11(3), 541–543 (1966)

  13. Helmberg, C., Rendl, F., Vanderbei, R.J., Wolkowicz, H.: An interior-point method for semidefinite programming. SIAM J. Optim. 6, 342–361 (1996)

  14. Kalman, R.E.: A new approach to linear filtering and prediction problems. J. Basic Eng. Trans. ASME 82(Series D), 35–45 (1960)

  15. Kurzhanski, A.B., Vályi, I.: Ellipsoidal Calculus for Estimation and Control. Systems and Control: Foundations and Applications. Birkhäuser, Boston (1997)

  16. Masreliez, C.: Approximate non-Gaussian filtering with linear state and observation relations. IEEE Trans. Autom. Control 20(1), 107–110 (1975)

  17. Masreliez, C., Martin, R.: Robust Bayesian estimation for the linear model and robustifying the Kalman filter. IEEE Trans. Autom. Control 22(3), 361–371 (1977)

  18. Nagpal, K.M., Khargonekar, P.P.: Filtering and smoothing in an H infinity setting. IEEE Trans. Autom. Control 36(2), 152–166 (1991)

  19. Navarro, J.: A very simple proof of the multivariate Chebyshev's inequality. Commun. Stat. Theory Methods (2013). doi:10.1080/03610926.2013.873135

  20. Nemirovski, A.: Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM J. Optim. 15, 229–251 (2004)

  21. Nesterov, Y.: Semidefinite relaxation and nonconvex quadratic optimization. Optim. Methods Softw. 9(1–3), 141–160 (1998)

  22. Nesterov, Y.: Dual extrapolation and its applications to solving variational inequalities and related problems. Math. Program. 109(2–3), 319–344 (2007)

  23. Nishimura, T.: On the a priori information in sequential estimation problems. IEEE Trans. Autom. Control 11(2), 197–204 (1966)

  24. Sasa, S.: Robustness of a Kalman filter against uncertainties of noise covariances. In: Proceedings of the 1998 American Control Conference, vol. 4, pp. 2344–2348 (1998)

  25. Sayed, A.: A framework for state-space estimation with uncertain models. IEEE Trans. Autom. Control 46(7), 998–1013 (2001)

  26. Schweppe, F.: Recursive state estimation: unknown but bounded errors and system inputs. IEEE Trans. Autom. Control 13(1), 22–28 (1968)

  27. Shaked, U., Theodor, Y.: \({\cal H}_{\infty } \)-optimal estimation: a tutorial. In: Proceedings of the 31st IEEE Conference on Decision and Control, vol. 2, pp. 2278–2286 (1992)

  28. Shtern, S., Ben-Tal, A.: Algorithms for solving nonconvex block constrained quadratic problems. Technion Report IE/OR-2013-01, Technion - Israel Institute of Technology (2013)

  29. Simon, D.: Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches. Wiley-Interscience, Hoboken (2006)

  30. Tsai, C., Kurz, L.: An adaptive robustizing approach to Kalman filtering. Automatica 19(3), 279–288 (1983)

  31. Willems, J.C.: Deterministic least squares filtering. J. Econom. 118(1), 341–373 (2004)

  32. Xie, L., Soh, Y.C., de Souza, C.E.: Robust Kalman filtering for uncertain discrete-time systems. IEEE Trans. Autom. Control 39(6), 1310–1314 (1994)

  33. Yang, F., Wang, Z., Hung, Y.S.: Robust Kalman filtering for discrete time-varying uncertain systems with multiplicative noises. IEEE Trans. Autom. Control 47(7), 1179–1183 (2002)


Acknowledgments

We would like to thank two anonymous referees for their insightful remarks, which helped to improve the presentation of the paper.

Author information

Correspondence to Shimrit Shtern.

Appendices

Appendix 1: Implementation of HRVW-alg to saddle point problem

Consider problem (DSDR-P) with parameters \(\mathbf{C}\in \mathbb {R}^{\tilde{q}\times n}\), \(\mathbf{D}\in \mathbb {R}^{r\times n}\), and \(\varvec{\mathcal {I}}_i\in \mathbb {R}^{n\times n},\; i=1,\ldots ,m\), and decision variables \(\mathbf{K}\in \mathbb {R}^{r\times \tilde{q}}\) and \({\varvec{\mu }}\in \mathbb {R}^m\) (we can neglect the explicit non-negativity constraint on \({\varvec{\mu }}\), since \({\varvec{\mu }}\) must be non-negative in order to satisfy the remaining constraints), where \(r\) is constant and \(O(\tilde{q})=O(m)=O(n)=O(T)\). We add a variable \(\varvec{\mathcal {Z}}\in \mathbb {R}^{(n+r)\times (n+r)}\) such that:

$$\begin{aligned} \varvec{\mathcal {Z}}= \begin{bmatrix} \sum _{i=1}^m \mu _i\varvec{\mathcal {I}}_i&\quad \! \mathbf{D}'+\mathbf{C}'\mathbf{K}'\\ \mathbf{K}\mathbf{C}+\mathbf{D}&\quad \! \mathbf{I}\end{bmatrix},\; \varvec{\mathcal {Z}}\succeq 0. \end{aligned}$$

This problem satisfies the Slater condition for \(\mathbf{K}=\mathbf{0}\) and \(\mu _i\equiv \alpha =\Vert \mathbf{D}\Vert ^2+1\;\forall i=1,\ldots ,m\), since the corresponding \(\varvec{\mathcal {Z}}\) is positive definite, and so the problem is regular.

Defining the dual variable \(\mathbf{Y}\) to be a symmetric matrix such that

$$\begin{aligned} \mathbf{Y}=\begin{bmatrix}\mathbf{Y}^{11}&\quad \mathbf{Y}^{12}\\ {\mathbf{Y}^{21}}&\quad \mathbf{Y}^{22} \end{bmatrix}\succeq 0, \end{aligned}$$

we can then formulate the following dual problem.

(25)

where matrices \(\tilde{\varvec{\mathcal {I}}}_i\) and \(\tilde{\varvec{\mathcal {C}}}^{ij}\) are given by:

$$\begin{aligned} \tilde{\varvec{\mathcal {I}}}_i= \begin{bmatrix}\varvec{\mathcal {I}}_i&\quad \mathbf{0}\\ \mathbf{0}&\mathbf{0}\end{bmatrix} ,\quad \tilde{\varvec{\mathcal {C}}}^{ij}= \begin{bmatrix} \mathbf{0}&\tilde{\mathbf{C}}^{ij}\\ (\tilde{\mathbf{C}}^{ij})'&\mathbf{0}\end{bmatrix},\quad \tilde{\mathbf{C}}^{ij}\in \mathbb {R}^{n\times r}: \tilde{C}^{ij}_{kl}= {\left\{ \begin{array}{ll} C_{ik} &{} \quad l=j\\ 0 &{} \quad \text {otherwise} \end{array}\right. }. \end{aligned}$$

We can see that this problem also satisfies the Slater condition, since strict feasibility holds for \(\mathbf{Y}\) such that \(\mathbf{Y}^{12}=0\), \(\mathbf{Y}^{11}={{\mathrm{Diag}}}(\sum _{i=1}^m \frac{1}{n_i}\varvec{\mathcal {I}}_i\mathbf{e})\) and \(\mathbf{Y}^{22}=\mathbf{I}\). Strict feasibility of both the primal and the dual problem ensures strong duality, and therefore a duality gap of zero.

Notice that the second set of constraints is actually equivalent to \(\mathbf{C}'\cdot \mathbf{Y}=\mathbf{0}\), where \(\mathbf{0}\) here denotes a \(\tilde{q}\times r\) matrix of all zeros. Therefore, if \(\tilde{q}\ge n \) and \({{\mathrm{rank}}}(\mathbf{C})=n\), then \(\mathbf{Y}^{12}=\mathbf{0}\) holds for any feasible solution, the optimal solution satisfies \(\mathbf{Y}^{22}=\mathbf{0}\) for any feasible \(\mathbf{Y}^{11}\), and the problem value is zero. This also means that \({\varvec{\mu }}=\mathbf{0}\) and \(\mathbf{K}\) is simply some solution of the equation \(\mathbf{K}\mathbf{C}+\mathbf{D}=0\). If \({{\mathrm{rank}}}(\mathbf{C})<n\), we can take only a set of linearly independent rows of \(\mathbf{C}\) and the corresponding columns of \(\mathbf{K}\), thus making \(\mathbf{C}\) of full row rank with \(\tilde{q}<n\). Therefore, without loss of generality we can assume \(\tilde{q}<n\).
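As an illustration of this row-rank reduction, the following sketch (a hypothetical helper, assuming NumPy/SciPy; not part of the paper) selects a maximal set of linearly independent rows of \(\mathbf{C}\) via a rank-revealing QR factorization:

```python
import numpy as np
from scipy.linalg import qr

def full_row_rank_reduction(C, K=None, tol=1e-10):
    """Keep a maximal linearly independent subset of the rows of C and,
    if K is supplied, the corresponding columns of K (illustrative sketch)."""
    # QR with column pivoting on C' reveals the independent rows of C.
    _, R, piv = qr(C.T, pivoting=True)
    rank = int(np.sum(np.abs(np.diag(R)) > tol))
    keep = np.sort(piv[:rank])
    return C[keep, :], (None if K is None else K[:, keep]), keep
```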

Since strong duality holds, an optimal solution can be found using the HRVW algorithm (\(HRVW\hbox {-}alg\)) based on [13]. \(HRVW\hbox {-}alg\) conducts interior point iterations in order to reduce the duality gap. In order to do so, we have to define a linear operator \(B(\mathbf{Y}):\mathbb {R}^{(n+r)\times (n+r)}\rightarrow \mathbb {R}^{m+r\tilde{q}}\) such that \(B(\mathbf{Y})=[B_1(\mathbf{Y}),\ldots , B_{m+r\tilde{q}}(\mathbf{Y})]\), where \(B_j(\mathbf{Y})=\mathbf{B}_j\circ \mathbf{Y}\). In our case, denoting the index subsets \(J_1=\{1,\ldots ,m\}\), \(J_2=\{m+1,\ldots ,m+r\tilde{q}\}\), we have that:

$$\begin{aligned} \mathbf{B}_i= {\left\{ \begin{array}{ll} \tilde{\varvec{\mathcal {I}}}_i &{} i\in J_1\\ \tilde{\mathcal {\mathbf{C}}}^{kl} &{} i\in J_2, l=\lfloor \frac{i-m-1}{\tilde{q}}\rfloor +1, k=i-m-(l-1)\tilde{q} \end{array}\right. }. \end{aligned}$$
(26)

Its adjoint \(B'(\mathbf{y}):\mathbb {R}^{m+r\tilde{q}}\rightarrow \mathbb {R}^{(n+r)\times (n+r)}\), where \(\mathbf{y}=({\varvec{\mu }},\mathbf{K})\), is then defined as:

$$\begin{aligned} B'(\mathbf{K},{\varvec{\mu }})= \begin{bmatrix} \sum _{i=1}^m \mu _i\varvec{\mathcal {I}}_i&\quad \mathbf{C}'\mathbf{K}'\\ \mathbf{K}\mathbf{C}&0 \end{bmatrix}. \end{aligned}$$
(27)
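To make the operator concrete, here is a small numerical sketch (a hypothetical toy instance, NumPy assumed; not from the paper) that materializes the matrices \(\mathbf{B}_i\) of (26), applies \(B\) and \(B'\), and verifies the adjoint identity \(\langle B(\mathbf{Y}),\mathbf{y}\rangle =\langle \mathbf{Y},B'(\mathbf{y})\rangle \):

```python
import numpy as np

# Hypothetical small instance: n = 4, r = 2, q plays the role of q~.
rng = np.random.default_rng(0)
n, r, q = 4, 2, 3
parts = [np.array([0, 1]), np.array([2, 3])]   # index partition, m = 2
m = len(parts)
C = rng.standard_normal((q, n))

def B_matrices():
    """Build the matrices B_i of (26): padded partition indicators,
    then the matrices C~^{kl} (l outer, k inner, as in (26))."""
    Bs = []
    for p in parts:                       # i in J1
        Ii = np.zeros((n + r, n + r))
        Ii[p, p] = 1.0                    # diagonal indicator of partition i
        Bs.append(Ii)
    for l in range(r):                    # i in J2
        for k in range(q):
            Ckl = np.zeros((n, r))
            Ckl[:, l] = C[k, :]           # column l holds row k of C
            M = np.zeros((n + r, n + r))
            M[:n, n:] = Ckl
            M[n:, :n] = Ckl.T
            Bs.append(M)
    return Bs

Bs = B_matrices()

def B_op(Y):                              # B_j(Y) = B_j o Y (Frobenius product)
    return np.array([np.tensordot(Bi, Y) for Bi in Bs])

def B_adj(y):                             # adjoint: B'(y) = sum_i y_i B_i
    return sum(yi * Bi for yi, Bi in zip(y, Bs))

Y = rng.standard_normal((n + r, n + r)); Y = (Y + Y.T) / 2
y = rng.standard_normal(m + q * r)
assert np.isclose(B_op(Y) @ y, np.tensordot(Y, B_adj(y)))
```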

We can now write the primal and dual problems as:

The algorithm consists of several stages:

[Algorithm 2 (figure)]

We can clearly see that the computational costs consist of inverting \(\varvec{\mathcal {Z}}\), computing \(\varvec{\mathcal {P}}\), calculating \(B(\mathbf{W})\), inverting \(\varvec{\mathcal {P}}\), computing \(B'(\mathbf{y})\), checking whether \(\mathbf{Y}\) and \(\varvec{\mathcal {Z}}\) are positive definite, and finally computing the duality gap. The inversion of \(\varvec{\mathcal {Z}}\), as well as checking whether \(\mathbf{Y}\) and \(\varvec{\mathcal {Z}}\) are positive definite, can be done by Cholesky decomposition and takes \(O((n+r)^3)\) operations. Calculation of \(B(\mathbf{W})\) takes only \(O(n+r\tilde{q}n)\) operations, since \(B_i(\mathbf{W})\) for each \(i\in J_1\) takes \(O(n_i)\) operations and \(B_i(\mathbf{W})\) for each \(i\in J_2\) takes \(O(n)\) operations. Calculation of \(B'(\mathbf{y})\) takes \(O(r\tilde{q}n)\) operations to calculate the off-diagonal blocks and \(O(n)\) to compute the diagonal. Calculating the duality gap takes \(O((n+r)^2)\) operations, and inverting \(\varvec{\mathcal {P}}\) takes \(O((m+r\tilde{q})^3)\). Since \(O(\tilde{q})=O(n)\), all of these operations can be done in \(O(n^3)\). The only potentially problematic operation is computing \(\varvec{\mathcal {P}}\), which consists of computing the matrices \(\mathbf{B}_i\mathbf{W}\) and \(\mathbf{B}_i\mathbf{Y}\) for every \(i\in J_1\bigcup J_2\). Computing \(\mathbf{B}_i\mathbf{W}\) takes up to \(O((n+r)^3)\) operations for each \(i\), and we have \(O(m+r\tilde{q})\) such multiplications, resulting in potentially up to \(O(n^4)\) operations. We will now prove that the special structure of \(\mathbf{B}_i\) ensures that this is not the case, and that \(\varvec{\mathcal {P}}\) can be computed in \(O(n^3)\) as well.
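For instance, the positive-definiteness checks can be carried out by attempting a Cholesky factorization (a minimal sketch, NumPy assumed; the paper does not prescribe a specific implementation):

```python
import numpy as np

def is_positive_definite(M):
    """Cholesky-based PD test, O(dim^3) as stated above."""
    try:
        np.linalg.cholesky((M + M.T) / 2)  # symmetrize against round-off
        return True
    except np.linalg.LinAlgError:
        return False
```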

From the symmetry of \(\mathbf{W}\), \(\mathbf{B}_i\) and \(\mathbf{Y}\) we can conclude that \(\varvec{\mathcal {P}}\) is a symmetric matrix with entries defined as follows.

$$\begin{aligned} {\mathcal {P}}_{ij}={{\mathrm{Tr}}}(\mathbf{B}_i\mathbf{W}\mathbf{B}_j\mathbf{Y})=\mathbf{W}\circ \mathbf{B}_j\mathbf{Y}\mathbf{B}_i={{\mathrm{Tr}}}(\mathbf{W}(\mathbf{B}_j\mathbf{Y}\mathbf{B}_i)')=\mathbf{W}\circ \mathbf{B}_i\mathbf{Y}\mathbf{B}_j. \end{aligned}$$
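This identity is easy to check numerically; the following sketch (hypothetical random data, NumPy assumed) verifies \({{\mathrm{Tr}}}(\mathbf{B}_i\mathbf{W}\mathbf{B}_j\mathbf{Y})=\mathbf{W}\circ \mathbf{B}_i\mathbf{Y}\mathbf{B}_j\) for symmetric matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 6
sym = lambda A: (A + A.T) / 2
W, Y, Bi, Bj = (sym(rng.standard_normal((d, d))) for _ in range(4))

lhs = np.trace(Bi @ W @ Bj @ Y)       # Tr(B_i W B_j Y)
rhs = np.tensordot(W, Bi @ Y @ Bj)    # W o (B_i Y B_j), Frobenius product
assert np.isclose(lhs, rhs)
```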

We will now compute \(\mathbf{B}_j\mathbf{Y}\mathbf{B}_i\) for each \(i\) and \(j\) and calculate the computational complexity. We look at each combination of \(i,j\):

  1. \(i,j\in J_1\): we have that \({\mathcal {P}}_{ij}= \varvec{\mathcal {I}}_i\mathbf{W}^{11}\varvec{\mathcal {I}}_j\circ \varvec{\mathcal {I}}_i{\mathbf{Y}}^{11}\varvec{\mathcal {I}}_j\), which is the inner product of block \(i,j\) of the \((\cdot )^{11}\) block in each of the matrices. So to compute the whole \(m\times m\) sub-matrix \(\{{\mathcal {P}}_{ij}\}_{i,j\in J_1}\) we need \(O(n^2)\) operations.

  2. \(i\in J_1\) and \(j\in J_2\) (or vice versa): denoting \(l=\lfloor \frac{j-m-1}{\tilde{q}}\rfloor +1\), \(k=j-m-(l-1)\tilde{q}\), we have that

    $$\begin{aligned} \tilde{\varvec{\mathcal {I}}}_i\mathbf{Y}\tilde{\varvec{\mathcal {C}}}^{kl}= \begin{bmatrix} \varvec{\mathcal {I}}_i\mathbf{Y}^{11}&\quad \varvec{\mathcal {I}}_i\mathbf{Y}^{12}\\ \mathbf{0}&\quad \mathbf{0}\end{bmatrix} \tilde{\varvec{\mathcal {C}}}^{kl}= \begin{bmatrix} \varvec{\mathcal {I}}_i\mathbf{Y}^{12}(\tilde{\mathbf{C}}^{kl})'&\varvec{\mathcal {I}}_i\mathbf{Y}^{11}\tilde{\mathbf{C}}^{kl}\\ \mathbf{0}&\mathbf{0}\end{bmatrix}. \end{aligned}$$

    The matrix \(\varvec{\mathcal {I}}_i\mathbf{Y}^{11}\tilde{\mathbf{C}}^{kl}\) has only \(n_i\) non-zero elements, in indices corresponding to rows in partition element \(i\) and column \(l\). Each of these elements can be calculated as \(\mathbf{Y}^{11}_{s\cdot }\mathbf{C}_{k\cdot }'\), where \(s\) is in partition element \(i\), and so the total number of operations needed for the computation is \(O(nn_i)\). The matrix \(\varvec{\mathcal {I}}_i\mathbf{Y}^{12}(\tilde{\mathbf{C}}^{kl})'\) has only \(n_i\) non-zero rows of length \(n\) each, where each row can be calculated as the product \(Y^{12}_{sl}\mathbf{C}_{k\cdot }\). Therefore, the total number of operations to compute this matrix is also \(O(nn_i)\). The multiplication with \(\mathbf{W}\) involves multiplying only the non-zero elements and therefore takes \(O(n_i(n+r))\) operations. Consequently, computing \(\{{\mathcal {P}}_{ij}\}_{i\in J_1,j\in J_2}\) requires \(\sum _{i=1}^m\sum _{j=m+1}^{m+r\tilde{q}} O(n_i(n+r))= O(n(n+r)r\tilde{q})= O(n^3)\) operations.

  3. \(i,j\in J_2\): denoting \({l_1=\lfloor \frac{i-m-1}{\tilde{q}}\rfloor +1}\), \({k_1=i-m-(l_1-1)\tilde{q}}\), \({l_2=\lfloor \frac{j-m-1}{\tilde{q}}\rfloor +1}\), and \({k_2=j-m-(l_2-1)\tilde{q}}\), we have that:

    $$\begin{aligned} \tilde{\varvec{\mathcal {C}}}^{k_1l_1}\mathbf{Y}\tilde{\varvec{\mathcal {C}}}^{k_2l_2}&= \begin{bmatrix} \tilde{\mathbf{C}}^{k_1l_1}\mathbf{Y}^{21}&\quad \tilde{\mathbf{C}}^{k_1l_1}\mathbf{Y}^{22} \\ (\tilde{\mathbf{C}}^{k_1l_1})'\mathbf{Y}^{11}&\quad (\tilde{\mathbf{C}}^{k_1l_1})'\mathbf{Y}^{12} \end{bmatrix} \tilde{\varvec{\mathcal {C}}}^{k_2l_2}\\&= \begin{bmatrix} \tilde{\mathbf{C}}^{k_1l_1}\mathbf{Y}^{22}(\tilde{\mathbf{C}}^{k_2l_2})'&\quad \tilde{\mathbf{C}}^{k_1l_1}\mathbf{Y}^{21}\tilde{\mathbf{C}}^{k_2l_2}\\ (\tilde{\mathbf{C}}^{k_1l_1})'\mathbf{Y}^{12}(\tilde{\mathbf{C}}^{k_2l_2})'&\quad (\tilde{\mathbf{C}}^{k_1l_1})'\mathbf{Y}^{11}\tilde{\mathbf{C}}^{k_2l_2} \end{bmatrix}. \end{aligned}$$

    Notice the following:

    • The first matrix \(\tilde{\mathbf{C}}^{k_1l_1}\mathbf{Y}^{22}(\tilde{\mathbf{C}}^{k_2l_2})'=Y^{22}_{l_1l_2}\mathbf{C}_{k_1\cdot }'\mathbf{C}_{k_2\cdot }\).

    • Matrix \((\tilde{\mathbf{C}}^{k_1l_1})'\mathbf{Y}^{11}\tilde{\mathbf{C}}^{k_2l_2}\) is a zero matrix with one non zero element in \((l_1,l_2)\) with value \(\mathbf{C}_{k_1\cdot }\mathbf{Y}^{11}\mathbf{C}_{k_2\cdot }'\).

    • Matrix \(\tilde{\mathbf{C}}^{k_1l_1}\mathbf{Y}^{21}\tilde{\mathbf{C}}^{k_2l_2}\) has non-zero entries only in column \(l_2\), which equals \(\mathbf{C}_{k_2\cdot }'(\mathbf{Y}^{21}_{l_1\cdot }\mathbf{C}_{k_1\cdot }')\). Similarly, we have \((\tilde{\mathbf{C}}^{k_1l_1})'\mathbf{Y}^{12}(\tilde{\mathbf{C}}^{k_2l_2})'=(\tilde{\mathbf{C}}^{k_2l_2}\mathbf{Y}^{21}\tilde{\mathbf{C}}^{k_1l_1})'\), which means that only row \(l_1\) is non-zero, with value \(\mathbf{C}_{k_1\cdot }(\mathbf{Y}^{21}_{l_2\cdot }\mathbf{C}_{k_2\cdot }')\).

    Therefore, from symmetry of \(\mathbf{W}\) the total calculation sums up to:

    $$\begin{aligned} {{\mathrm{Tr}}}\left( \mathbf{W}\tilde{\varvec{\mathcal {C}}}^{k_1l_1}Y\tilde{\varvec{\mathcal {C}}}^{k_2l_2}\right)= & {} Y^{22}_{l_1l_2}(\mathbf{C}_{k_2\cdot }\mathbf{W}^{11}\mathbf{C}_{k_1\cdot }')+ W^{22}_{l_1l_2}(\mathbf{C}_{k_1\cdot }\mathbf{Y}^{11}\mathbf{C}_{k_2\cdot }')\nonumber \\&+ \, (\mathbf{W}^{21}_{l_2\cdot }\mathbf{C}_{k_2\cdot }')(\mathbf{Y}^{21}_{l_1\cdot }\mathbf{C}_{k_1\cdot }') +(\mathbf{W}^{21}_{l_1\cdot }\mathbf{C}_{k_1\cdot }')(\mathbf{Y}^{21}_{l_2\cdot }\mathbf{C}_{k_2\cdot }') \end{aligned}$$
    (28)

    Both \(\mathbf{W}^{11}\mathbf{C}_{k_1\cdot }'\) and \(\mathbf{Y}^{11}\mathbf{C}_{k_1\cdot }'\) can be computed once for every \(k_1\), at a total cost of \(O(n^2\tilde{q})= O(n^3)\), creating \(2\tilde{q}\) vectors of length \(n\). Multiplying each of these vectors by each \(\mathbf{C}_{k_2\cdot }'\) takes \(O(n\tilde{q}^2)= O(n^3)\) operations. Finally, multiplying these \(\tilde{q}^2\) results by the appropriate scalars \(Y^{22}_{l_1l_2}\) and \(W^{22}_{l_1l_2}\) for each combination of \((l_1,l_2)\) takes \(O(r^2\tilde{q}^2)\) operations. Similarly, computing \(\mathbf{W}^{21}_{l\cdot }\mathbf{C}_{k\cdot }'\) and \(\mathbf{Y}^{21}_{l\cdot }\mathbf{C}_{k\cdot }'\) takes a total of \(O(r\tilde{q}n)\) operations for all combinations of \(l\) and \(k\), resulting in \(r\tilde{q}\) scalars each. The multiplication of all combinations of these scalars costs \(O(r^2\tilde{q}^2)=O(n^2)\). Therefore, we can compute \(\{{\mathcal {P}}_{ij}\}_{i,j\in J_2}\) in \(O(n^3)\) operations.

Accordingly, the total computational cost of constructing \(\varvec{\mathcal {P}}\) is \(O(n^3)\).
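As a concrete illustration of case 1 above, the \(J_1\times J_1\) part of \(\varvec{\mathcal {P}}\) can be assembled in \(O(n^2)\) by block-wise inner products (a sketch assuming the partition is given as index arrays; names are illustrative, NumPy assumed):

```python
import numpy as np

def P_J1J1(W11, Y11, parts):
    """Assemble {P_ij}_{i,j in J1}: P_ij is the inner product of
    block (i,j) of W^{11} with block (i,j) of Y^{11} (case 1 above)."""
    m = len(parts)
    P = np.empty((m, m))
    for a, pa in enumerate(parts):
        for b, pb in enumerate(parts):
            P[a, b] = np.sum(W11[np.ix_(pa, pb)] * Y11[np.ix_(pa, pb)])
    return P
```

Each entry of \(\mathbf{W}^{11}\) and \(\mathbf{Y}^{11}\) is touched exactly once, which gives the stated \(O(n^2)\) cost.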

In conclusion, solving problem (DSDR-P) with the interior point Algorithm 2 requires \(O((n+r\tilde{q})^3)\) operations per iteration and \(O\left( \sqrt{n+r}\log \left( \frac{{{\mathrm{Tr}}}(\varvec{\mathcal {Z}}_0\mathbf{Y}_0)}{\epsilon }\right) \right) \) iterations to obtain a duality gap of \(\epsilon \).

Appendix 2: Implementation of HRVW-alg to worst case expectation minimization approximation problem

Consider problem (\(\alpha \)-SDP-EP), where the objective function is divided by the constant \((1+\frac{1}{\alpha })\). We denote

$$\begin{aligned} \varvec{\mathcal {Z}}^1= & {} \begin{bmatrix} \sum \limits _{i=1}^m\mu _i\varvec{\mathcal {I}}_i&\quad \! (\mathbf{K}{\mathbf{C}}^1+{\mathbf{D}}^1)'\\ (\mathbf{K}{\mathbf{C}}^1+{\mathbf{D}}^1)&\quad \! \mathbf{I}\end{bmatrix},\quad \varvec{\mathcal {Z}}^2= \begin{bmatrix} \varvec{\mathcal {U}}&\quad \! (\mathbf{K}{\mathbf{C}}^2+{\mathbf{D}}^2) \\ (\mathbf{K}{\mathbf{C}}^2+{\mathbf{D}}^2)'&\quad \! \mathbf{I}\\ \end{bmatrix},\\ \varvec{\mathcal {Z}}= & {} \begin{bmatrix} \varvec{\mathcal {Z}}^1&\quad \! 0\\ 0&\quad \! \varvec{\mathcal {Z}}^2\\ \end{bmatrix} ,\quad \varvec{\mathcal {Z}}\succeq 0. \end{aligned}$$

Defining the dual variable \(\mathbf{Y}\) to be a block diagonal matrix, with blocks \({\mathbf{Y}}^1\in \mathbb {R}^{({\bar{n}_1+r})\times ({\bar{n}_1+r})}\) and \({\mathbf{Y}}^2\in \mathbb {R}^{({r+\bar{n}_2})\times ({r+\bar{n}_2})}\), which are themselves partitioned into blocks,

$$\begin{aligned} {\mathbf{Y}}= \begin{bmatrix} {\mathbf{Y}}^1&\quad \! 0\\ 0&\quad \!{\mathbf{Y}}^2 \end{bmatrix},\quad {\mathbf{Y}}^1= \begin{bmatrix}{\mathbf{Y}}^1_{[11]}&\quad \! {\mathbf{Y}}^1_{[12]}\\ {\mathbf{Y}}^1_{[21]}&\quad \! {\mathbf{Y}}^1_{[22]} \end{bmatrix}\succeq 0,\quad {\mathbf{Y}}^2= \begin{bmatrix}{\mathbf{Y}}^2_{[11]}&\quad \! {\mathbf{Y}}^2_{[12]}\\ {\mathbf{Y}}^2_{[21]}&\quad \! {\mathbf{Y}}^2_{[22]} \end{bmatrix}\succeq 0, \end{aligned}$$

we can formulate the following dual problem.

(29)

where \(\mathbf{1}_{j}(i)\) is the Dirac measure, and matrices \(\underline{\varvec{\mathcal {I}}}^{ij}\), \(\tilde{\varvec{\mathcal {I}}_i}\) and \(\tilde{\varvec{\mathcal {C}}}^{\ell ij}\), \(\tilde{\mathbf{C}}^{1 ij}\in \mathbb {R}^{\bar{n}_1\times r}\) and \(\tilde{\mathbf{C}}^{2 ij}\in \mathbb {R}^{r \times \bar{n}_2}\) are given by:

$$\begin{aligned} \tilde{\varvec{\mathcal {I}}_i}= & {} \begin{bmatrix} \varvec{\mathcal {I}}_i&\quad \! \mathbf{0}\\ \mathbf{0}&\quad \! \mathbf{0}\end{bmatrix} ,\quad \tilde{\varvec{\mathcal {C}}}^{\ell ij}= \begin{bmatrix} \mathbf{0}&\quad \! \tilde{\mathbf{C}}^{\ell ij}\\ (\tilde{\mathbf{C}}^{\ell ij})'&\quad \! \mathbf{0}\end{bmatrix} ,\quad \ell =1,2\\ \underline{\varvec{\mathcal {I}}}^{ij}_{kl}\!= & {} \! {\left\{ \begin{array}{ll} 1 &{} \quad k=i,\;l=j\\ 0 &{} \quad \text {otherwise} \end{array}\right. } ,\quad \tilde{C}^{1ij}_{kl}= {\left\{ \begin{array}{ll} C^1 _{ik} &{} l=j\\ 0 &{}\text {otherwise} \end{array}\right. } ,\quad \tilde{C}^{2ij}_{kl}= {\left\{ \begin{array}{ll} C^2 _{il} &{} k=j\\ 0 &{}\text {otherwise} \end{array}\right. }. \end{aligned}$$

Both the primal and the dual problems are strictly feasible: take \(\mathbf{K}=0\), \({\varvec{\mu }}=(\Vert \mathbf{D}^1\Vert ^2+1) \mathbf{e}\) and \(\varvec{\mathcal {U}}=(\Vert \mathbf{D}^2\Vert ^2+1)\mathbf{I}\) for the primal, and \(\mathbf{Y}^1_{[11]}=\sum \limits _{i=1}^m \frac{\alpha }{n_i}\varvec{\mathcal {I}}_i\), \(\mathbf{Y}^1_{[22]}=\mathbf{I}\), \(\mathbf{Y}^1_{[12]}=0\), \(\mathbf{Y}^2=\mathbf{I}\) for the dual problem. Therefore strong duality holds, and the interior point method can be utilized.

In order to solve this problem via the interior point method, we need to formulate the linear operator \(B\) and its adjoint \(B'\). To do so, we first define three index sets \(J_1=\{1,\ldots ,m\}\), \(J_2=\{m+1,\ldots ,m+\tilde{q}r\}\) and \(J_3=\{m+\tilde{q}r+1,\ldots , m+\tilde{q}r+r^2\}\). The linear operator can then be written as

$$\begin{aligned} B(Y)=B(\mathbf{Y}^1,\mathbf{Y}^2)=[B_1(\mathbf{Y}^1,\mathbf{Y}^2),\ldots ,B_{m+\tilde{q}r+r^2}(\mathbf{Y}^1,\mathbf{Y}^2)] \end{aligned}$$

where \(B_i(\mathbf{Y}^1,\mathbf{Y}^2)=Tr(\mathbf{B}^1_i\mathbf{Y}^1)+Tr(\mathbf{B}^2_i\mathbf{Y}^2)\) such that

$$\begin{aligned} \mathbf{B}^1_i= & {} {\left\{ \begin{array}{ll} \tilde{\varvec{\mathcal {I}}}_i &{} i\in J_1\\ \tilde{\varvec{\mathcal {C}}}^{1kl} &{} i\in J_2, l=\lfloor \frac{i-m-1}{\tilde{q}}\rfloor +1, k=i-m-(l-1)\tilde{q}\\ 0 &{} i\in J_3\\ \end{array}\right. }.\end{aligned}$$
(30)
$$\begin{aligned} \mathbf{B}^2_i= & {} {\left\{ \begin{array}{ll}0 &{} i\in J_1\\ \tilde{\varvec{\mathcal {C}}}^{2kl} &{} i\in J_2, l=\lfloor \frac{i-m-1}{\tilde{q}}\rfloor +1, k=i-m-(l-1)\tilde{q}\\ \underline{\varvec{\mathcal {I}}}^{kl} &{} i\in J_3, l=\lfloor \frac{i-m-\tilde{q}r-1}{r}\rfloor +1, k=i-m-(\tilde{q}+l-1)r\\ \end{array}\right. }. \end{aligned}$$
(31)

The adjoint operator is \(B'(\mathbf{K},\varvec{\mathcal {U}},{\varvec{\mu }})\equiv \varvec{\mathcal {Z}}\) where \(\varvec{\mathcal {Z}}\) is defined above.
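A minimal sketch of evaluating this block-separable operator, assuming the matrices \(\mathbf{B}^1_i,\mathbf{B}^2_i\) of (30)–(31) have been materialized as dense arrays (zero blocks passed as zero matrices of the appropriate size; illustrative only, since an efficient implementation would exploit their sparsity as analyzed below):

```python
import numpy as np

def apply_B(B1s, B2s, Y1, Y2):
    """B_i(Y1, Y2) = Tr(B1_i Y1) + Tr(B2_i Y2), cf. (30)-(31)."""
    return np.array([np.tensordot(B1, Y1) + np.tensordot(B2, Y2)
                     for B1, B2 in zip(B1s, B2s)])
```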

According to Algorithm 2, the most computationally expensive stages of the interior point algorithm consist of constructing the matrix \(\varvec{\mathcal {P}}\) and inverting it, since all other calculations take at most \(O((\bar{n}_1+\bar{n}_2+2r)^3)=O((n+2r)^3)\) operations. We will show that this construction also takes \(O(n^3)\) operations.

We can separate the construction of the matrix \(\varvec{\mathcal {P}}\) into several parts. For this purpose we first write \(\mathbf{W}=\varvec{\mathcal {Z}}^{-1}\) as a block diagonal matrix with blocks \(\mathbf{W}^1={\varvec{\mathcal {Z}}^1}^{-1}\) and \(\mathbf{W}^2={\varvec{\mathcal {Z}}^2}^{-1}\). An important observation is that the construction of \(\varvec{\mathcal {P}}\) can be separated into two independent calculations:

$$\begin{aligned} {\mathcal {P}}_{ij}=\mathbf{W}\circ \mathbf{B}_j\mathbf{Y}\mathbf{B}_i=\mathbf{W}^1\circ \mathbf{B}^1_j\mathbf{Y}^1\mathbf{B}^1_i+\mathbf{W}^2\circ \mathbf{B}^2_j\mathbf{Y}^2\mathbf{B}^2_i =\mathbf{W}\circ \mathbf{B}_i\mathbf{Y}\mathbf{B}_j={\mathcal {P}}_{ji}. \end{aligned}$$

The first term of the calculation is identical to the terms calculated in “Appendix 1” (or is equal to zero when either \(i\) or \(j\) is in \(J_3\)) and therefore costs \(O(\bar{n}_1^3)\) to calculate for all \(i\), \(j\). Consequently, we will focus on the computational complexity of the second term:

  1. \(i\in J_1\) or \(j\in J_1\): the term is identically zero and therefore need not be computed.

  2. \(i,j\in J_2\): we start by calculating \(\mathbf{B}^2_i\mathbf{Y}^2\mathbf{B}^2_j\). Denoting, as before, \(l_1=\lfloor \frac{i-m-1}{\tilde{q}}\rfloor +1\), \(k_1=i-m-(l_1-1)\tilde{q}\), \(l_2=\lfloor \frac{j-m-1}{\tilde{q}}\rfloor +1\), \(k_2=j-m-(l_2-1)\tilde{q}\), we have that:

    $$\begin{aligned} \tilde{\varvec{\mathcal {C}}}^{2k_1l_1}\mathbf{Y}^2= \begin{bmatrix} \tilde{\mathbf{C}}^{2k_1l_1}\mathbf{Y}^2_{[21]}&\quad \!\tilde{\mathbf{C}}^{2k_1l_1}\mathbf{Y}^2_{[22]}\\ (\tilde{\mathbf{C}}^{2k_1l_1})'\mathbf{Y}^2_{[11]}&\quad \! (\tilde{\mathbf{C}}^{2k_1l_1})'\mathbf{Y}^2_{[12]} \end{bmatrix} \end{aligned}$$

    and consequently

    $$\begin{aligned} \tilde{\varvec{\mathcal {C}}}^{2k_1l_1}\mathbf{Y}^2\tilde{\varvec{\mathcal {C}}}^{2k_2l_2}= \begin{bmatrix} \tilde{\mathbf{C}}^{2k_1l_1}\mathbf{Y}^2_{[22]}(\tilde{\mathbf{C}}^{2k_2l_2})'&\tilde{\mathbf{C}}^{2k_1l_1}\mathbf{Y}^2_{[21]}\tilde{\mathbf{C}}^{2k_2l_2}\\ (\tilde{\mathbf{C}}^{2k_1l_1})'\mathbf{Y}^2_{[12]}(\tilde{\mathbf{C}}^{2k_2l_2})'&\quad \! (\tilde{\mathbf{C}}^{2k_1l_1})'\mathbf{Y}^2_{[11]}\tilde{\mathbf{C}}^{2k_2l_2} \end{bmatrix} \end{aligned}$$

    \(\tilde{\mathbf{C}}^{2k_1l_1}\mathbf{Y}^2_{[22]}(\tilde{\mathbf{C}}^{2k_2l_2})'\) has only one non-zero component, at index \((l_1,l_2)\), with value \(\mathbf{C}^2_{k_1\cdot }\mathbf{Y}^2_{[22]}(\mathbf{C}^2_{k_2\cdot })'\). On the other hand, \((\tilde{\mathbf{C}}^{2k_1l_1})'\mathbf{Y}^2_{[11]}\tilde{\mathbf{C}}^{2k_2l_2}=(\mathbf{Y}^2_{[11]})_{l_1l_2}(\mathbf{C}^2_{k_1\cdot })'{\mathbf{C}}^{2}_{k_2\cdot }\). Furthermore, only row \(l_1\) of the matrix \(\tilde{\mathbf{C}}^{2k_1l_1}\mathbf{Y}^2_{[21]}\tilde{\mathbf{C}}^{2k_2l_2}\) is non-zero, with value \((\mathbf{C}^2_{k_1\cdot }(\mathbf{Y}^2_{[21]})_{\cdot l_2})\mathbf{C}^2_{k_2\cdot }\), and similarly only column \(l_2\) of the matrix \((\tilde{\mathbf{C}}^{2k_1l_1})'\mathbf{Y}^2_{[12]} (\tilde{\mathbf{C}}^{2k_2l_2})'\) is non-zero, with value \((\mathbf{C}^2_{k_1\cdot })'((\mathbf{Y}^2_{[12]})_{l_2 \cdot }(\mathbf{C}^2_{k_2\cdot })')\). The inner product is therefore given by:

    $$\begin{aligned} \begin{aligned} \mathbf{W}^2\circ \tilde{\varvec{\mathcal {C}}}^{2k_1l_1}\mathbf{Y}^2\tilde{\varvec{\mathcal {C}}}^{2k_2l_2}&= (W^2_{[11]})_{l_1l_2}\mathbf{C}^2_{k_1\cdot }\mathbf{Y}^2_{[22]}(\mathbf{C}^2_{k_2\cdot })' +(Y^2_{[11]})_{l_1l_2}\mathbf{C}^2_{k_2\cdot }\mathbf{W}^2_{[22]}(C^2_{k_1\cdot })'\\&\quad +(\mathbf{C}^2_{k_1\cdot }(\mathbf{Y}^2_{[12]})'_{ l_2\cdot })(\mathbf{C}^2_{k_2\cdot }(\mathbf{W}^2_{[12]})'_{l_1\cdot })\\&\quad +((\mathbf{W}^2_{[12]})_{l_1 \cdot }(\mathbf{C}^2_{k_1\cdot })')((\mathbf{Y}^2_{[12]})_{l_2 \cdot }(\mathbf{C}^2_{k_2\cdot })')\\ \end{aligned} \end{aligned}$$

    Calculating the first term for all \(k_1\) takes \(O(\bar{n}_2^2\tilde{q})\) operations, resulting in \(2\tilde{q}\) vectors of dimension \(\bar{n}_2\). Multiplying these vectors with \(\mathbf{C}^2_{k_2\cdot }\) for all values of \(k_2\) takes \(O(\tilde{q}^2\bar{n}_2)\) operations. The last two terms can also be calculated in two stages: first constructing \(2\tilde{q}r\) scalars for each combination of \(k\) and \(l\) by multiplying the two appropriate vectors, at a total cost of \(O(r\tilde{q}\bar{n}_2)\), and then multiplying all combinations of these scalars, costing \(O(\tilde{q}^2r^2)\) operations. The total computational cost for this sub-matrix is therefore \(O(\tilde{q}\bar{n}_2^2+\bar{n}_2\tilde{q}^2+r^2\tilde{q}^2+r\tilde{q}\bar{n}_2)\). Since \(\bar{n}_2\le n\) and \(O(\tilde{q})=O(n)\), this is equivalent to \(O(n^3)\).

  3. \(i\in J_2\), \(j\in J_3\): denoting, as before, \(l_1=\lfloor \frac{i-m-1}{\tilde{q}}\rfloor +1\), \(k_1=i-m-(l_1-1)\tilde{q}\), \(l_2=\lfloor \frac{j-m-\tilde{q}r-1}{r}\rfloor +1\), \(k_2=j-m-(\tilde{q}+l_2-1)r\), so that \(l_1, l_2, k_2\in [r]\) and \(k_1\in [\tilde{q}]\). Notice that only column \(l_2\) of the matrix \(\mathbf{Y}^2\underline{\varvec{\mathcal {I}}}^{k_2l_2}\) is non-zero, with value \(\begin{bmatrix} (\mathbf{Y}^2_{[11]})_{\cdot k_2}; (\mathbf{Y}^2_{[21]})_{\cdot k_2} \end{bmatrix}\), and therefore only the \(l_2\) column of the matrix \(\tilde{\varvec{\mathcal {C}}}^{2k_1l_1}\mathbf{Y}^2\underline{\varvec{\mathcal {I}}}^{k_2l_2}\) is non-zero. Consequently,

    $$\begin{aligned} \mathbf{W}\circ \tilde{\varvec{\mathcal {C}}}^{2k_1l_1}\mathbf{Y}^2\underline{\varvec{\mathcal {I}}}^{k_2l_2}= (W^2_{[11]})_{l_1l_2}\mathbf{C}^2_{k_1\cdot }(\mathbf{Y}^2_{[21]})_{\cdot k_2} + \mathbf{C}^2_{k_1\cdot }(\mathbf{W}^2_{[21]})_{\cdot l_2} (Y^2_{[11]})_{l_1 k_2}. \end{aligned}$$

    Similarly to the previous case, assuming \(O(\tilde{q})=O(n)\), this calculation can be done for all such indices in \(O(r\bar{n}_2\tilde{q}+r^3\tilde{q})= O(n^2)\) operations.

  4. \(i,j\in J_3\): in this case we denote \(l_1=\lfloor \frac{i-m-\tilde{q}r-1}{r}\rfloor +1\), \(k_1=i-m-(\tilde{q}+l_1-1)r\), \(l_2=\lfloor \frac{j-m-\tilde{q}r-1}{r}\rfloor +1\), \(k_2=j-m-(\tilde{q}+l_2-1)r\). According to the previous case we can conclude that:

    $$\begin{aligned} \mathbf{W}\circ \underline{\varvec{\mathcal {I}}}^{k_1l_1}\mathbf{Y}^2\underline{\varvec{\mathcal {I}}}^{k_2l_2} =(W^2_{[11]})_{l_2k_1}(Y^2_{[11]})_{l_1k_2}. \end{aligned}$$

    Obviously, computing this for all relevant indices takes \(O(r^4)\) operations.

Therefore, computing \(\varvec{\mathcal {P}}\) in this case takes \(O(n^3)\) operations.

Consequently, the interior point Algorithm 2 implemented on problem (\(\alpha \)-SDP-EP) takes \(O(n^3)\) operations per iteration and obtains an \(\epsilon \)-accurate solution in \(O(\sqrt{n+r}\log {(\frac{1}{\epsilon })})\) iterations.

Appendix 3: Obtaining a solution to problem (SVE)

Given problem (SVE), we define an additional variable \(\varvec{\mathcal {Z}}\) as:

$$\begin{aligned} \begin{aligned}&\quad \varvec{\mathcal {Z}}^1= \begin{bmatrix} \sum \limits _{i=1}^m\mu _i\varvec{\mathcal {I}}_i&\quad \! \tilde{\mathbf{E}}'\\ \tilde{\mathbf{E}}&\quad \! {\varvec{\varSigma }}\end{bmatrix} ,\quad {\mathcal {Z}}^2=1-\sum \limits _{i=1}^m\mu _i,\quad \varvec{\mathcal {Z}}= \begin{bmatrix} \varvec{\mathcal {Z}}^1&\quad \! \mathbf{0}\\ \mathbf{0}&\quad \! {\mathcal {Z}}^2 \end{bmatrix},\quad \varvec{\mathcal {Z}}\succeq 0,\\ \end{aligned} \end{aligned}$$

Since \({\varvec{\varSigma }}\) is defined to be symmetric, it is represented by \(r(r+1)/2\) variables. Constructing the problem's dual via its Lagrangian, the dual problem is given by:

(32)

Both the primal and the dual problems are strictly feasible, with \({\varvec{\mu }}=\mathbf{e}/n\) and \({\varvec{\varSigma }}=\tilde{\mathbf{E}}'(\sum _i\mu _i \varvec{\mathcal {I}}_i)^{-1}\tilde{\mathbf{E}}+\mathbf{I}\) for the primal, and \(\mathbf{Y}^{11}=\sum _i \varvec{\mathcal {I}}_i/n_i\), \(\mathbf{Y}^{22}=\mathbf{I}\), \(\mathbf{Y}^{12}=\mathbf{0}\), \(\beta =1\) for the dual problem. Therefore, strong duality holds for these problems.

Observe that since \(\mathbf{Y}^{22}=\mathbf{I}\), the PSD constraint on \(\mathbf{Y}\) can be reformulated, using the Schur complement, as \(\varvec{\mathcal {Y}}=(\mathbf{Y}^{11}-\mathbf{Y}^{12}{\mathbf{Y}^{12}}')\succeq 0\). Moreover, any square sub-matrix on the diagonal of \(\varvec{\mathcal {Y}}\) is also PSD, and therefore the first set of constraints can be reformulated as:

$$\begin{aligned} {{\mathrm{Tr}}}({\mathbf{Y}^{12}}'\varvec{\mathcal {I}}_i\mathbf{Y}^{12})=\mathbf{Y}^{12}{\mathbf{Y}^{12}}'\circ \varvec{\mathcal {I}}_i\le \beta . \end{aligned}$$

Defining \(k^i_j\), \(j=1,\ldots , n_i\), to be the \(j\)th index in the index set corresponding to partition element \(i\), we can define the vector \({\varvec{\zeta }}_i=[\zeta _i^1,\ldots ,\zeta _i^{n_i}]'\), where \({\varvec{\zeta }}_i^j={\mathbf{Y}^{12}_{k^i_j\cdot }}\) is the \(k^i_j\)th row of matrix \(\mathbf{Y}^{12}\). Similarly, we define \(\tilde{\mathbf{e}}_i=[\tilde{\mathbf{e}}_i^1;\ldots ;\tilde{\mathbf{e}}_i^{n_i}]\), where \(\tilde{\mathbf{e}}_i^j={\tilde{\mathbf{E}}_{\cdot k^i_j}}\) is the \(k^i_j\)th column of matrix \(\tilde{\mathbf{E}}\). Therefore, problem (32) can be reformulated as follows.

$$\begin{aligned} \begin{aligned}&\max \limits _{\beta \ge 0}\max \limits _{{\varvec{\zeta }}}&-2\sum \limits _{i=1}^m{\varvec{\zeta }}_i'\tilde{\mathbf{e}}_i- \beta \\&s.t.&\quad \Vert {\varvec{\zeta }}_i\Vert ^2\le \beta \quad i=1,\ldots ,m\\ \end{aligned} \end{aligned}$$
(33)

Since \(\Vert \tilde{\mathbf{e}}_i\Vert =\Vert \varvec{\mathcal {I}}_i\tilde{\mathbf{E}}\Vert _F\), where \(\Vert \cdot \Vert _F\) denotes the Frobenius norm, the definition of \(J_0\) can be updated to \(J_0=\{i\in \{1,\ldots ,m\}:\Vert \tilde{\mathbf{e}}_i\Vert \ne 0\}\). For a given \(\beta \) the problem is separable in \({\varvec{\zeta }}_i\), and an optimal solution (not necessarily unique) is known to be

$$\begin{aligned} {\varvec{\zeta }}^*_i={\left\{ \begin{array}{ll} -\sqrt{\beta }\frac{ \tilde{\mathbf{e}}_i}{\Vert \tilde{\mathbf{e}}_i\Vert }&{} i\in J_0\\ \sqrt{\frac{\beta }{n_i}} &{} \text {otherwise} \end{array}\right. }\quad . \end{aligned}$$
(34)

Plugging in this solution, we obtain the following problem in the variable \(\beta \).

$$\begin{aligned} \max \limits _{\beta \ge 0} 2\sqrt{\beta }\sum \limits _{i=1}^m \Vert \tilde{\mathbf{e}}_i\Vert -\beta . \end{aligned}$$

Since \(\tilde{\mathbf{E}}\ne 0\), the maximum is attained at a stationary point, i.e. when \(\beta ={(\sum _{i=1}^m \Vert \tilde{\mathbf{e}}_i\Vert )}^2\), and the optimal objective function value is \((\sum _{i=1}^m \Vert \tilde{\mathbf{e}}_i\Vert )^2\). Notice that if \(\tilde{\mathbf{E}}= 0\) the problem becomes trivial, with solution \({\varvec{\varSigma }}=0\).
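For completeness, writing \(c:=\sum _{i=1}^m \Vert \tilde{\mathbf{e}}_i\Vert \), the stationarity computation behind this value of \(\beta \) is:

$$\begin{aligned} \frac{d}{d\beta }\left( 2\sqrt{\beta }\,c-\beta \right) =\frac{c}{\sqrt{\beta }}-1=0\;\Longrightarrow \;\beta ^\star =c^2,\qquad 2\sqrt{\beta ^\star }\,c-\beta ^\star =c^2. \end{aligned}$$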

In order to retrieve the solution to the primal problem, we first look at the dual of problem (33), which is given by:

(35)

The optimal solution to (35) is given, via the primal-dual connection \(\tilde{\mathbf{e}}_i={\varvec{\zeta }}_i\mu _i\) and the optimal solution (34), by

$$\begin{aligned} \mu _i^\diamond =\frac{\left\| \tilde{\mathbf{e}}_i\right\| }{\sqrt{\beta }}= \frac{\left\| \tilde{\mathbf{e}}_i\right\| }{\sum _i \left\| \tilde{\mathbf{e}}_i\right\| }\quad i=1,\ldots ,m. \end{aligned}$$

Returning to problem (SVE), we obtain \({\varvec{\varSigma }}\succeq \tilde{\mathbf{E}}'(\sum _{i}\mu _i\varvec{\mathcal {I}}_i)^{-1}\tilde{\mathbf{E}}\) provided \(\mu _i=0\; \forall i\notin J_0\), and so \({{\mathrm{Tr}}}({\varvec{\varSigma }})\ge \sum _{i\in J_0}{\Vert \tilde{\mathbf{e}}_i\Vert }^2/{\mu _i}\). Therefore, the optimal \({\varvec{\varSigma }}\) for problem (SVE) is obtained when this inequality is satisfied with equality, and is given by:

$$\begin{aligned} {\varvec{\varSigma }}^\diamond =\left( \sum \limits _{i=1}^m{\Vert \varvec{\mathcal {I}}_i \tilde{\mathbf{E}}\Vert _F}\right) \tilde{\mathbf{E}}'\left( \sum _{i\in J_0} \frac{1}{\Vert \varvec{\mathcal {I}}_i\tilde{\mathbf{E}}\Vert _F}\varvec{\mathcal {I}}_i\right) \tilde{\mathbf{E}} \end{aligned}$$

with an optimal value of \({{\mathrm{Tr}}}({\varvec{\varSigma }}^\diamond )=(\sum _{i=1}^m{\Vert \varvec{\mathcal {I}}_i\tilde{\mathbf{E}}\Vert _F})^2\). The most computationally expensive operation is the multiplication \(\tilde{\mathbf{E}}'\tilde{\mathbf{E}}\), which costs \(O(n^2r^2)\).

In conclusion, solving problem (SVE), including obtaining the optimal \({\varvec{\varSigma }}^\diamond \) takes at most \(O(n^2r^2)\) operations. Given \(r\) is constant and \(n\le S\max _i n_i\) where \(\max _i n_i\) is constant, we obtain computational complexity of \(O(S^2)\).
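Putting the closed-form expressions together, a sketch of the resulting procedure (hypothetical helper names, NumPy assumed; the partition is taken as a list of index arrays):

```python
import numpy as np

def solve_SVE(E, parts, tol=1e-12):
    """Closed-form solution sketch for problem (SVE) from the expressions
    above.  E is the matrix E~; parts is the index partition.
    Returns (Sigma, mu, Tr(Sigma))."""
    norms = np.array([np.linalg.norm(E[p, :]) for p in parts])  # ||I_i E~||_F
    total = norms.sum()
    if total < tol:                      # E~ = 0: trivial problem, Sigma = 0
        r = E.shape[1]
        return np.zeros((r, r)), np.zeros(len(parts)), 0.0
    mu = norms / total                   # mu_i = ||e~_i|| / sum_i ||e~_i||
    D = np.zeros(E.shape[0])             # diagonal of sum_{i in J0} I_i / ||e~_i||
    for p, nrm in zip(parts, norms):
        if nrm > tol:
            D[p] = 1.0 / nrm
    Sigma = total * (E.T * D) @ E        # (sum ||e~_i||) E~' Diag(D) E~
    return Sigma, mu, np.trace(Sigma)    # trace = (sum_i ||e~_i||)^2
```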

Appendix 4: Rolling horizon objective function coefficients

Representing

$$\begin{aligned} \begin{aligned} {\varvec{\delta }}_T&=\mathbf{A}_T^{{\varvec{\omega }}}{\varvec{\omega }}_{T}^{T-S} +\mathbf{A}_T^{\mathbf{Z}}\mathbf{Z}_{T-S}^{T-2S+2}\\&\equiv \mathbf{A}^{{\varvec{\delta }}_{T-S}}{\varvec{\delta }}_{T-S}+\sum _{t=T-S+1}^T \mathbf{A}^{\mathbf{a}_{t}}_T\mathbf{a}_t+\sum _{t=T-S+1}^T \mathbf{A}^{\mathbf{v}_{t}}_T\mathbf{v}_t+\sum _{t=T-2S+2}^{T-S}\mathbf{A}^{\mathbf{z}_{t}}_T\mathbf{z}_t, \end{aligned} \end{aligned}$$

where the coefficient matrices are given by:

$$\begin{aligned} \mathbf{A}_T^{\mathbf{a}_t}\!&=\! {\left\{ \begin{array}{ll} \mathbf{F}\mathbf{A}_{T-1}^{\mathbf{a}_t}\!-\!\mathbf{K}_T^{T} \mathbf{H}\mathbf{F}\mathbf{A}_{T-1}^{\mathbf{a}_t}\!+\!\mathbf{K}^t_T\mathbf{H}\mathbf{G}\!-\!\sum \limits _{\tau =t+1}^{T-1} \mathbf{K}_T^\tau \mathbf{H}\mathbf{F}\mathbf{A}_{\tau -1}^{\mathbf{a}_t} &{}\quad \max (T-S+1,1)\le t<T\\ \left( \mathbf{K}^T_T \mathbf{H}\mathbf{G}-\mathbf{G}\right) &{}\quad t=T,\; T\ge 1\\ 0&{} \quad \text {otherwise}\\ \end{array}\right. } \\ \mathbf{A}_T^{\mathbf{v}_t}\!&=\! {\left\{ \begin{array}{ll} \mathbf{F}\mathbf{A}_{T-1}^{\mathbf{v}_t}\!-\!\mathbf{K}_T^{T}\mathbf{H}\mathbf{F}\mathbf{A}_{T-1}^{\mathbf{v}_t}\!+\!\mathbf{K}^t_T-\sum \limits _{\tau =t+1}^{T-1} \mathbf{K}_T^\tau \mathbf{H}\mathbf{F}\mathbf{A}_{\tau -1}^{\mathbf{v}_t} &{}\qquad \max (T-S+1,1)\le t<T\\ \mathbf{K}^T_T&{}\qquad t=T, \quad T\ge 1 \\ \mathbf{0}&{}\qquad \text {otherwise}\\ \end{array}\right. } \\ \mathbf{A}_T^{{\varvec{\delta }}_{t}}&= {\left\{ \begin{array}{ll} \mathbf{F}\mathbf{A}_{T-1}^{{\varvec{\delta }}_{t}}-\sum \limits _{\tau =t+2}^{T}\mathbf{K}_T^{\tau } \mathbf{H}\mathbf{F}\mathbf{A}_{\tau -1}^{{\varvec{\delta }}_{t}}&{}\qquad \max (T-S,0)\le t< T-1\\ \mathbf{F}-\mathbf{K}_T^{T}\mathbf{H}\mathbf{F}&{} \qquad t=T-1\\ \mathbf{0}&{}\qquad \text {otherwise}\\ \end{array}\right. } \\ \mathbf{L}_T^{\mathbf{z}_{t},j}\!&=\! {\left\{ \begin{array}{ll} \mathbf{K}_T^t\!-\!\sum \limits _{\tau =t+j}^T \mathbf{H}\mathbf{F}\mathbf{L}_{\tau -1}^{\mathbf{z}_t,j} &{}\quad max(T\!-\!S\!+\!1,1)\!\le \! t\!\le \! T\!-\!1,\quad 1\!\le \! j\!\le \! T\!-\!t\\ \mathbf{F}\mathbf{L}_{T-1}^{\mathbf{z}_{t},j}\!-\!\sum \limits _{\tau =max(T-S,t+j)}^T \mathbf{H}\mathbf{F}\mathbf{L}_{\tau -1}^{\mathbf{z}_t,j}&{}\quad max(T\!-\!2S\!+\!2,1)\!\le \! t \!\le \! T\!-\!S,\quad 1\!\le \! j\!\le \! S\!-\!1\\ \mathbf{0}&{}\quad \text {otherwise}\\ \end{array}\right. }\\ \mathbf{A}_T^{\mathbf{z}_{t}}\!&=\! {\left\{ \begin{array}{ll} \mathbf{F}\mathbf{L}_T^{\mathbf{z}_t,j}-\sum \limits _{\tau =T-S+1}^T \mathbf{H}\mathbf{F}\mathbf{L}_{\tau -1}^{\mathbf{z}_t,j} &{}\quad \max (T\!-\!2S\!+\!2,1) \!\le \! t\!\le \! T\!-\!S,\quad j\!=\!T-S-t+1\\ \mathbf{0}&{} \quad \text {otherwise}.\\ \end{array}\right. } \end{aligned}$$

Similarly, we can represent

$$\begin{aligned} \mathbf{z}_T=\hat{\mathbf{A}}_T^{{\varvec{\omega }}}{\varvec{\omega }}_{T}^{T-S}+\hat{\mathbf{A}}_T^{\mathbf{Z}} \mathbf{Z}_{T-S}^{T-2S+2}\equiv \hat{\mathbf{A}}_T^{{\varvec{\delta }}_{T-S}} {\varvec{\delta }}_{T-S}+\sum _{t=T-S+1}^T \left( \hat{\mathbf{A}}^{\mathbf{a}_{t}}_T\mathbf{a}_t+\hat{\mathbf{A}}^{\mathbf{v}_{t}}_T\mathbf{v}_t\right) +\sum _{t=T-2S+2}^{T-S}\hat{\mathbf{A}}^{\mathbf{z}_{t}}_T\mathbf{z}_t, \end{aligned}$$

where:

$$\begin{aligned} \hat{\mathbf{A}}_T^{\mathbf{a}_t}= & {} {\left\{ \begin{array}{ll} -\mathbf{H}\mathbf{F}\mathbf{A}_{T-1}^{\mathbf{a}_t}&{} \quad \max (T-S+1,1)\le t<T\\ \mathbf{H}\mathbf{G}&{} \quad t=T, \quad T\ge 1\\ 0&{} \quad \text {otherwise}\\ \end{array}\right. }\\ \hat{\mathbf{A}}_T^{\mathbf{v}_t}= & {} {\left\{ \begin{array}{ll} -\mathbf{H}\mathbf{F}\mathbf{A}_{T-1}^{\mathbf{v}_t} &{} \quad \max (T-S+1,1)\le t<T\\ \mathbf{I}&{} \quad t=T, \quad T\ge 1 \\ \mathbf{0}&{} \quad \text {otherwise}\\ \end{array}\right. }\\ \hat{\mathbf{A}}_T^{{\varvec{\delta }}_{t}}= & {} {\left\{ \begin{array}{ll} -\mathbf{H}\mathbf{F}\mathbf{A}_{T-1}^{{\varvec{\delta }}_{t}} &{} \quad \max (T-S,0)\le t<T-1\\ -\mathbf{H}\mathbf{F}&{} \quad t=T-1, \quad T\ge 1\\ \mathbf{0}&{} \quad \text {otherwise}\\ \end{array}\right. }\\ \hat{\mathbf{A}}_T^{\mathbf{z}_{t}}= & {} {\left\{ \begin{array}{ll} -\mathbf{H}\mathbf{F}{\mathbf{A}}_{T-1}^{\mathbf{z}_{t}} &{} \quad \max (T-2S+2,1) \le t\le T-S\\ \mathbf{0}&{} \quad \text {otherwise}\\ \end{array}\right. } \end{aligned}$$
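The \(\hat{\mathbf{A}}\) recursions above are cheap to evaluate once the \(\mathbf{A}_{T-1}\) coefficients are stored. The following sketch spells out the update; it is purely illustrative, and the dictionary layout, key names and the function itself are assumptions, not from the paper:

```python
import numpy as np

def hat_coefficients(F, G, H, A_prev, T, S):
    """Sketch of the z_T coefficient recursions above.  A_prev maps keys
    ('a', t), ('v', t), ('delta', t), ('z', t) to the stored A_{T-1}
    coefficient matrices (illustrative storage assumption)."""
    HF = H @ F
    A_hat = {}
    for t in range(max(T - S + 1, 1), T):               # t < T cases
        A_hat[('a', t)] = -HF @ A_prev[('a', t)]
        A_hat[('v', t)] = -HF @ A_prev[('v', t)]
    for t in range(max(T - S, 0), T - 1):               # delta, t < T-1
        A_hat[('delta', t)] = -HF @ A_prev[('delta', t)]
    for t in range(max(T - 2 * S + 2, 1), T - S + 1):   # z, t <= T-S
        A_hat[('z', t)] = -HF @ A_prev[('z', t)]
    if T >= 1:                                          # boundary cases
        A_hat[('a', T)] = H @ G
        A_hat[('v', T)] = np.eye(H.shape[0])
        A_hat[('delta', T - 1)] = -HF
    return A_hat
```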


Cite this article

Shtern, S., Ben-Tal, A. A semi-definite programming approach for robust tracking. Math. Program. 156, 615–656 (2016). https://doi.org/10.1007/s10107-015-0910-5

