
A semi-definite programming approach for robust tracking

  • Full Length Paper
  • Series A
Mathematical Programming

Abstract

Tracking problems are prevalent in present-day GPS and video systems. The problem of target tracking is a specific case of dynamic linear system estimation with additive noise. The most widely used filter for these systems is the Kalman filter (KF). The optimality of the KF and similar Bayesian filters is guaranteed under particular probabilistic assumptions. However, in practice, and specifically in applications such as tracking, these probabilistic assumptions are not realistic; indeed, the system noise is typically bounded and in fact might be adversarial. For such cases, robust estimation approaches, such as \({\mathcal {H}}_\infty \) filtering and set-value estimation, were introduced with the aim of providing filters with guaranteed worst-case performance. In this paper we present an innovative approximated set-value estimator (SVE), obtained by solving a semi-definite programming problem. We demonstrate that our problem is practically tractable even for long time horizons. The framework is extended to the case of partially statistical noise, thus combining the KF and SVE frameworks. A variation of this filter which applies a rolling window approach is developed, achieving a fixed computational cost per iteration and coinciding with the classical SVE when the window size is one. Finally, we present numerical results that show the advantages of this filter when dealing with adversarial noise, and compare the performance of the various robust filters with the KF.


References

  1. Bar-Shalom, Y., Li, X.R.: Multitarget-Multisensor Tracking: Principles and Techniques. YBS Publishing, Storrs (1995)

  2. Bar-Shalom, Y., Li, X.R., Kirubarajan, T.: Estimation with Applications to Tracking and Navigation. Wiley, New York (2001)

  3. Başar, T., Bernhard, P.: H-Infinity Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach. Modern Birkhäuser Classics, 2nd edn. Birkhäuser, Boston (1995)

  4. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)

  5. Ben-Tal, A., El Ghaoui, L., Nemirovski, A.: Robust Optimization. Princeton University Press, Princeton (2009)

  6. Bertsekas, D., Rhodes, I.: Recursive state estimation for a set-membership description of uncertainty. IEEE Trans. Autom. Control 16(2), 117–128 (1971)

  7. Dong, Z., You, Z.: Finite-horizon robust Kalman filtering for uncertain discrete time-varying systems with uncertain-covariance white noises. IEEE Signal Process. Lett. 13(8), 493–496 (2006)

  8. El Ghaoui, L., Calafiore, G.: Robust filtering for discrete-time systems with bounded noise and parametric uncertainty. IEEE Trans. Autom. Control 46(7), 1084–1089 (2001)

  9. Fu, M., de Souza, C.E., Luo, Z.Q.: Finite-horizon robust Kalman filter design. IEEE Trans. Signal Process. 49(9), 2103–2112 (2001)

  10. Garulli, A., Vicino, A., Zappa, G.: Conditional central algorithms for worst case set-membership identification and filtering. IEEE Trans. Autom. Control 45(1), 14–23 (2000)

  11. Hazan, E.: Sparse approximate solutions to semidefinite programs. In: Laber, E., Bornstein, C., Nogueira, L., Faria, L. (eds.) LATIN 2008: Theoretical Informatics. Lecture Notes in Computer Science, vol. 4957, pp. 306–316. Springer, Berlin (2008)

  12. Heffes, H.: The effect of erroneous models on the Kalman filter response. IEEE Trans. Autom. Control 11(3), 541–543 (1966)

  13. Helmberg, C., Rendl, F., Vanderbei, R.J., Wolkowicz, H.: An interior-point method for semidefinite programming. SIAM J. Optim. 6, 342–361 (1996)

  14. Kalman, R.E.: A new approach to linear filtering and prediction problems. J. Basic Eng. Trans. ASME 82(Series D), 35–45 (1960)

  15. Kurzhanski, A.B., Vályi, I.: Ellipsoidal Calculus for Estimation and Control. Systems and Control: Foundations and Applications. Birkhäuser, Boston (1997)

  16. Masreliez, C.: Approximate non-Gaussian filtering with linear state and observation relations. IEEE Trans. Autom. Control 20(1), 107–110 (1975)

  17. Masreliez, C., Martin, R.: Robust Bayesian estimation for the linear model and robustifying the Kalman filter. IEEE Trans. Autom. Control 22(3), 361–371 (1977)

  18. Nagpal, K.M., Khargonekar, P.P.: Filtering and smoothing in an H infinity setting. IEEE Trans. Autom. Control 36(2), 152–166 (1991)

  19. Navarro, J.: A very simple proof of the multivariate Chebyshev's inequality. Commun. Stat. Theory Methods (2013). doi:10.1080/03610926.2013.873135

  20. Nemirovski, A.: Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM J. Optim. 15, 229–251 (2004)

  21. Nesterov, Y.: Semidefinite relaxation and nonconvex quadratic optimization. Optim. Methods Softw. 9(1–3), 141–160 (1998)

  22. Nesterov, Y.: Dual extrapolation and its applications to solving variational inequalities and related problems. Math. Program. 109(2–3), 319–344 (2007)

  23. Nishimura, T.: On the a priori information in sequential estimation problems. IEEE Trans. Autom. Control 11(2), 197–204 (1966)

  24. Sasa, S.: Robustness of a Kalman filter against uncertainties of noise covariances. In: Proceedings of the 1998 American Control Conference, vol. 4, pp. 2344–2348 (1998)

  25. Sayed, A.: A framework for state-space estimation with uncertain models. IEEE Trans. Autom. Control 46(7), 998–1013 (2001)

  26. Schweppe, F.: Recursive state estimation: unknown but bounded errors and system inputs. IEEE Trans. Autom. Control 13(1), 22–28 (1968)

  27. Shaked, U., Theodor, Y.: \({\cal H}_{\infty } \)-optimal estimation: a tutorial. In: Proceedings of the 31st IEEE Conference on Decision and Control, vol. 2, pp. 2278–2286 (1992)

  28. Shtern, S., Ben-Tal, A.: Algorithms for solving nonconvex block constrained quadratic problems. Technion Report IE/OR-2013-01, Technion - Israel Institute of Technology (2013)

  29. Simon, D.: Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches. Wiley-Interscience, Hoboken (2006)

  30. Tsai, C., Kurz, L.: An adaptive robustizing approach to Kalman filtering. Automatica 19(3), 279–288 (1983)

  31. Willems, J.C.: Deterministic least squares filtering. J. Econom. 118(1), 341–373 (2004)

  32. Xie, L., Soh, Y.C., de Souza, C.E.: Robust Kalman filtering for uncertain discrete-time systems. IEEE Trans. Autom. Control 39(6), 1310–1314 (1994)

  33. Yang, F., Wang, Z., Hung, Y.S.: Robust Kalman filtering for discrete time-varying uncertain systems with multiplicative noises. IEEE Trans. Autom. Control 47(7), 1179–1183 (2002)


Acknowledgments

We would like to thank two anonymous referees for their insightful remarks, which helped to improve the presentation of the paper.

Author information

Correspondence to Shimrit Shtern.

Appendices

Appendix 1: Implementation of HRVW-alg to saddle point problem

Consider problem (DSDR-P) with parameters \(\mathbf{C}\in \mathbb {R}^{\tilde{q}\times n}\), \(\mathbf{D}\in \mathbb {R}^{r\times n}\), and \(\varvec{\mathcal {I}}_i\in \mathbb {R}^{n\times n},\; i=1,\ldots ,m\), and decision variables \(\mathbf{K}\in \mathbb {R}^{r\times \tilde{q}}\) and \({\varvec{\mu }}\in \mathbb {R}^m\) (we can neglect the explicit non-negativity constraint on \({\varvec{\mu }}\), since \({\varvec{\mu }}\) must be non-negative in order to satisfy the remaining constraints), where \(r\) is constant and \(O(\tilde{q})=O(m)=O(n)=O(T)\). We add a variable \(\varvec{\mathcal {Z}}\in \mathbb {R}^{(n+r)\times (n+r)}\) such that:

$$\begin{aligned} \varvec{\mathcal {Z}}= \begin{bmatrix} \sum _{i=1}^m \mu _i\varvec{\mathcal {I}}_i&\quad \! \mathbf{D}'+\mathbf{C}'\mathbf{K}'\\ \mathbf{K}\mathbf{C}+\mathbf{D}&\quad \! \mathbf{I}\end{bmatrix},\; \varvec{\mathcal {Z}}\succeq 0. \end{aligned}$$

This problem satisfies the Slater condition for \(\mathbf{K}=\mathbf{0}\) and \(\mu _i\equiv \alpha =\Vert \mathbf{D}\Vert ^2+1\;\forall i=1,\ldots ,m\), since the corresponding \(\varvec{\mathcal {Z}}\) is positive definite, and so the problem is regular.

Defining the dual variable \(\mathbf{Y}\) to be a symmetric matrix such that

$$\begin{aligned} \mathbf{Y}=\begin{bmatrix}\mathbf{Y}^{11}&\quad \mathbf{Y}^{12}\\ {\mathbf{Y}^{21}}&\quad \mathbf{Y}^{22} \end{bmatrix}\succeq 0, \end{aligned}$$

we can then formulate the following dual problem.

(25)

where matrices \(\tilde{\varvec{\mathcal {I}}}_i\) and \(\tilde{\varvec{\mathcal {C}}}^{ij}\) are given by:

$$\begin{aligned} \tilde{\varvec{\mathcal {I}}}_i= \begin{bmatrix}\varvec{\mathcal {I}}_i&\quad \mathbf{0}\\ \mathbf{0}&\mathbf{0}\end{bmatrix} ,\quad \tilde{\varvec{\mathcal {C}}}^{ij}= \begin{bmatrix} \mathbf{0}&\tilde{\mathbf{C}}^{ij}\\ (\tilde{\mathbf{C}}^{ij})'&\mathbf{0}\end{bmatrix},\quad \tilde{\mathbf{C}}^{ij}\in \mathbb {R}^{n\times r}: \tilde{C}^{ij}_{kl}= {\left\{ \begin{array}{ll} C_{ik} &{} \quad l=j\\ 0 &{} \quad \text {otherwise} \end{array}\right. }. \end{aligned}$$

We can see that this problem also satisfies the Slater condition, since strict feasibility holds for \(\mathbf{Y}\) such that \(\mathbf{Y}^{12}=0\), \(\mathbf{Y}^{11}={{\mathrm{Diag}}}(\sum _{i=1}^m \frac{1}{n_i}\varvec{\mathcal {I}}_i\mathbf{e})\) and \(\mathbf{Y}^{22}=\mathbf{I}\). Strict feasibility of both the primal and the dual problem ensures strong duality, and therefore a duality gap of zero.

Notice that the second set of constraints is actually equivalent to \(\mathbf{C}'\cdot \mathbf{Y}=\mathbf{0}\), where \(\mathbf{0}\) here denotes a \(\tilde{q}\times r\) matrix of all zeros. Therefore, if \(\tilde{q}\ge n \) and \({{\mathrm{rank}}}(\mathbf{C})=n\), then \(\mathbf{Y}^{12}=\mathbf{0}\) holds for any feasible solution, the optimal solution satisfies \(\mathbf{Y}^{22}=\mathbf{0}\) for any feasible \(\mathbf{Y}^{11}\), and the problem value is zero. This also means that \({\varvec{\mu }}=\mathbf{0}\) and \(\mathbf{K}\) is simply some solution of the equation \(\mathbf{K}\mathbf{C}+\mathbf{D}=0\). If \({{\mathrm{rank}}}(\mathbf{C})<n\), we can take only a set of linearly independent rows of \(\mathbf{C}\) and the corresponding columns of \(\mathbf{K}\), thus making \(\mathbf{C}\) of full row rank with \(\tilde{q}<n\). Therefore, without loss of generality we can assume \(\tilde{q}<n\).
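As an illustration of this row-rank reduction, the following sketch (a hypothetical helper, assuming NumPy/SciPy; not part of the paper) selects a maximal set of linearly independent rows of \(\mathbf{C}\) via a rank-revealing QR factorization:

```python
import numpy as np
from scipy.linalg import qr

def full_row_rank_reduction(C, K=None, tol=1e-10):
    """Keep a maximal linearly independent subset of the rows of C and,
    if K is supplied, the corresponding columns of K (illustrative sketch)."""
    # QR with column pivoting on C' reveals the independent rows of C.
    _, R, piv = qr(C.T, pivoting=True)
    rank = int(np.sum(np.abs(np.diag(R)) > tol))
    keep = np.sort(piv[:rank])
    return C[keep, :], (None if K is None else K[:, keep]), keep
```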

Since strong duality holds, an optimal solution can be found using the HRVW algorithm (\(HRVW\hbox {-}alg\)) based on [13]. \(HRVW\hbox {-}alg\) conducts interior point iterations in order to reduce the duality gap. In order to do so, we have to define a linear operator \(B(\mathbf{Y}):\mathbb {R}^{(n+r)\times (n+r)}\rightarrow \mathbb {R}^{m+r\tilde{q}}\) such that \(B(\mathbf{Y})=[B_1(\mathbf{Y}),\ldots , B_{m+r\tilde{q}}(\mathbf{Y})]\), where \(B_j(\mathbf{Y})=\mathbf{B}_j\circ \mathbf{Y}\). In our case, denoting the index subsets \(J_1=\{1,\ldots ,m\}\), \(J_2=\{m+1,\ldots ,m+r\tilde{q}\}\), we have that:

$$\begin{aligned} \mathbf{B}_i= {\left\{ \begin{array}{ll} \tilde{\varvec{\mathcal {I}}}_i &{} i\in J_1\\ \tilde{\mathcal {\mathbf{C}}}^{kl} &{} i\in J_2, l=\lfloor \frac{i-m-1}{\tilde{q}}\rfloor +1, k=i-m-(l-1)\tilde{q} \end{array}\right. }. \end{aligned}$$
(26)

Its adjoint \(B'(\mathbf{y}):\mathbb {R}^{m+r\tilde{q}}\rightarrow \mathbb {R}^{(n+r)\times (n+r)}\), where \(\mathbf{y}=({\varvec{\mu }},\mathbf{K})\), is then defined as:

$$\begin{aligned} B'(\mathbf{K},{\varvec{\mu }})= \begin{bmatrix} \sum _{i=1}^m \mu _i\varvec{\mathcal {I}}_i&\quad \mathbf{C}'\mathbf{K}'\\ \mathbf{K}\mathbf{C}&0 \end{bmatrix}. \end{aligned}$$
(27)
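To make the operator concrete, here is a small numerical sketch (a hypothetical toy instance, NumPy assumed; not from the paper) that materializes the matrices \(\mathbf{B}_i\) of (26), applies \(B\) and \(B'\), and verifies the adjoint identity \(\langle B(\mathbf{Y}),\mathbf{y}\rangle =\langle \mathbf{Y},B'(\mathbf{y})\rangle \):

```python
import numpy as np

# Hypothetical small instance: n = 4, r = 2, q plays the role of q~.
rng = np.random.default_rng(0)
n, r, q = 4, 2, 3
parts = [np.array([0, 1]), np.array([2, 3])]   # index partition, m = 2
m = len(parts)
C = rng.standard_normal((q, n))

def B_matrices():
    """Build the matrices B_i of (26): padded partition indicators,
    then the matrices C~^{kl} (l outer, k inner, as in (26))."""
    Bs = []
    for p in parts:                       # i in J1
        Ii = np.zeros((n + r, n + r))
        Ii[p, p] = 1.0                    # diagonal indicator of partition i
        Bs.append(Ii)
    for l in range(r):                    # i in J2
        for k in range(q):
            Ckl = np.zeros((n, r))
            Ckl[:, l] = C[k, :]           # column l holds row k of C
            M = np.zeros((n + r, n + r))
            M[:n, n:] = Ckl
            M[n:, :n] = Ckl.T
            Bs.append(M)
    return Bs

Bs = B_matrices()

def B_op(Y):                              # B_j(Y) = B_j o Y (Frobenius product)
    return np.array([np.tensordot(Bi, Y) for Bi in Bs])

def B_adj(y):                             # adjoint: B'(y) = sum_i y_i B_i
    return sum(yi * Bi for yi, Bi in zip(y, Bs))

Y = rng.standard_normal((n + r, n + r)); Y = (Y + Y.T) / 2
y = rng.standard_normal(m + q * r)
assert np.isclose(B_op(Y) @ y, np.tensordot(Y, B_adj(y)))
```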

We can now write the primal and dual problems as:

The algorithm consists of several stages:

[Algorithm 2 (figure)]

We can clearly see that the computational costs consist of inverting \(\varvec{\mathcal {Z}}\), computing \(\varvec{\mathcal {P}}\), calculating \(B(\mathbf{W})\), inverting \(\varvec{\mathcal {P}}\), computing \(B'(\mathbf{y})\), checking whether \(\mathbf{Y}\) and \(\varvec{\mathcal {Z}}\) are positive definite, and finally computing the duality gap. The inversion of \(\varvec{\mathcal {Z}}\), as well as checking whether \(\mathbf{Y}\) and \(\varvec{\mathcal {Z}}\) are positive definite, can be done by Cholesky decomposition and takes \(O((n+r)^3)\) operations. Calculation of \(B(\mathbf{W})\) takes only \(O(n+r\tilde{q}n)\) operations, since \(B_i(\mathbf{W})\) for each \(i\in J_1\) takes \(O(n_i)\) operations and \(B_i(\mathbf{W})\) for each \(i\in J_2\) takes \(O(n)\) operations. Calculation of \(B'(\mathbf{y})\) takes \(O(r\tilde{q}n)\) operations to calculate the off-diagonal blocks and \(O(n)\) to compute the diagonal. Calculating the duality gap takes \(O((n+r)^2)\) operations, and inverting \(\varvec{\mathcal {P}}\) takes \(O((m+r\tilde{q})^3)\). Since \(O(\tilde{q})=O(n)\), all of these operations can be done in \(O(n^3)\). The only potentially problematic operation is computing \(\varvec{\mathcal {P}}\), which consists of computing the matrices \(\mathbf{B}_i\mathbf{W}\) and \(\mathbf{B}_i\mathbf{Y}\) for every \(i\in J_1\bigcup J_2\). Computing \(\mathbf{B}_i\mathbf{W}\) takes up to \(O((n+r)^3)\) operations for each \(i\), and we have \(O(m+r\tilde{q})\) such multiplications, resulting in potentially up to \(O(n^4)\) operations. We will now prove that the special structure of \(\mathbf{B}_i\) ensures that this is not the case, and that \(\varvec{\mathcal {P}}\) can be computed in \(O(n^3)\) as well.
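For instance, the positive-definiteness checks can be carried out by attempting a Cholesky factorization (a minimal sketch, NumPy assumed; the paper does not prescribe a specific implementation):

```python
import numpy as np

def is_positive_definite(M):
    """Cholesky-based PD test, O(dim^3) as stated above."""
    try:
        np.linalg.cholesky((M + M.T) / 2)  # symmetrize against round-off
        return True
    except np.linalg.LinAlgError:
        return False
```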

From the symmetry of \(\mathbf{W}\), \(\mathbf{B}_i\) and \(\mathbf{Y}\) we can conclude that \(\varvec{\mathcal {P}}\) is a symmetric matrix with entries defined as follows.

$$\begin{aligned} {\mathcal {P}}_{ij}={{\mathrm{Tr}}}(\mathbf{B}_i\mathbf{W}\mathbf{B}_j\mathbf{Y})=\mathbf{W}\circ \mathbf{B}_j\mathbf{Y}\mathbf{B}_i={{\mathrm{Tr}}}(\mathbf{W}(\mathbf{B}_j\mathbf{Y}\mathbf{B}_i)')=\mathbf{W}\circ \mathbf{B}_i\mathbf{Y}\mathbf{B}_j. \end{aligned}$$
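This identity is easy to check numerically; the following sketch (hypothetical random data, NumPy assumed) verifies \({{\mathrm{Tr}}}(\mathbf{B}_i\mathbf{W}\mathbf{B}_j\mathbf{Y})=\mathbf{W}\circ \mathbf{B}_i\mathbf{Y}\mathbf{B}_j\) for symmetric matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 6
sym = lambda A: (A + A.T) / 2
W, Y, Bi, Bj = (sym(rng.standard_normal((d, d))) for _ in range(4))

lhs = np.trace(Bi @ W @ Bj @ Y)       # Tr(B_i W B_j Y)
rhs = np.tensordot(W, Bi @ Y @ Bj)    # W o (B_i Y B_j), Frobenius product
assert np.isclose(lhs, rhs)
```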

We will now compute \(\mathbf{B}_j\mathbf{Y}\mathbf{B}_i\) for each \(i\) and \(j\) and calculate the computational complexity. We look at each combination of \(i,j\):

  1. \(i,j\in J_1\): we have that \({\mathcal {P}}_{ij}= \varvec{\mathcal {I}}_i\mathbf{W}^{11}\varvec{\mathcal {I}}_j\circ \varvec{\mathcal {I}}_i{\mathbf{Y}}^{11}\varvec{\mathcal {I}}_j\), which is the inner product of block \(i,j\) of the \((\cdot )^{11}\) block in each of the matrices. So to compute the whole \(m\times m\) sub-matrix \(\{{\mathcal {P}}_{ij}\}_{i,j\in J_1}\) we need \(O(n^2)\) operations.

  2. \(i\in J_1\) and \(j\in J_2\) (or vice versa): denoting \(l=\lfloor \frac{j-m-1}{\tilde{q}}\rfloor +1\), \(k=j-m-(l-1)\tilde{q}\), we have that

    $$\begin{aligned} \tilde{\varvec{\mathcal {I}}}_i\mathbf{Y}\tilde{\varvec{\mathcal {C}}}^{kl}= \begin{bmatrix} \varvec{\mathcal {I}}_i\mathbf{Y}^{11}&\quad \varvec{\mathcal {I}}_i\mathbf{Y}^{12}\\ \mathbf{0}&\quad \mathbf{0}\end{bmatrix} \tilde{\varvec{\mathcal {C}}}^{kl}= \begin{bmatrix} \varvec{\mathcal {I}}_i\mathbf{Y}^{12}(\tilde{\mathbf{C}}^{kl})'&\varvec{\mathcal {I}}_i\mathbf{Y}^{11}\tilde{\mathbf{C}}^{kl}\\ \mathbf{0}&\mathbf{0}\end{bmatrix}. \end{aligned}$$

    The matrix \(\varvec{\mathcal {I}}_i\mathbf{Y}^{11}\tilde{\mathbf{C}}^{kl}\) has only \(n_i\) non-zero elements, in indices corresponding to rows in partition element \(i\) and column \(l\). Each of these elements can be calculated as \(\mathbf{Y}^{11}_{s\cdot }\mathbf{C}_{k\cdot }'\), where \(s\) is in partition element \(i\), and so the total number of operations needed for the computation is \(O(nn_i)\). The matrix \(\varvec{\mathcal {I}}_i\mathbf{Y}^{12}(\tilde{\mathbf{C}}^{kl})'\) has only \(n_i\) non-zero rows of length \(n\) each, where each row can be calculated as the product \(Y^{12}_{sl}\mathbf{C}_{k\cdot }\). Therefore, the total number of operations to compute this matrix is also \(O(nn_i)\). The multiplication with \(\mathbf{W}\) involves multiplying only the non-zero elements and therefore takes \(O(n_i(n+r))\) operations. Consequently, computing \(\{{\mathcal {P}}_{ij}\}_{i\in J_1,j\in J_2}\) requires \(\sum _{i=1}^m\sum _{j=m+1}^{m+r\tilde{q}} O(n_i(n+r))= O(n(n+r)r\tilde{q})= O(n^3)\) operations.

  3. \(i,j\in J_2\): denoting \({l_1=\lfloor \frac{i-m-1}{\tilde{q}}\rfloor +1}\), \({k_1=i-m-(l_1-1)\tilde{q}}\), \({l_2=\lfloor \frac{j-m-1}{\tilde{q}}\rfloor +1}\), and \({k_2=j-m-(l_2-1)\tilde{q}}\), we have that:

    $$\begin{aligned} \tilde{\varvec{\mathcal {C}}}^{k_1l_1}\mathbf{Y}\tilde{\varvec{\mathcal {C}}}^{k_2l_2}&= \begin{bmatrix} \tilde{\mathbf{C}}^{k_1l_1}\mathbf{Y}^{21}&\quad \tilde{\mathbf{C}}^{k_1l_1}\mathbf{Y}^{22} \\ (\tilde{\mathbf{C}}^{k_1l_1})'\mathbf{Y}^{11}&\quad (\tilde{\mathbf{C}}^{k_1l_1})'\mathbf{Y}^{12} \end{bmatrix} \tilde{\varvec{\mathcal {C}}}^{k_2l_2}\\&= \begin{bmatrix} \tilde{\mathbf{C}}^{k_1l_1}\mathbf{Y}^{22}(\tilde{\mathbf{C}}^{k_2l_2})'&\quad \tilde{\mathbf{C}}^{k_1l_1}\mathbf{Y}^{21}\tilde{\mathbf{C}}^{k_2l_2}\\ (\tilde{\mathbf{C}}^{k_1l_1})'\mathbf{Y}^{12}(\tilde{\mathbf{C}}^{k_2l_2})'&\quad (\tilde{\mathbf{C}}^{k_1l_1})'\mathbf{Y}^{11}\tilde{\mathbf{C}}^{k_2l_2} \end{bmatrix}. \end{aligned}$$

    Notice the following:

    • The first matrix \(\tilde{\mathbf{C}}^{k_1l_1}\mathbf{Y}^{22}(\tilde{\mathbf{C}}^{k_2l_2})'=Y^{22}_{l_1l_2}\mathbf{C}_{k_1\cdot }'\mathbf{C}_{k_2\cdot }\).

    • Matrix \((\tilde{\mathbf{C}}^{k_1l_1})'\mathbf{Y}^{11}\tilde{\mathbf{C}}^{k_2l_2}\) is a zero matrix with one non zero element in \((l_1,l_2)\) with value \(\mathbf{C}_{k_1\cdot }\mathbf{Y}^{11}\mathbf{C}_{k_2\cdot }'\).

    • Matrix \(\tilde{\mathbf{C}}^{k_1l_1}\mathbf{Y}^{21}\tilde{\mathbf{C}}^{k_2l_2}\) has non-zero entries only in column \(l_2\), which equals \(\mathbf{C}_{k_2\cdot }'(\mathbf{Y}^{21}_{l_1\cdot }\mathbf{C}_{k_1\cdot }')\). Similarly, we have \((\tilde{\mathbf{C}}^{k_1l_1})'\mathbf{Y}^{12}(\tilde{\mathbf{C}}^{k_2l_2})'=(\tilde{\mathbf{C}}^{k_2l_2}\mathbf{Y}^{21}\tilde{\mathbf{C}}^{k_1l_1})'\), which means that only row \(l_1\) is non-zero, with value \(\mathbf{C}_{k_1\cdot }(\mathbf{Y}^{21}_{l_2\cdot }\mathbf{C}_{k_2\cdot }')\).

    Therefore, from symmetry of \(\mathbf{W}\) the total calculation sums up to:

    $$\begin{aligned} {{\mathrm{Tr}}}\left( \mathbf{W}\tilde{\varvec{\mathcal {C}}}^{k_1l_1}Y\tilde{\varvec{\mathcal {C}}}^{k_2l_2}\right)= & {} Y^{22}_{l_1l_2}(\mathbf{C}_{k_2\cdot }\mathbf{W}^{11}\mathbf{C}_{k_1\cdot }')+ W^{22}_{l_1l_2}(\mathbf{C}_{k_1\cdot }\mathbf{Y}^{11}\mathbf{C}_{k_2\cdot }')\nonumber \\&+ \, (\mathbf{W}^{21}_{l_2\cdot }\mathbf{C}_{k_2\cdot }')(\mathbf{Y}^{21}_{l_1\cdot }\mathbf{C}_{k_1\cdot }') +(\mathbf{W}^{21}_{l_1\cdot }\mathbf{C}_{k_1\cdot }')(\mathbf{Y}^{21}_{l_2\cdot }\mathbf{C}_{k_2\cdot }') \end{aligned}$$
    (28)

    Both \(\mathbf{W}^{11}\mathbf{C}_{k_1\cdot }'\) and \(\mathbf{Y}^{11}\mathbf{C}_{k_1\cdot }'\) can be computed once for every \(k_1\), at a total cost of \(O(n^2\tilde{q})= O(n^3)\), creating \(2\tilde{q}\) vectors of length \(n\). Multiplying each of these vectors by each \(\mathbf{C}_{k_2\cdot }'\) takes \(O(n\tilde{q}^2)= O(n^3)\) operations. Finally, multiplying these \(\tilde{q}^2\) results by the appropriate scalars \(Y^{22}_{l_1l_2}\) and \(W^{22}_{l_1l_2}\) for each combination of \((l_1,l_2)\) takes \(O(r^2\tilde{q}^2)\) operations. Similarly, computing \(\mathbf{W}^{21}_{l\cdot }\mathbf{C}_{k\cdot }'\) and \(\mathbf{Y}^{21}_{l\cdot }\mathbf{C}_{k\cdot }'\) takes a total of \(O(r\tilde{q}n)\) operations for all combinations of \(l\) and \(k\), resulting in \(r\tilde{q}\) scalars each. The multiplication of all combinations of these scalars costs \(O(r^2\tilde{q}^2)=O(n^2)\). Therefore, we can compute \(\{{\mathcal {P}}_{ij}\}_{i,j\in J_2}\) in \(O(n^3)\) operations.

Accordingly, the total computational cost of constructing \(\varvec{\mathcal {P}}\) is \(O(n^3)\).
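As a concrete illustration of case 1 above, the \(J_1\times J_1\) part of \(\varvec{\mathcal {P}}\) can be assembled in \(O(n^2)\) by block-wise inner products (a sketch assuming the partition is given as index arrays; names are illustrative, NumPy assumed):

```python
import numpy as np

def P_J1J1(W11, Y11, parts):
    """Assemble {P_ij}_{i,j in J1}: P_ij is the inner product of
    block (i,j) of W^{11} with block (i,j) of Y^{11} (case 1 above)."""
    m = len(parts)
    P = np.empty((m, m))
    for a, pa in enumerate(parts):
        for b, pb in enumerate(parts):
            P[a, b] = np.sum(W11[np.ix_(pa, pb)] * Y11[np.ix_(pa, pb)])
    return P
```

Each entry of \(\mathbf{W}^{11}\) and \(\mathbf{Y}^{11}\) is touched exactly once, which gives the stated \(O(n^2)\) cost.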

In conclusion, solving problem (DSDR-P) with the interior point Algorithm 2 requires \(O((n+r\tilde{q})^3)\) operations per iteration and \(O\left( \sqrt{n+r}\log \left( \frac{{{\mathrm{Tr}}}(\varvec{\mathcal {Z}}_0\mathbf{Y}_0)}{\epsilon }\right) \right) \) iterations to obtain a duality gap of \(\epsilon \).

Appendix 2: Implementation of HRVW-alg to worst case expectation minimization approximation problem

Consider problem (\(\alpha \)-SDP-EP), where the objective function is divided by the constant \((1+\frac{1}{\alpha })\). We denote

$$\begin{aligned} \varvec{\mathcal {Z}}^1= & {} \begin{bmatrix} \sum \limits _{i=1}^m\mu _i\varvec{\mathcal {I}}_i&\quad \! (\mathbf{K}{\mathbf{C}}^1+{\mathbf{D}}^1)'\\ (\mathbf{K}{\mathbf{C}}^1+{\mathbf{D}}^1)&\quad \! \mathbf{I}\end{bmatrix},\quad \varvec{\mathcal {Z}}^2= \begin{bmatrix} \varvec{\mathcal {U}}&\quad \! (\mathbf{K}{\mathbf{C}}^2+{\mathbf{D}}^2) \\ (\mathbf{K}{\mathbf{C}}^2+{\mathbf{D}}^2)'&\quad \! \mathbf{I}\\ \end{bmatrix},\\ \varvec{\mathcal {Z}}= & {} \begin{bmatrix} \varvec{\mathcal {Z}}^1&\quad \! 0\\ 0&\quad \! \varvec{\mathcal {Z}}^2\\ \end{bmatrix} ,\quad \varvec{\mathcal {Z}}\succeq 0. \end{aligned}$$

Defining the dual variable \(\mathbf{Y}\) to be a block diagonal matrix, with blocks \({\mathbf{Y}}^1\in \mathbb {R}^{({\bar{n}_1+r})\times ({\bar{n}_1+r})}\) and \({\mathbf{Y}}^2\in \mathbb {R}^{({r+\bar{n}_2})\times ({r+\bar{n}_2})}\), which are themselves partitioned into blocks,

$$\begin{aligned} {\mathbf{Y}}= \begin{bmatrix} {\mathbf{Y}}^1&\quad \! 0\\ 0&\quad \!{\mathbf{Y}}^2 \end{bmatrix},\quad {\mathbf{Y}}^1= \begin{bmatrix}{\mathbf{Y}}^1_{[11]}&\quad \! {\mathbf{Y}}^1_{[12]}\\ {\mathbf{Y}}^1_{[21]}&\quad \! {\mathbf{Y}}^1_{[22]} \end{bmatrix}\succeq 0,\quad {\mathbf{Y}}^2= \begin{bmatrix}{\mathbf{Y}}^2_{[11]}&\quad \! {\mathbf{Y}}^2_{[12]}\\ {\mathbf{Y}}^2_{[21]}&\quad \! {\mathbf{Y}}^2_{[22]} \end{bmatrix}\succeq 0, \end{aligned}$$

we can formulate the following dual problem.

(29)

where \(\mathbf{1}_{j}(i)\) is the Dirac measure, and matrices \(\underline{\varvec{\mathcal {I}}}^{ij}\), \(\tilde{\varvec{\mathcal {I}}_i}\) and \(\tilde{\varvec{\mathcal {C}}}^{\ell ij}\), \(\tilde{\mathbf{C}}^{1 ij}\in \mathbb {R}^{\bar{n}_1\times r}\) and \(\tilde{\mathbf{C}}^{2 ij}\in \mathbb {R}^{r \times \bar{n}_2}\) are given by:

$$\begin{aligned} \tilde{\varvec{\mathcal {I}}_i}= & {} \begin{bmatrix} \varvec{\mathcal {I}}_i&\quad \! \mathbf{0}\\ \mathbf{0}&\quad \! \mathbf{0}\end{bmatrix} ,\quad \tilde{\varvec{\mathcal {C}}}^{\ell ij}= \begin{bmatrix} \mathbf{0}&\quad \! \tilde{\mathbf{C}}^{\ell ij}\\ (\tilde{\mathbf{C}}^{\ell ij})'&\quad \! \mathbf{0}\end{bmatrix} ,\quad \ell =1,2\\ \underline{\varvec{\mathcal {I}}}^{ij}_{kl}\!= & {} \! {\left\{ \begin{array}{ll} 1 &{} \quad k=i,\;l=j\\ 0 &{} \quad \text {otherwise} \end{array}\right. } ,\quad \tilde{C}^{1ij}_{kl}= {\left\{ \begin{array}{ll} C^1 _{ik} &{} l=j\\ 0 &{}\text {otherwise} \end{array}\right. } ,\quad \tilde{C}^{2ij}_{kl}= {\left\{ \begin{array}{ll} C^2 _{il} &{} k=j\\ 0 &{}\text {otherwise} \end{array}\right. }. \end{aligned}$$

Both the primal and the dual problems are strictly feasible: take \(\mathbf{K}=0\), \({\varvec{\mu }}=(\Vert \mathbf{D}^1\Vert ^2+1) \mathbf{e}\) and \(\varvec{\mathcal {U}}=(\Vert \mathbf{D}^2\Vert ^2+1)\mathbf{I}\) for the primal, and \(\mathbf{Y}^1_{[11]}=\sum \limits _{i=1}^m \frac{\alpha }{n_i}\varvec{\mathcal {I}}_i\), \(\mathbf{Y}^1_{[22]}=\mathbf{I}\), \(\mathbf{Y}^1_{[12]}=0\), \(\mathbf{Y}^2=\mathbf{I}\) for the dual problem. Therefore strong duality holds, and the interior point method can be utilized.

In order to solve this problem via the interior point method, we need to formulate the linear operator \(B\) and its adjoint \(B'\). To do so, we first define three index sets \(J_1=\{1,\ldots ,m\}\), \(J_2=\{m+1,\ldots ,m+\tilde{q}r\}\) and \(J_3=\{m+\tilde{q}r+1,\ldots , m+\tilde{q}r+r^2\}\). The linear operator can then be written as

$$\begin{aligned} B(Y)=B(\mathbf{Y}^1,\mathbf{Y}^2)=[B_1(\mathbf{Y}^1,\mathbf{Y}^2),\ldots ,B_{m+\tilde{q}r+r^2}(\mathbf{Y}^1,\mathbf{Y}^2)] \end{aligned}$$

where \(B_i(\mathbf{Y}^1,\mathbf{Y}^2)=Tr(\mathbf{B}^1_i\mathbf{Y}^1)+Tr(\mathbf{B}^2_i\mathbf{Y}^2)\) such that

$$\begin{aligned} \mathbf{B}^1_i= & {} {\left\{ \begin{array}{ll} \tilde{\varvec{\mathcal {I}}}_i &{} i\in J_1\\ \tilde{\varvec{\mathcal {C}}}^{1kl} &{} i\in J_2, l=\lfloor \frac{i-m-1}{\tilde{q}}\rfloor +1, k=i-m-(l-1)\tilde{q}\\ 0 &{} i\in J_3\\ \end{array}\right. }.\end{aligned}$$
(30)
$$\begin{aligned} \mathbf{B}^2_i= & {} {\left\{ \begin{array}{ll}0 &{} i\in J_1\\ \tilde{\varvec{\mathcal {C}}}^{2kl} &{} i\in J_2, l=\lfloor \frac{i-m-1}{\tilde{q}}\rfloor +1, k=i-m-(l-1)\tilde{q}\\ \underline{\varvec{\mathcal {I}}}^{kl} &{} i\in J_3, l=\lfloor \frac{i-m-\tilde{q}r-1}{r}\rfloor +1, k=i-m-(\tilde{q}+l-1)r\\ \end{array}\right. }. \end{aligned}$$
(31)

The adjoint operator is \(B'(\mathbf{K},\varvec{\mathcal {U}},{\varvec{\mu }})\equiv \varvec{\mathcal {Z}}\) where \(\varvec{\mathcal {Z}}\) is defined above.
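A minimal sketch of evaluating this block-separable operator, assuming the matrices \(\mathbf{B}^1_i,\mathbf{B}^2_i\) of (30)–(31) have been materialized as dense arrays (zero blocks passed as zero matrices of the appropriate size; illustrative only, since an efficient implementation would exploit their sparsity as analyzed below):

```python
import numpy as np

def apply_B(B1s, B2s, Y1, Y2):
    """B_i(Y1, Y2) = Tr(B1_i Y1) + Tr(B2_i Y2), cf. (30)-(31)."""
    return np.array([np.tensordot(B1, Y1) + np.tensordot(B2, Y2)
                     for B1, B2 in zip(B1s, B2s)])
```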

According to Algorithm 2, the most computationally expensive stages of the interior point algorithm consist of constructing the matrix \(\varvec{\mathcal {P}}\) and inverting it, since all other calculations take at most \(O((\bar{n}_1+\bar{n}_2+2r)^3)=O((n+2r)^3)\) operations. We will show that this construction also takes \(O(n^3)\) operations.

We can separate the construction of the matrix \(\varvec{\mathcal {P}}\) into several parts. For this purpose we first write \(\mathbf{W}=\varvec{\mathcal {Z}}^{-1}\) as a block diagonal matrix with blocks \(\mathbf{W}^1={\varvec{\mathcal {Z}}^1}^{-1}\) and \(\mathbf{W}^2={\varvec{\mathcal {Z}}^2}^{-1}\). An important observation is that the construction of \(\varvec{\mathcal {P}}\) can be separated into two independent calculations:

$$\begin{aligned} {\mathcal {P}}_{ij}=\mathbf{W}\circ \mathbf{B}_j\mathbf{Y}\mathbf{B}_i=\mathbf{W}^1\circ \mathbf{B}^1_j\mathbf{Y}^1\mathbf{B}^1_i+\mathbf{W}^2\circ \mathbf{B}^2_j\mathbf{Y}^2\mathbf{B}^2_i =\mathbf{W}\circ \mathbf{B}_i\mathbf{Y}\mathbf{B}_j={\mathcal {P}}_{ji}. \end{aligned}$$

The first term of the calculation is identical to the terms calculated in “Appendix 1” (or is equal to zero when either \(i\) or \(j\) is in \(J_3\)) and therefore costs \(O(\bar{n}_1^3)\) to calculate for all \(i\), \(j\). Consequently, we will focus on the computational complexity of the second term:

  1. \(i\in J_1\) or \(j\in J_1\): the term is identically zero and therefore need not be computed.

  2. \(i,j\in J_2\): we start by calculating \(\mathbf{B}^2_i\mathbf{Y}^2\mathbf{B}^2_j\). Denoting, as before, \(l_1=\lfloor \frac{i-m-1}{\tilde{q}}\rfloor +1\), \(k_1=i-m-(l_1-1)\tilde{q}\), \(l_2=\lfloor \frac{j-m-1}{\tilde{q}}\rfloor +1\), \(k_2=j-m-(l_2-1)\tilde{q}\), we have that:

    $$\begin{aligned} \tilde{\varvec{\mathcal {C}}}^{2k_1l_1}\mathbf{Y}^2= \begin{bmatrix} \tilde{\mathbf{C}}^{2k_1l_1}\mathbf{Y}^2_{[21]}&\quad \!\tilde{\mathbf{C}}^{2k_1l_1}\mathbf{Y}^2_{[22]}\\ (\tilde{\mathbf{C}}^{2k_1l_1})'\mathbf{Y}^2_{[11]}&\quad \! (\tilde{\mathbf{C}}^{2k_1l_1})'\mathbf{Y}^2_{[12]} \end{bmatrix} \end{aligned}$$

    and consequently

    $$\begin{aligned} \tilde{\varvec{\mathcal {C}}}^{2k_1l_1}\mathbf{Y}^2\tilde{\varvec{\mathcal {C}}}^{2k_2l_2}= \begin{bmatrix} \tilde{\mathbf{C}}^{2k_1l_1}\mathbf{Y}^2_{[22]}(\tilde{\mathbf{C}}^{2k_2l_2})'&\tilde{\mathbf{C}}^{2k_1l_1}\mathbf{Y}^2_{[21]}\tilde{\mathbf{C}}^{2k_2l_2}\\ (\tilde{\mathbf{C}}^{2k_1l_1})'\mathbf{Y}^2_{[12]}(\tilde{\mathbf{C}}^{2k_2l_2})'&\quad \! (\tilde{\mathbf{C}}^{2k_1l_1})'\mathbf{Y}^2_{[11]}\tilde{\mathbf{C}}^{2k_2l_2} \end{bmatrix} \end{aligned}$$

    \(\tilde{\mathbf{C}}^{2k_1l_1}\mathbf{Y}^2_{[22]}(\tilde{\mathbf{C}}^{2k_2l_2})'\) has only one non-zero component, at index \((l_1,l_2)\), with value \(\mathbf{C}^2_{k_1\cdot }\mathbf{Y}^2_{[22]}(\mathbf{C}^2_{k_2\cdot })'\). On the other hand, \((\tilde{\mathbf{C}}^{2k_1l_1})'\mathbf{Y}^2_{[11]}\tilde{\mathbf{C}}^{2k_2l_2}=(\mathbf{Y}^2_{[11]})_{l_1l_2}(\mathbf{C}^2_{k_1\cdot })'{\mathbf{C}}^{2}_{k_2\cdot }\). Furthermore, only row \(l_1\) of the matrix \(\tilde{\mathbf{C}}^{2k_1l_1}\mathbf{Y}^2_{[21]}\tilde{\mathbf{C}}^{2k_2l_2}\) is non-zero, with value \((\mathbf{C}^2_{k_1\cdot }(\mathbf{Y}^2_{[21]})_{\cdot l_2})\mathbf{C}^2_{k_2\cdot }\), and similarly only column \(l_2\) of the matrix \((\tilde{\mathbf{C}}^{2k_1l_1})'\mathbf{Y}^2_{[12]} (\tilde{\mathbf{C}}^{2k_2l_2})'\) is non-zero, with value \((\mathbf{C}^2_{k_1\cdot })'((\mathbf{Y}^2_{[12]})_{l_2 \cdot }(\mathbf{C}^2_{k_2\cdot })')\). The inner product is therefore given by:

    $$\begin{aligned} \begin{aligned} \mathbf{W}^2\circ \tilde{\varvec{\mathcal {C}}}^{2k_1l_1}\mathbf{Y}^2\tilde{\varvec{\mathcal {C}}}^{2k_2l_2}&= (W^2_{[11]})_{l_1l_2}\mathbf{C}^2_{k_1\cdot }\mathbf{Y}^2_{[22]}(\mathbf{C}^2_{k_2\cdot })' +(Y^2_{[11]})_{l_1l_2}\mathbf{C}^2_{k_2\cdot }\mathbf{W}^2_{[22]}(C^2_{k_1\cdot })'\\&\quad +(\mathbf{C}^2_{k_1\cdot }(\mathbf{Y}^2_{[12]})'_{ l_2\cdot })(\mathbf{C}^2_{k_2\cdot }(\mathbf{W}^2_{[12]})'_{l_1\cdot })\\&\quad +((\mathbf{W}^2_{[12]})_{l_1 \cdot }(\mathbf{C}^2_{k_1\cdot })')((\mathbf{Y}^2_{[12]})_{l_2 \cdot }(\mathbf{C}^2_{k_2\cdot })')\\ \end{aligned} \end{aligned}$$

    Calculating the first term for all \(k_1\) takes \(O(\bar{n}_2^2\tilde{q})\) operations, resulting in \(2\tilde{q}\) vectors of dimension \(\bar{n}_2\). Multiplying these vectors with \(\mathbf{C}^2_{k_2\cdot }\) for all values of \(k_2\) takes \(O(\tilde{q}^2\bar{n}_2)\) operations. The last two terms can also be calculated in two stages: first constructing \(2\tilde{q}r\) scalars for each combination of \(k\) and \(l\) by multiplying the two appropriate vectors, at a total cost of \(O(r\tilde{q}\bar{n}_2)\), and then multiplying all combinations of these scalars, costing \(O(\tilde{q}^2r^2)\) operations. The total computational cost for this sub-matrix is therefore \(O(\tilde{q}\bar{n}_2^2+\bar{n}_2\tilde{q}^2+r^2\tilde{q}^2+r\tilde{q}\bar{n}_2)\). Since \(\bar{n}_2\le n\) and \(O(\tilde{q})=O(n)\), this is equivalent to \(O(n^3)\).

  3. \(i\in J_2\), \(j\in J_3\): denoting, as before, \(l_1=\lfloor \frac{i-m-1}{\tilde{q}}\rfloor +1\), \(k_1=i-m-(l_1-1)\tilde{q}\), \(l_2=\lfloor \frac{j-m-\tilde{q}r-1}{r}\rfloor +1\), \(k_2=j-m-(\tilde{q}+l_2-1)r\), so that \(l_1, l_2, k_2\in [r]\) and \(k_1\in [\tilde{q}]\). Notice that only column \(l_2\) of the matrix \(\mathbf{Y}^2\underline{\varvec{\mathcal {I}}}^{k_2l_2}\) is non-zero, with value \(\begin{bmatrix} (\mathbf{Y}^2_{[11]})_{\cdot k_2}; (\mathbf{Y}^2_{[21]})_{\cdot k_2} \end{bmatrix}\), and therefore only the \(l_2\) column of the matrix \(\tilde{\varvec{\mathcal {C}}}^{2k_1l_1}\mathbf{Y}^2\underline{\varvec{\mathcal {I}}}^{k_2l_2}\) is non-zero. Consequently,

    $$\begin{aligned} \mathbf{W}\circ \tilde{\varvec{\mathcal {C}}}^{2k_1l_1}\mathbf{Y}^2\underline{\varvec{\mathcal {I}}}^{k_2l_2}= (W^2_{[11]})_{l_1l_2}\mathbf{C}^2_{k_1\cdot }(\mathbf{Y}^2_{[21]})_{\cdot k_2} + \mathbf{C}^2_{k_1\cdot }(\mathbf{W}^2_{[21]})_{\cdot l_2} (Y^2_{[11]})_{l_1 k_2}. \end{aligned}$$

    Similarly to the previous case, assuming \(O(\tilde{q})=O(n)\), this calculation can be done for all such indices in \(O(r\bar{n}_2\tilde{q}+r^3\tilde{q})= O(n^2)\) operations.

  4. \(i,j\in J_3\): in this case we denote \(l_1=\lfloor \frac{i-m-\tilde{q}r-1}{r}\rfloor +1\), \(k_1=i-m-(\tilde{q}+l_1-1)r\), \(l_2=\lfloor \frac{j-m-\tilde{q}r-1}{r}\rfloor +1\), \(k_2=j-m-(\tilde{q}+l_2-1)r\). According to the previous case we can conclude that:

    $$\begin{aligned} \mathbf{W}\circ \underline{\varvec{\mathcal {I}}}^{k_1l_1}\mathbf{Y}^2\underline{\varvec{\mathcal {I}}}^{k_2l_2} =(W^2_{[11]})_{l_2k_1}(Y^2_{[11]})_{l_1k_2}. \end{aligned}$$

    Obviously, computing this for all relevant indices takes \(O(r^4)\) operations.

Therefore, computing \(\varvec{\mathcal {P}}\) in this case takes \(O(n^3)\) operations.

Consequently, the interior point Algorithm 2 implemented on problem (\(\alpha \)-SDP-EP) takes \(O(n^3)\) operations per iteration and obtains an \(\epsilon \)-accurate solution in \(O(\sqrt{n+r}\log {(\frac{1}{\epsilon })})\) iterations.

Appendix 3: Obtaining a solution to problem (SVE)

Given problem (SVE), we define an additional variable \(\varvec{\mathcal {Z}}\) as:

$$\begin{aligned} \begin{aligned}&\quad \varvec{\mathcal {Z}}^1= \begin{bmatrix} \sum \limits _{i=1}^m\mu _i\varvec{\mathcal {I}}_i&\quad \! \tilde{\mathbf{E}}'\\ \tilde{\mathbf{E}}&\quad \! {\varvec{\varSigma }}\end{bmatrix} ,\quad {\mathcal {Z}}^2=1-\sum \limits _{i=1}^m\mu _i,\quad \varvec{\mathcal {Z}}= \begin{bmatrix} \varvec{\mathcal {Z}}^1&\quad \! \mathbf{0}\\ \mathbf{0}&\quad \! {\mathcal {Z}}^2 \end{bmatrix},\quad \varvec{\mathcal {Z}}\succeq 0,\\ \end{aligned} \end{aligned}$$

Since \({\varvec{\varSigma }}\) is defined to be symmetric, it is represented by \(r(r+1)/2\) variables. Constructing the problem's dual via its Lagrangian, the dual problem is given by:

(32)

Both the primal and the dual problems are strictly feasible, with \({\varvec{\mu }}=\mathbf{e}/n\) and \({\varvec{\varSigma }}=\tilde{\mathbf{E}}'(\sum _i\mu _i \varvec{\mathcal {I}}_i)^{-1}\tilde{\mathbf{E}}+\mathbf{I}\) for the primal, and \(\mathbf{Y}^{11}=\sum _i \varvec{\mathcal {I}}_i/n_i\), \(\mathbf{Y}^{22}=\mathbf{I}\), \(\mathbf{Y}^{12}=\mathbf{0}\), \(\beta =1\) for the dual problem. Therefore, strong duality holds for these problems.

Observe that since \(\mathbf{Y}^{22}=\mathbf{I}\), the PSD constraint on \(\mathbf{Y}\) can be reformulated, using the Schur complement, as \(\varvec{\mathcal {Y}}=(\mathbf{Y}^{11}-\mathbf{Y}^{12}{\mathbf{Y}^{12}}')\succeq 0\). Moreover, any square sub-matrix on the diagonal of \(\varvec{\mathcal {Y}}\) is also PSD, and therefore the first set of constraints can be reformulated as:

$$\begin{aligned} {{\mathrm{Tr}}}({\mathbf{Y}^{12}}'\varvec{\mathcal {I}}_i\mathbf{Y}^{12})=\mathbf{Y}^{12}{\mathbf{Y}^{12}}'\circ \varvec{\mathcal {I}}_i\le \beta . \end{aligned}$$

Defining \(k^i_j\), \(j=1,\ldots , n_i\), to be the \(j\)th index in the index set corresponding to partition element \(i\), we can define the vector \({\varvec{\zeta }}_i=[\zeta _i^1,\ldots ,\zeta _i^{n_i}]'\), where \({\varvec{\zeta }}_i^j={\mathbf{Y}^{12}_{k^i_j\cdot }}\) is the \(k^i_j\)th row of matrix \(\mathbf{Y}^{12}\). Similarly, we define \(\tilde{\mathbf{e}}_i=[\tilde{\mathbf{e}}_i^1;\ldots ;\tilde{\mathbf{e}}_i^{n_i}]\), where \(\tilde{\mathbf{e}}_i^j={\tilde{\mathbf{E}}_{\cdot k^i_j}}\) is the \(k^i_j\)th column of matrix \(\tilde{\mathbf{E}}\). Therefore, problem (32) can be reformulated as follows.

$$\begin{aligned} \begin{aligned}&\max \limits _{\beta \ge 0}\max \limits _{{\varvec{\zeta }}}&-2\sum \limits _{i=1}^m{\varvec{\zeta }}_i'\tilde{\mathbf{e}}_i- \beta \\&s.t.&\quad \Vert {\varvec{\zeta }}_i\Vert ^2\le \beta \quad i=1,\ldots ,m\\ \end{aligned} \end{aligned}$$
(33)

Since \(\Vert \tilde{\mathbf{e}}_i\Vert =\Vert \varvec{\mathcal {I}}_i\tilde{\mathbf{E}}\Vert _F\), where \(\Vert \cdot \Vert _F\) denotes the Frobenius norm, the definition of \(J_0\) can be updated to \(J_0=\{i\in \{1,\ldots ,m\}:\Vert \tilde{\mathbf{e}}_i\Vert \ne 0\}\). For a given \(\beta \) the problem is separable in \({\varvec{\zeta }}_i\), and an optimal solution (not necessarily unique) is known to be

$$\begin{aligned} {\varvec{\zeta }}^*_i={\left\{ \begin{array}{ll} -\sqrt{\beta }\frac{ \tilde{\mathbf{e}}_i}{\Vert \tilde{\mathbf{e}}_i\Vert }&{} i\in J_0\\ \sqrt{\frac{\beta }{n_i}} &{} \text {otherwise} \end{array}\right. }\quad . \end{aligned}$$
(34)

Plugging in this solution, we obtain the following problem in the variable \(\beta \).

$$\begin{aligned} \max \limits _{\beta \ge 0} 2\sqrt{\beta }\sum \limits _{i=1}^m \Vert \tilde{\mathbf{e}}_i\Vert -\beta . \end{aligned}$$

Since \(\tilde{\mathbf{E}}\ne 0\), the maximum is attained at a stationary point, i.e. when \(\beta ={(\sum _{i=1}^m \Vert \tilde{\mathbf{e}}_i\Vert )}^2\), and the optimal objective function value is \((\sum _{i=1}^m \Vert \tilde{\mathbf{e}}_i\Vert )^2\). Notice that if \(\tilde{\mathbf{E}}= 0\) the problem becomes trivial, with solution \({\varvec{\varSigma }}=0\).
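For completeness, writing \(c:=\sum _{i=1}^m \Vert \tilde{\mathbf{e}}_i\Vert \), the stationarity computation behind this value of \(\beta \) is:

$$\begin{aligned} \frac{d}{d\beta }\left( 2\sqrt{\beta }\,c-\beta \right) =\frac{c}{\sqrt{\beta }}-1=0\;\Longrightarrow \;\beta ^\star =c^2,\qquad 2\sqrt{\beta ^\star }\,c-\beta ^\star =c^2. \end{aligned}$$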

In order to retrieve the solution to the primal problem, we first look at the dual of problem (33), which is given by:

(35)

The optimal solution to (35) is given, via the primal-dual connection \(\tilde{\mathbf{e}}_i={\varvec{\zeta }}_i\mu _i\) and the optimal solution (34), by

$$\begin{aligned} \mu _i^\diamond =\frac{\left\| \tilde{\mathbf{e}}_i\right\| }{\sqrt{\beta }}= \frac{\left\| \tilde{\mathbf{e}}_i\right\| }{\sum _i \left\| \tilde{\mathbf{e}}_i\right\| }\quad i=1,\ldots ,m. \end{aligned}$$

Returning to problem (SVE), we obtain \({\varvec{\varSigma }}\succeq \tilde{\mathbf{E}}'(\sum _{i}\mu _i\varvec{\mathcal {I}}_i)^{-1}\tilde{\mathbf{E}}\) provided \(\mu _i=0\; \forall i\notin J_0\), and so \({{\mathrm{Tr}}}({\varvec{\varSigma }})\ge \sum _{i\in J_0}{\Vert \tilde{\mathbf{e}}_i\Vert }^2/{\mu _i}\). Therefore, the optimal \({\varvec{\varSigma }}\) for problem (SVE) is obtained when this inequality is satisfied with equality, and is given by:

$$\begin{aligned} {\varvec{\varSigma }}^\diamond =\left( \sum \limits _{i=1}^m{\Vert \varvec{\mathcal {I}}_i \tilde{\mathbf{E}}\Vert _F}\right) \tilde{\mathbf{E}}'\left( \sum _{i\in J_0} \frac{1}{\Vert \varvec{\mathcal {I}}_i\tilde{\mathbf{E}}\Vert _F}\varvec{\mathcal {I}}_i\right) \tilde{\mathbf{E}} \end{aligned}$$

with an optimal value of \({{\mathrm{Tr}}}({\varvec{\varSigma }}^\diamond )=(\sum _{i=1}^m{\Vert \varvec{\mathcal {I}}_i\tilde{\mathbf{E}}\Vert _F})^2\). The most computationally expensive operation is the multiplication \(\tilde{\mathbf{E}}'\tilde{\mathbf{E}}\), which costs \(O(n^2r^2)\).

In conclusion, solving problem (SVE), including obtaining the optimal \({\varvec{\varSigma }}^\diamond \) takes at most \(O(n^2r^2)\) operations. Given \(r\) is constant and \(n\le S\max _i n_i\) where \(\max _i n_i\) is constant, we obtain computational complexity of \(O(S^2)\).
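Putting the closed-form expressions together, a sketch of the resulting procedure (hypothetical helper names, NumPy assumed; the partition is taken as a list of index arrays):

```python
import numpy as np

def solve_SVE(E, parts, tol=1e-12):
    """Closed-form solution sketch for problem (SVE) from the expressions
    above.  E is the matrix E~; parts is the index partition.
    Returns (Sigma, mu, Tr(Sigma))."""
    norms = np.array([np.linalg.norm(E[p, :]) for p in parts])  # ||I_i E~||_F
    total = norms.sum()
    if total < tol:                      # E~ = 0: trivial problem, Sigma = 0
        r = E.shape[1]
        return np.zeros((r, r)), np.zeros(len(parts)), 0.0
    mu = norms / total                   # mu_i = ||e~_i|| / sum_i ||e~_i||
    D = np.zeros(E.shape[0])             # diagonal of sum_{i in J0} I_i / ||e~_i||
    for p, nrm in zip(parts, norms):
        if nrm > tol:
            D[p] = 1.0 / nrm
    Sigma = total * (E.T * D) @ E        # (sum ||e~_i||) E~' Diag(D) E~
    return Sigma, mu, np.trace(Sigma)    # trace = (sum_i ||e~_i||)^2
```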

Appendix 4: Rolling horizon objective function coefficients

Representing

$$\begin{aligned} \begin{aligned} {\varvec{\delta }}_T&=\mathbf{A}_T^{{\varvec{\omega }}}{\varvec{\omega }}_{T}^{T-S} +\mathbf{A}_T^{\mathbf{Z}}\mathbf{Z}_{T-S}^{T-2S+2}\\&\equiv \mathbf{A}^{{\varvec{\delta }}_{T-S}}{\varvec{\delta }}_{T-S}+\sum _{t=T-S+1}^T \mathbf{A}^{\mathbf{a}_{t}}_T\mathbf{a}_t+\sum _{t=T-S+1}^T \mathbf{A}^{\mathbf{v}_{t}}_T\mathbf{v}_t+\sum _{t=T-2S+2}^{T-S}\mathbf{A}^{\mathbf{z}_{t}}_T\mathbf{z}_t, \end{aligned} \end{aligned}$$

where the coefficient matrices are given by:

$$\begin{aligned} \mathbf{A}_T^{\mathbf{a}_t}\!&=\! {\left\{ \begin{array}{ll} \mathbf{F}\mathbf{A}_{T-1}^{\mathbf{a}_t}\!-\!\mathbf{K}_T^{T} \mathbf{H}\mathbf{F}\mathbf{A}_{T-1}^{\mathbf{a}_t}\!+\!\mathbf{K}^t_T\mathbf{H}\mathbf{G}\!-\!\sum \limits _{\tau =t+1}^{T-1} \mathbf{K}_T^\tau \mathbf{H}\mathbf{F}\mathbf{A}_{\tau -1}^{\mathbf{a}_t} &{}\quad \max (T-S+1,1)\le t<T\\ \left( \mathbf{K}^T_T \mathbf{H}\mathbf{G}-\mathbf{G}\right) &{}\quad t=T,\; T\ge 1\\ 0&{} \quad \text {otherwise}\\ \end{array}\right. } \\ \mathbf{A}_T^{\mathbf{v}_t}\!&=\! {\left\{ \begin{array}{ll} \mathbf{F}\mathbf{A}_{T-1}^{\mathbf{v}_t}\!-\!\mathbf{K}_T^{T}\mathbf{H}\mathbf{F}\mathbf{A}_{T-1}^{\mathbf{v}_t}\!+\!\mathbf{K}^t_T-\sum \limits _{\tau =t+1}^{T-1} \mathbf{K}_T^\tau \mathbf{H}\mathbf{F}\mathbf{A}_{\tau -1}^{\mathbf{v}_t} &{}\qquad \max (T-S+1,1)\le t<T\\ \mathbf{K}^T_T&{}\qquad t=T, \quad T\ge 1 \\ \mathbf{0}&{}\qquad \text {otherwise}\\ \end{array}\right. } \\ \mathbf{A}_T^{{\varvec{\delta }}_{t}}&= {\left\{ \begin{array}{ll} \mathbf{F}\mathbf{A}_{T-1}^{{\varvec{\delta }}_{t}}-\sum \limits _{\tau =t+2}^{T}\mathbf{K}_T^{\tau } \mathbf{H}\mathbf{F}\mathbf{A}_{\tau -1}^{{\varvec{\delta }}_{t}}&{}\qquad \max (T-S,0)\le t< T-1\\ \mathbf{F}-\mathbf{K}_T^{T}\mathbf{H}\mathbf{F}&{} \qquad t=T-1\\ \mathbf{0}&{}\qquad \text {otherwise}\\ \end{array}\right. } \\ \mathbf{L}_T^{\mathbf{z}_{t},j}\!&=\! {\left\{ \begin{array}{ll} \mathbf{K}_T^t\!-\!\sum \limits _{\tau =t+j}^T \mathbf{H}\mathbf{F}\mathbf{L}_{\tau -1}^{\mathbf{z}_t,j} &{}\quad max(T\!-\!S\!+\!1,1)\!\le \! t\!\le \! T\!-\!1,\quad 1\!\le \! j\!\le \! T\!-\!t\\ \mathbf{F}\mathbf{L}_{T-1}^{\mathbf{z}_{t},j}\!-\!\sum \limits _{\tau =max(T-S,t+j)}^T \mathbf{H}\mathbf{F}\mathbf{L}_{\tau -1}^{\mathbf{z}_t,j}&{}\quad max(T\!-\!2S\!+\!2,1)\!\le \! t \!\le \! T\!-\!S,\quad 1\!\le \! j\!\le \! S\!-\!1\\ \mathbf{0}&{}\quad \text {otherwise}\\ \end{array}\right. }\\ \mathbf{A}_T^{\mathbf{z}_{t}}\!&=\! {\left\{ \begin{array}{ll} \mathbf{F}\mathbf{L}_T^{\mathbf{z}_t,j}-\sum \limits _{\tau =T-S+1}^T \mathbf{H}\mathbf{F}\mathbf{L}_{\tau -1}^{\mathbf{z}_t,j} &{}\quad \max (T\!-\!2S\!+\!2,1) \!\le \! t\!\le \! T\!-\!S,\quad j\!=\!T-S-t+1\\ \mathbf{0}&{} \quad \text {otherwise}.\\ \end{array}\right. } \end{aligned}$$

Similarly, we can represent

$$\begin{aligned} \mathbf{z}_T=\hat{\mathbf{A}}_T^{{\varvec{\omega }}}{\varvec{\omega }}_{T}^{T-S}+\hat{\mathbf{A}}_T^{\mathbf{Z}} \mathbf{Z}_{T-S}^{T-2S+2}\equiv \hat{\mathbf{A}}_T^{{\varvec{\delta }}_{T-S}} {\varvec{\delta }}_{T-S}+\sum _{t=T-S+1}^T \left( \hat{\mathbf{A}}^{\mathbf{a}_{t}}_T\mathbf{a}_t+\hat{\mathbf{A}}^{\mathbf{v}_{t}}_T\mathbf{v}_t\right) +\sum _{t=T-2S+2}^{T-S}\hat{\mathbf{A}}^{\mathbf{z}_{t}}_T\mathbf{z}_t, \end{aligned}$$

where:

$$\begin{aligned} \hat{\mathbf{A}}_T^{\mathbf{a}_t}= & {} {\left\{ \begin{array}{ll} -\mathbf{H}\mathbf{F}\mathbf{A}_{T-1}^{\mathbf{a}_t}&{} \quad \max (T-S+1,1)\le t<T\\ \mathbf{H}\mathbf{G}&{} \quad t=T, \quad T\ge 1\\ 0&{} \quad \text {otherwise}\\ \end{array}\right. }\\ \hat{\mathbf{A}}_T^{\mathbf{v}_t}= & {} {\left\{ \begin{array}{ll} -\mathbf{H}\mathbf{F}\mathbf{A}_{T-1}^{\mathbf{v}_t} &{} \quad \max (T-S+1,1)\le t<T\\ \mathbf{I}&{} \quad t=T, \quad T\ge 1 \\ \mathbf{0}&{} \quad \text {otherwise}\\ \end{array}\right. }\\ \hat{\mathbf{A}}_T^{{\varvec{\delta }}_{t}}= & {} {\left\{ \begin{array}{ll} -\mathbf{H}\mathbf{F}\mathbf{A}_{T-1}^{{\varvec{\delta }}_{t}} &{} \quad \max (T-S,0)\le t<T-1\\ -\mathbf{H}\mathbf{F}&{} \quad t=T-1, \quad T\ge 1\\ \mathbf{0}&{} \quad \text {otherwise}\\ \end{array}\right. }\\ \hat{\mathbf{A}}_T^{\mathbf{z}_{t}}= & {} {\left\{ \begin{array}{ll} -\mathbf{H}\mathbf{F}{\mathbf{A}}_{T-1}^{\mathbf{z}_{t}} &{} \quad \max (T-2S+2,1) \le t\le T-S\\ \mathbf{0}&{} \quad \text {otherwise}\\ \end{array}\right. } \end{aligned}$$
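The \(\hat{\mathbf{A}}\) recursions above are cheap to evaluate once the \(\mathbf{A}_{T-1}\) coefficients are stored. The following sketch spells out the update; it is purely illustrative, and the dictionary layout, key names and the function itself are assumptions, not from the paper:

```python
import numpy as np

def hat_coefficients(F, G, H, A_prev, T, S):
    """Sketch of the z_T coefficient recursions above.  A_prev maps keys
    ('a', t), ('v', t), ('delta', t), ('z', t) to the stored A_{T-1}
    coefficient matrices (illustrative storage assumption)."""
    HF = H @ F
    A_hat = {}
    for t in range(max(T - S + 1, 1), T):               # t < T cases
        A_hat[('a', t)] = -HF @ A_prev[('a', t)]
        A_hat[('v', t)] = -HF @ A_prev[('v', t)]
    for t in range(max(T - S, 0), T - 1):               # delta, t < T-1
        A_hat[('delta', t)] = -HF @ A_prev[('delta', t)]
    for t in range(max(T - 2 * S + 2, 1), T - S + 1):   # z, t <= T-S
        A_hat[('z', t)] = -HF @ A_prev[('z', t)]
    if T >= 1:                                          # boundary cases
        A_hat[('a', T)] = H @ G
        A_hat[('v', T)] = np.eye(H.shape[0])
        A_hat[('delta', T - 1)] = -HF
    return A_hat
```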


Cite this article

Shtern, S., Ben-Tal, A. A semi-definite programming approach for robust tracking. Math. Program. 156, 615–656 (2016). https://doi.org/10.1007/s10107-015-0910-5

