
A Primal-Dual algorithm for nonnegative N-th order CP tensor decomposition: application to fluorescence spectroscopy data analysis


Abstract

This work concerns the resolution of inverse problems encountered in multidimensional signal processing. Here, we address the problem of tensor decomposition, more precisely the Canonical Polyadic (CP) decomposition, also known as Parafac. When the amount of data is very large, this inverse problem may become numerically ill-posed and consequently hard to solve. This requires introducing a priori information about the system, often in the form of penalty functions. A difficulty that may then arise is that the resulting cost function can be nondifferentiable. This is why we develop here a Primal-Dual Projected Gradient (PDPG) optimization algorithm to solve variational problems involving such cost functions. Moreover, despite its theoretical importance, most approaches in the literature developed for nonnegative CP decomposition have not actually addressed the convergence issue, and most methods with a convergence rate guarantee require restrictive conditions for their theoretical analyses. Hence, a theoretical void remains for the convergence rate problem under very mild conditions. The two main aims of this work are therefore (i) to design a CP-PDPG algorithm and (ii) to prove, under mild conditions, its convergence to the set of minimizers at the rate O(1/k), k being the iteration number. The effectiveness and the robustness of the proposed approach are illustrated through numerical examples.
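To fix ideas, here is a minimal sketch of one primal-dual projected-gradient step for a single nonnegative factor matrix. The bilinear saddle model g(A, Y) = ⟨Y, KA − M⟩, the unit-ball dual set, and all names (`pdpg_step`, `sigma`, `tau`) are illustrative assumptions, not the paper's exact CP formulation.

```python
import numpy as np

def pdpg_step(A, Y, K, M, sigma, tau):
    """One illustrative primal-dual projected-gradient step.

    Assumed toy saddle model: g(A, Y) = <Y, K A - M>, with A constrained to
    the nonnegative orthant (primal set U) and Y to the Frobenius unit ball
    (an assumed dual set V*); this is NOT the paper's exact CP model.
    """
    # Primal: (sub)gradient descent in A, then projection onto U = {A >= 0}.
    A = np.maximum(A - sigma * (K.T @ Y), 0.0)
    # Dual: (sub)gradient ascent in Y, then projection onto the unit ball.
    Y = Y + tau * (K @ A - M)
    Y /= max(1.0, np.linalg.norm(Y))
    return A, Y
```

The paper's contribution is the CP-PDPG instantiation of such steps, together with the O(1/k) ergodic convergence guarantee proved in the appendix.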



Notes

  1. The sequences \(\widehat{\mathbf{A }}^{(n)}_k\) and \(\widehat{\mathbf{Y }}^{(n)}_k\) (\(k \ge 1\)) defined in the Theorem are called the Cesàro sequences.
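Assuming the usual convention (which the appendix proofs rely on, e.g. when concavity in \(\mathbf{Y }^{(n)}\) is applied to \(\widehat{\mathbf{Y }}^{(n)}_k\)), these are the running averages of the iterates:

$$\begin{aligned} \widehat{\mathbf{A }}^{(n)}_k=\frac{1}{k}\sum _{i=0}^{k-1}\mathbf{A }^{(n)}_i, \qquad \widehat{\mathbf{Y }}^{(n)}_k=\frac{1}{k}\sum _{i=0}^{k-1}\mathbf{Y }^{(n)}_i, \qquad k \ge 1 \end{aligned}$$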

References

  • Brewer, J. (1978). Kronecker products and matrix calculus in system theory. IEEE Transactions on Circuits and Systems, 25(9), 772–781.

  • Bro, R. (1997). PARAFAC. Tutorial and applications. Chemometrics and Intelligent Laboratory Systems, 38(2), 149–172.

  • Carroll, J. D., & Chang, J. J. (1970). Analysis of individual differences in multidimensional scaling via an N-way generalization of “Eckart-Young” decomposition. Psychometrika, 35(3), 283–319.

  • Cichocki, A., Zdunek, R., Phan, A. H., & Amari, S. I. (2009). Nonnegative matrix and tensor factorizations: Applications to exploratory multi-way data analysis and blind source separation. New York: Wiley.

  • Cichocki, A., Mandic, D., Phan, A. H., Caiafa, C., Zhou, G., Zhao, Q., & Lathauwer, L. D. (2015). Tensor decompositions for signal processing applications: From two-way to multiway component analysis. IEEE Signal Processing Magazine, 32(2), 145–163.

  • Cohen, J., Farias, R. C., & Comon, P. (2014). Fast decomposition of large nonnegative tensors. IEEE Signal Processing Letters, 22(7), 862–866.

  • Finesso, L., & Spreij, P. (2006). Nonnegative matrix factorization and I-divergence alternating minimization. Linear Algebra and its Applications, 416(2–3), 270–287.

  • Franc, A. (1992). Etude algébrique des multi-tableaux : apport de l’algèbre tensorielle [An algebraic study of multiway arrays: The contribution of tensor algebra]. PhD thesis, University of Montpellier II, Montpellier, France.

  • Harshman, R., & Lundy, M. (1994). PARAFAC: Parallel factor analysis. Computational Statistics & Data Analysis, 18(1), 39–72.

  • Harshman, R. A. (1970). Foundations of the PARAFAC procedure: Models and conditions for an “explanatory” multimodal factor analysis. UCLA Working Papers in Phonetics.

  • Hiriart-Urruty, J., & Lemaréchal, C. (1993). Convex analysis and minimization algorithms, Vols. I, II. Berlin: Springer.

  • Hitchcock, F. L. (1927). The expression of a tensor or a polyadic as a sum of products. Journal of Mathematics and Physics, 6(1–4), 164–189.

  • Hjørungnes, A., & Gesbert, D. (2007). Complex-valued matrix differentiation: Techniques and key results. IEEE Transactions on Signal Processing, 55(6), 2740–2746.

  • Kim, J., & Park, H. (2012). Fast nonnegative tensor factorization with an active-set-like method. In High-performance scientific computing (pp. 311–326). Springer.

  • Kolda, T. G., & Bader, B. W. (2009). Tensor decompositions and applications. SIAM Review, 51(3), 455–500.

  • Lai, H. C., & Lin, L. J. (1988). The Fenchel-Moreau theorem for set functions. Proceedings of the American Mathematical Society, pp. 85–90.

  • Lathauwer, L. D., Moor, B. D., & Vandewalle, J. (2004). Computation of the canonical decomposition by means of a simultaneous generalized Schur decomposition. SIAM Journal on Matrix Analysis and Applications, 26(2), 295–327.

  • Lev-Ari, H. (2005). Efficient solution of linear matrix equations with application to multistatic antenna array processing. Communications in Information & Systems, 5(1), 123–130.

  • Liavas, A. P., & Sidiropoulos, N. D. (2015). Parallel algorithms for constrained tensor factorization via alternating direction method of multipliers. IEEE Transactions on Signal Processing, 63(20), 5450–5463.

  • Lin, C. J. (2007). On the convergence of multiplicative update algorithms for nonnegative matrix factorization. IEEE Transactions on Neural Networks, 18(6), 1589–1596.

  • Lin, C. J. (2007). Projected gradient methods for nonnegative matrix factorization. Neural Computation, 19(10), 2756–2779.

  • Maehara, T., & Hayashi, K. (2016). Expected tensor decomposition with stochastic gradient descent. In Thirtieth AAAI Conference on Artificial Intelligence.

  • Magnus, J. R., & Neudecker, H. (2019). Matrix differential calculus with applications in statistics and econometrics. New York: Wiley.

  • Möcks, J. (1988). Topographic components model for event-related potentials and some biophysical considerations. IEEE Transactions on Biomedical Engineering, 35(6), 482–484.

  • Nedić, A., & Ozdaglar, A. (2009). Subgradient methods for saddle-point problems. Journal of Optimization Theory and Applications, 142(1), 205–228.

  • Oseledets, I. V. (2011). Tensor-train decomposition. SIAM Journal on Scientific Computing, 33(5), 2295–2317.

  • Pasko, A., & Savchenko, V. (1997). Projection operation for multidimensional geometric modeling with real functions. In Geometric modeling: Theory and practice (pp. 197–205). Springer.

  • Phan, A. H., & Cichocki, A. (2011). PARAFAC algorithms for large-scale problems. Neurocomputing, 74(11), 1970–1984.

  • Phan, A.-H., Tichavský, P., & Cichocki, A. (2013). Fast alternating LS algorithms for high order CANDECOMP/PARAFAC tensor factorizations. IEEE Transactions on Signal Processing, 61(19), 4834–4846.

  • Repetti, A. (2015). Algorithmes d’optimisation en grande dimension : applications à la résolution de problèmes inverses [Large-scale optimization algorithms: Applications to the resolution of inverse problems]. PhD thesis, University of Paris-Est Marne-la-Vallée.

  • Repetti, A., Chouzenoux, E., & Pesquet, J.-C. (2015). Un petit tutoriel sur les méthodes primales-duales proximales pour l’optimisation convexe [A short tutorial on proximal primal-dual methods for convex optimization]. In GRETSI.

  • Royer, J. P., Thirion-Moreau, N., Comon, P., Redon, R., & Mounier, S. (2015). A regularized nonnegative canonical polyadic decomposition algorithm with preprocessing for 3D fluorescence spectroscopy. Journal of Chemometrics, 29(4), 253–265.

  • Sanchez, E., & Kowalski, B. R. (1990). Tensorial resolution: A direct trilinear decomposition. Journal of Chemometrics, 4, 29–45.

  • Shangina, E., Shangin, G. A., Medvedev, A. N., & Melnichenka, D. A. (2018). Construction of a geometric model for multidimensional space. In AIP Conference Proceedings (Vol. 1978). AIP Publishing.

  • Smilde, A., Bro, R., & Geladi, P. (2005). Multi-way analysis: Applications in the chemical sciences. New York: Wiley.

  • Traoré, A., Berar, M., & Rakotomamonjy, A. (2019). Online multimodal dictionary learning. Neurocomputing, 368, 163–179.

  • Tucker, L. R. (1966). Some mathematical notes on three-mode factor analysis. Psychometrika, 31(3), 279–311.

  • Uschmajew, A. (2012). Local convergence of the alternating least squares algorithm for canonical tensor approximation. SIAM Journal on Matrix Analysis and Applications, 33(2), 639–652.

  • Vandenberghe, L., & Boyd, S. (1995). A primal-dual potential reduction method for problems involving matrix inequalities. Mathematical Programming, 69(1), 205–236.

  • Vu, X. T., Maire, S., Chaux, C., & Thirion-Moreau, N. (2015). A new stochastic optimization algorithm to decompose large nonnegative tensors. IEEE Signal Processing Letters, 22(10), 1713–1717.

  • Vu, X. T., Chaux, C., Thirion-Moreau, N., Maire, S., & Carstea, E. M. (2017). A new penalized nonnegative third-order tensor decomposition using a block coordinate proximal gradient approach: Application to 3D fluorescence spectroscopy. Journal of Chemometrics, 31(4), e2859.

  • Xu, Y., & Yin, W. (2013). A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM Journal on Imaging Sciences, 6(3), 1758–1789.

  • Xu, Y., & Yin, W. (2017). A globally convergent algorithm for nonconvex optimization based on block coordinate update. Journal of Scientific Computing, 72(2), 700–734.

  • Zdunek, R., & Fonał, K. (2020). Trust-region strategy with Cauchy point for nonnegative tensor factorization with beta-divergence. In International Conference on Intelligent Decision Technologies (pp. 315–325). Springer.

  • Zdunek, R., & Sadowski, T. (2019). Segmented convex-hull algorithms for near-separable NMF and NTF. Neurocomputing, 331, 150–164.

  • Zosso, D., & Bustin, A. (2014). A primal-dual projected gradient algorithm for efficient Beltrami regularization. Computer Vision and Image Understanding, 14–52. https://www.researchgate.net/publication/291331974_A_Primal-Dual_Projected_Gradient_Algorithm_for_Efficient_Beltrami_Regularization



Appendices

1.1 Appendix A

The steps of our proof proceed in much the same way as the analysis provided by A. Nedić and A. Ozdaglar in Nedić and Ozdaglar (2009) in the context of Lagrangian duality, where they consider a convex primal optimization problem and its Lagrangian dual. Before proving Theorem 2, we first state the following simple lemma:

Lemma 2

Let the sequences \(\{ \mathbf{A }^{(n)}_k \}\) and \(\{ \mathbf{Y} ^{(n)}_k \}\) be generated by the subgradient algorithm. Then, we have:

  1.

    For any \( \mathbf{A }^{(n)}\in U\) and for all \(k \ge 0\),

    $$\begin{aligned} \parallel \mathbf{A }^{(n)}_{k+1}- \mathbf{A }^{(n)}\parallel ^2&\le \parallel \mathbf{A }^{(n)}_{k}- \mathbf{A }^{(n)}\parallel ^2-2\sigma ^{(n)}\left( g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k)-g(\mathbf{A }^{(n)},\mathbf{Y }^{(n)}_k)\right) \\&\quad +(\sigma ^{(n)})^2\parallel \partial _{\mathbf{A }^{(n)}} g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k)\parallel ^2 \end{aligned}$$
  2.

    For any \( \mathbf{Y }^{(n)}\in V^*\) and for all \(k \ge 0\),

    $$\begin{aligned} \parallel \mathbf{Y }^{(n)}_{k+1}- \mathbf{Y }^{(n)}\parallel ^2&\le \parallel \mathbf{Y }^{(n)}_{k}- \mathbf{Y }^{(n)}\parallel ^2+2\tau ^{(n)}\left( g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k)-g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)})\right) \\&\quad +(\tau ^{(n)})^2\parallel \partial _{\mathbf{Y }^{(n)}} g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k)\parallel ^2 \end{aligned}$$

Proof

  1.

    Concerning the first inequality of Lemma 2, for any \(\mathbf{A }^{(n)}\in U\) and all \(k \ge 0\), the nonexpansiveness of the projection \(P_{U}\) yields

    $$\begin{aligned} \parallel \mathbf{A }^{(n)}_{k+1}- \mathbf{A }^{(n)}\parallel ^2&=\parallel P_{U} \{ \mathbf{A }^{(n)}_{k}-\sigma ^{(n)}\partial _{\mathbf{A }^{(n)}} g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k) \} - \mathbf{A }^{(n)}\parallel ^2\\&\le \parallel \mathbf{A }^{(n)}_{k}-\sigma ^{(n)}\partial _{\mathbf{A }^{(n)}} g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k) - \mathbf{A }^{(n)}\parallel ^2\\&= \parallel \mathbf{A }^{(n)}_{k}- \mathbf{A }^{(n)} \parallel ^2-2\sigma ^{(n)}(\mathbf{A }^{(n)}_{k}-\mathbf{A }^{(n)})^{T}\partial _{\mathbf{A }^{(n)}} g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k)\\&\quad +(\sigma ^{(n)})^2 \parallel \partial _{\mathbf{A }^{(n)}} g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k) \parallel ^2 \end{aligned}$$

    Since the function \(g(\mathbf{A }^{(n)},\mathbf{Y }^{(n)})\) is convex in \(\mathbf{A }^{(n)}\) for each \(\mathbf{Y }^{(n)}\in V^*\), and since \(\partial _{\mathbf{A }^{(n)}} g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k)\) is a subgradient of \(g(\mathbf{A }^{(n)},\mathbf{Y }^{(n)}_k)\) with respect to \(\mathbf{A }^{(n)}\) at \(\mathbf{A }^{(n)}=\mathbf{A }^{(n)}_k\), we obtain for any \(\mathbf{A }^{(n)}\)

    $$\begin{aligned} \partial _{\mathbf{A }^{(n)}} g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k)^T(\mathbf{A }^{(n)}- \mathbf{A }^{(n)}_{k})\le g(\mathbf{A }^{(n)},\mathbf{Y }^{(n)}_k)-g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k) \end{aligned}$$

    or equivalently

    $$\begin{aligned} -\partial _{\mathbf{A }^{(n)}} g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k)^T(\mathbf{A }^{(n)}_{k}- \mathbf{A }^{(n)})\le g(\mathbf{A }^{(n)},\mathbf{Y }^{(n)}_k)-g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k) \end{aligned}$$

    Hence, for any \(\mathbf{A }^{(n)}\in U\) and all \(k \ge 0\),

    $$\begin{aligned} \parallel \mathbf{A }^{(n)}_{k+1}- \mathbf{A }^{(n)}\parallel ^2&\le \parallel \mathbf{A }^{(n)}_{k}- \mathbf{A }^{(n)} \parallel ^2-2\sigma ^{(n)}\left( g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k)-g(\mathbf{A }^{(n)},\mathbf{Y }^{(n)}_k)\right) \\&\quad +(\sigma ^{(n)})^2 \parallel \partial _{\mathbf{A }^{(n)}} g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k) \parallel ^2 \end{aligned}$$
  2.

    In the same way, by using the concavity of the function \(g(\mathbf{A }^{(n)},\mathbf{Y }^{(n)})\) with respect to \(\mathbf{Y }^{(n)}\) for each \(\mathbf{A }^{(n)}\in U\), we can establish the second inequality of Lemma 2.

\(\square \)
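As a quick numerical sanity check of the first inequality of Lemma 2, the following sketch instantiates a toy bilinear \(g\) (linear, hence convex, in the primal variable); the problem data and names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
K, m = rng.normal(size=(5, 3)), rng.normal(size=5)
g = lambda a, y: y @ (K @ a - m)    # linear (hence convex) in a, concave in y

a, y, sigma = rng.normal(size=3), rng.normal(size=5), 0.1
grad_a = K.T @ y                                  # subgradient of g(., y) at a
a_next = np.maximum(a - sigma * grad_a, 0.0)      # projected primal update

for _ in range(1000):
    a_ref = rng.uniform(0.0, 3.0, size=3)         # arbitrary point of U = R^3_+
    lhs = np.sum((a_next - a_ref) ** 2)
    rhs = (np.sum((a - a_ref) ** 2)
           - 2 * sigma * (g(a, y) - g(a_ref, y))
           + sigma ** 2 * np.sum(grad_a ** 2))
    assert lhs <= rhs + 1e-9                      # first inequality of Lemma 2
```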

We now give the proof of Lemma 1.

Proof

Using the first statement of Lemma 2 and the assumed boundedness (by \(L\)) of \(\partial _{\mathbf{A }^{(n)}} g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k)\), we have for any \(\mathbf{A }^{(n)}\in U\) and all \(i \ge 0\),

$$\begin{aligned} \parallel \mathbf{A }^{(n)}_{i+1}- \mathbf{A }^{(n)}\parallel ^2 \le \parallel \mathbf{A }^{(n)}_{i}- \mathbf{A }^{(n)}\parallel ^2-2\sigma ^{(n)}\left( g(\mathbf{A }^{(n)}_i,\mathbf{Y }^{(n)}_i)-g(\mathbf{A }^{(n)},\mathbf{Y }^{(n)}_i)\right) +(\sigma ^{(n)})^2 L^2 \end{aligned}$$

therefore

$$\begin{aligned} g(\mathbf{A }^{(n)}_i,\mathbf{Y }^{(n)}_i)-g(\mathbf{A }^{(n)},\mathbf{Y }^{(n)}_i) \le \frac{1}{2\sigma ^{(n)}}\left( \parallel \mathbf{A }^{(n)}_{i}- \mathbf{A }^{(n)}\parallel ^2-\parallel \mathbf{A }^{(n)}_{i+1}- \mathbf{A }^{(n)}\parallel ^2 \right) +\frac{\sigma ^{(n)} L^2}{2} \end{aligned}$$

Summing the last relation over \(i = 0,\dots ,k-1\), we obtain for any \(\mathbf{A }^{(n)}\in U\) and \(k \ge 1\),

$$\begin{aligned}&\sum _{i=0}^{k-1}\left( g(\mathbf{A }^{(n)}_i,\mathbf{Y }^{(n)}_i)-g(\mathbf{A }^{(n)},\mathbf{Y }^{(n)}_i)\right) \\&\qquad \le \frac{1}{2\sigma ^{(n)}}\left( \parallel \mathbf{A }^{(n)}_{0}- \mathbf{A }^{(n)}\parallel ^2-\parallel \mathbf{A }^{(n)}_{k}- \mathbf{A }^{(n)}\parallel ^2 \right) +\frac{k \sigma ^{(n)} L^2}{2} \end{aligned}$$

implying that

$$\begin{aligned} \frac{1}{k}\sum _{i=0}^{k-1}g(\mathbf{A }^{(n)}_i,\mathbf{Y }^{(n)}_i)-\frac{1}{k}\sum _{i=0}^{k-1}g(\mathbf{A }^{(n)},\mathbf{Y }^{(n)}_i) \le \frac{\parallel \mathbf{A }^{(n)}_{0}- \mathbf{A} ^{(n)}\parallel ^2}{2\sigma ^{(n)} k}+\frac{\sigma ^{(n)} L^2}{2} \end{aligned}$$

Since the function \(g(\mathbf{A }^{(n)},\mathbf{Y }^{(n)})\) is concave in \(\mathbf{Y }^{(n)}\) for any fixed \(\mathbf{A }^{(n)}\in U\), there holds

$$\begin{aligned} \frac{1}{k}\sum _{i=0}^{k-1}g(\mathbf{A }^{(n)},\mathbf{Y }^{(n)}_i)\le g(\mathbf{A }^{(n)},\widehat{\mathbf{Y }}^{(n)}_k) \end{aligned}$$

Combining the preceding two relations, we obtain for any \(\mathbf{A }^{(n)}\in U\) and \(k \ge 1\),

$$\begin{aligned} \frac{1}{k}\sum _{i=0}^{k-1}g(\mathbf{A }^{(n)}_i,\mathbf{Y }^{(n)}_i)-g(\mathbf{A }^{(n)},\widehat{\mathbf{Y }}^{(n)}_k)\le \frac{\parallel \mathbf{A }^{(n)}_{0}- \mathbf{A} ^{(n)}\parallel ^2}{2\sigma ^{(n)} k}+\frac{\sigma ^{(n)} L^2}{2} \end{aligned}$$

thus establishing relation (32).

In the same way, using the second statement of Lemma 2, the convexity of the function \(g\) with respect to \(\mathbf{A }^{(n)}\), and the assumed boundedness of \(\partial _{\mathbf{Y }^{(n)}} g(\mathbf{A }^{(n)}_k,\mathbf{Y }^{(n)}_k)\), we can establish the second relation (33) of Lemma 1. \(\square \)

We now turn to the proof of Theorem 2.

Proof

Using \(\mathbf{A }^{(n)}=\mathbf{A }^{(n)}_{*}\) and \(\mathbf{Y }^{(n)}=\mathbf{Y }^{(n)}_{*}\) in relations (32) and (33), respectively, we obtain for any \(k \ge 1\):

$$\begin{aligned} \frac{1}{k}\sum _{i=0}^{k-1}g(\mathbf{A }^{(n)}_i,\mathbf{Y }^{(n)}_i)-g(\mathbf{A }^{(n)}_{*},\widehat{\mathbf{Y }}^{(n)}_k)&\le \frac{\parallel \mathbf{A }^{(n)}_0 -\mathbf{A }^{(n)}_{*}\parallel ^2}{2\sigma ^{(n)} k}+\frac{\sigma ^{(n)} L^2}{2}\\ -\frac{\parallel \mathbf{Y }^{(n)}_0 -\mathbf{Y }^{(n)}_{*}\parallel ^2}{2\tau ^{(n)} k}-\frac{\tau ^{(n)} L^2}{2}&\le \frac{1}{k}\sum _{i=0}^{k-1}g(\mathbf{A }^{(n)}_i,\mathbf{Y }^{(n)}_i)-g(\widehat{\mathbf{A }}^{(n)}_k,\mathbf{Y }^{(n)}_{*}) \end{aligned}$$

Since \(U\) and \(V^*\) are convex sets, \(\widehat{\mathbf{A }}^{(n)}_{k}\in U\) and \(\widehat{\mathbf{Y }}^{(n)}_{k}\in V^*\) for all \(k \ge 1\). Therefore, by the saddle-point relation (8), we have

$$\begin{aligned} g(\mathbf{A }^{(n)}_{*},\widehat{\mathbf{Y }}^{(n)}_k)\le g(\mathbf{A }^{(n)}_{*},\mathbf{Y }^{(n)}_{*})\le g(\widehat{\mathbf{A }}^{(n)}_k,\mathbf{Y }^{(n)}_{*}) \end{aligned}$$

Combining the preceding three relations, we obtain then

$$\begin{aligned} -\frac{\parallel \mathbf{Y }^{(n)}_0 -\mathbf{Y }^{(n)}_{*}\parallel ^2}{2\tau ^{(n)}k }-\frac{\tau ^{(n)} L^2}{2} \le \frac{1}{k}\sum _{i=0}^{k-1}g(\mathbf{A }^{(n)}_i,\mathbf{Y }^{(n)}_i)-g(\mathbf{A }^{(n)}_{*},\mathbf{Y }^{(n)}_{*}) \le \frac{\parallel \mathbf{A }^{(n)}_0 -\mathbf{A }^{(n)}_{*}\parallel ^2}{2\sigma ^{(n)} k} +\frac{\sigma ^{(n)} L^2}{2} \end{aligned}$$
(34)

Moreover, since \(\widehat{\mathbf{A }}^{(n)}_k\in U\) and \(\widehat{\mathbf{Y }}^{(n)}_k\in V^*\) for all \(k \ge 1\), using Lemma 1 with \(\mathbf{A }^{(n)}=\widehat{\mathbf{A }}^{(n)}_k\) and \(\mathbf{Y }^{(n)}=\widehat{\mathbf{Y }}^{(n)}_k\), we obtain for all \(k \ge 1\),

$$\begin{aligned} \frac{1}{k}\sum _{i=0}^{k-1}g(\mathbf{A }^{(n)}_i,\mathbf{Y }^{(n)}_i)-g(\widehat{\mathbf{A }}^{(n)}_k,\widehat{\mathbf{Y }}^{(n)}_k)&\le \frac{\parallel \mathbf{A }^{(n)}_0 -\widehat{\mathbf{A }}^{(n)}_k\parallel ^2}{2\sigma ^{(n)} k}+\frac{\sigma ^{(n)} L^2}{2}\\ -\frac{\parallel \mathbf{Y }^{(n)}_0 -\widehat{\mathbf{Y }}^{(n)}_k\parallel ^2}{2\tau ^{(n)} k}-\frac{\tau ^{(n)} L^2}{2}&\le \frac{1}{k}\sum _{i=0}^{k-1}g(\mathbf{A }^{(n)}_i,\mathbf{Y }^{(n)}_i)-g(\widehat{\mathbf{A }}^{(n)}_k,\widehat{\mathbf{Y }}^{(n)}_k) \end{aligned}$$

Multiplying the preceding relations by \(-1\) and combining them, we finally obtain

$$\begin{aligned} -\frac{\parallel \mathbf{A }^{(n)}_0 -\widehat{\mathbf{A }}^{(n)}_k\parallel ^2}{2\sigma ^{(n)}k }-\frac{\sigma ^{(n)} L^2}{2} \le g(\widehat{\mathbf{A }}^{(n)}_k,\widehat{\mathbf{Y }}^{(n)}_k)-\frac{1}{k}\sum _{i=0}^{k-1}g(\mathbf{A }^{(n)}_i,\mathbf{Y }^{(n)}_i) \le \frac{\parallel \mathbf{Y }^{(n)}_0 -\widehat{\mathbf{Y }}^{(n)}_k\parallel ^2}{2\tau ^{(n)} k} +\frac{\tau ^{(n)} L^2}{2} \end{aligned}$$
(35)

Using this last relation together with (34), we finally obtain the convergence result stated in Theorem 2. \(\square \)
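To see the \(O(1/k)\) behaviour concretely, here is a toy saddle-point run with Cesàro averaging (hypothetical data, box-constrained primal and dual sets, constant steps; per (34) and (35), the averaged values approach the saddle value at rate \(1/k\), down to a floor of order \(\sigma L^2/2 + \tau L^2/2\)); none of this is the paper's CP setup.

```python
import numpy as np

rng = np.random.default_rng(1)
K, m = rng.normal(size=(5, 3)), rng.normal(size=5)
g = lambda a, y: y @ (K @ a - m)          # bilinear convex-concave coupling
sigma = tau = 0.02
a, y = np.zeros(3), np.zeros(5)
sum_a, sum_y = np.zeros(3), np.zeros(5)

for k in range(1, 4001):
    ga, gy = K.T @ y, K @ a - m           # partial (sub)gradients
    a = np.clip(a - sigma * ga, 0.0, 5.0) # primal descent on a nonneg. box U
    y = np.clip(y + tau * gy, -1.0, 1.0)  # dual ascent on a box V*
    sum_a += a
    sum_y += y
    if k % 1000 == 0:
        a_hat, y_hat = sum_a / k, sum_y / k   # Cesàro averages
        # approaches the saddle value up to the constant step-size floor
        print(k, g(a_hat, y_hat))
```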

1.2 Appendix B

In this appendix, we provide the proof of Proposition 1. First, we need the following result:

Lemma 3

Given \(\mathbf{z }\in {\mathbb {R}}^n\), a vector \(\mathbf{x }\in Z\) is equal to \(P_Z\{\mathbf{z }\}\) if and only if

$$\begin{aligned} <\mathbf{y }-\mathbf{x },\mathbf{z }-\mathbf{x }> \le 0\;\; \forall \mathbf{y }\in Z \end{aligned}$$
(36)

Proof

By definition of the orthogonal projection, \(P_Z \{\mathbf{z }\}\) is the minimizer of \(f(\mathbf{x })=\parallel \mathbf{x }-\mathbf{z }\parallel ^2\) over \(\mathbf{x }\in Z\). Moreover, if a vector \(\mathbf{x }\in Z\) minimizes \(f : {\mathbb {R}}^n \longrightarrow {\mathbb {R}}\) over \(Z\), then \(<\mathbf{y }-\mathbf{x },\nabla f(\mathbf{x })> \ge 0,\; \forall \mathbf{y }\in Z\). We have

$$\begin{aligned} \nabla f(\mathbf{x })=2(\mathbf{x }-\mathbf{z }) \end{aligned}$$
(37)

and thus

$$\begin{aligned} <\mathbf{y }-\mathbf{x },\mathbf{z }-\mathbf{x }> \le 0\;\; \forall \mathbf{y }\in Z \end{aligned}$$
(38)

\(\square \)
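Inequality (36) is easy to check numerically; here is a sketch for the illustrative choice \(Z={\mathbb {R}}^4_+\), where the projection is a componentwise clipping.

```python
import numpy as np

rng = np.random.default_rng(2)
z = rng.normal(size=4)
x = np.maximum(z, 0.0)                 # x = P_Z(z) for Z = R^4_+
for _ in range(1000):
    y = rng.uniform(0.0, 5.0, size=4)  # arbitrary y in Z
    assert (y - x) @ (z - x) <= 1e-12  # inequality (36)
```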

Proof

(Proposition 1) Suppose that \(\mathbf{z }^*=P_Z\{\mathbf{z }^* -r\mathbf{h }(\mathbf{z }^*)\}\). Then, by Lemma 3, we have:

$$\begin{aligned} <\mathbf{z }-\mathbf{z }^*,-r\mathbf{h }(\mathbf{z }^*)> \le 0\;\; \forall \mathbf{z }\in Z \end{aligned}$$
(39)

and since \(r\) is positive, it follows that \(\mathbf{z }^*\) solves inequality (7). Conversely, suppose that \(\mathbf{z }^*\) solves inequality (7); then, multiplying by \(-r\),

$$\begin{aligned} <\mathbf{z }-\mathbf{z }^*,-r\mathbf{h }(\mathbf{z }^*)> \le 0\;\; \forall \mathbf{z }\in Z \end{aligned}$$
(40)

which can be rewritten as

$$\begin{aligned} <\mathbf{z }-\mathbf{z }^*,(\mathbf{z }^*-r\mathbf{h }(\mathbf{z }^*))-\mathbf{z }^*> \le 0\;\; \forall \mathbf{z }\in Z \end{aligned}$$
(41)

and by applying Lemma 3 with \(\mathbf{z }^* -r\mathbf{h }(\mathbf{z }^*)\) in place of \(\mathbf{z }\), we get

$$\begin{aligned} \mathbf{z }^*=P_Z\{\mathbf{z }^* -r\mathbf{h }(\mathbf{z }^*)\} \end{aligned}$$
(42)

\(\square \)
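The fixed-point characterization (42) can likewise be checked on a hypothetical instance, taking \(\mathbf{h }=\nabla f\) for a simple quadratic \(f\) minimized over \(Z={\mathbb {R}}^3_+\).

```python
import numpy as np

c = np.array([1.5, -2.0, 0.3])
h = lambda z: 2.0 * (z - c)               # h = grad f for f(z) = ||z - c||^2
z_star = np.maximum(c, 0.0)               # minimizer of f over Z = R^3_+
for r in (0.1, 0.5, 1.0):                 # any positive step r works
    fixed = np.maximum(z_star - r * h(z_star), 0.0)   # P_Z(z* - r h(z*))
    assert np.allclose(fixed, z_star)     # z* is a fixed point, as in (42)
```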

About this article

Cite this article

EL Qate, K., El Rhabi, M., Hakim, A. et al. A Primal-Dual algorithm for nonnegative N-th order CP tensor decomposition: application to fluorescence spectroscopy data analysis. Multidim Syst Sign Process 33, 665–682 (2022). https://doi.org/10.1007/s11045-021-00818-4

