Abstract
This paper concerns maximizing the sum of coupled traces of quadratic and linear matrix forms. The coupling comes from requiring that the matrix variables in the quadratic and linear matrix forms, when packed together, have orthonormal columns. At a maximum, the KKT condition becomes a nonlinear polar decomposition (NPD) of a matrix-valued function that depends on the orthogonal polar factor. A self-consistent-field (SCF) iteration and a locally optimal conjugate gradient (LOCG) acceleration of it are proposed to compute the NPD. It is proved that both methods are convergent, and it is demonstrated numerically that the LOCG acceleration is very effective. As applications, we demonstrate our methods on the MAXBET subproblem and the multi-view partially shared subspace learning (MvPS) subproblem, which are the computational kernels of two multi-view subspace learning models. We also demonstrate MvPS on several real-world data sets.
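To make the fixed-point structure concrete, the following is a minimal sketch of a self-consistent-field (SCF) iteration on a simplified single-block model; it is not the algorithm of this paper. It assumes the objective \(f(P)=\operatorname{tr}(P^{\mathrm T}AP)+2\operatorname{tr}(P^{\mathrm T}D)\) with \(A\) symmetric positive semidefinite and \(P\) having orthonormal columns. The names `polar_factor`, `objective`, and `scf_polar` and the random test data are illustrative assumptions. Each step replaces \(P\) by the orthogonal polar factor of \(H(P)=AP+D\), so a fixed point satisfies \(H(P)=P\Lambda\) with \(\Lambda\) symmetric positive semidefinite, i.e., a polar decomposition whose left-hand side depends on its own polar factor.

```python
import numpy as np

def polar_factor(H):
    """Orthogonal polar factor of a tall matrix H: thin SVD H = U S V^T gives U V^T."""
    U, _, Vt = np.linalg.svd(H, full_matrices=False)
    return U @ Vt

def objective(A, D, P):
    """Assumed model objective f(P) = tr(P^T A P) + 2 tr(P^T D)."""
    return np.trace(P.T @ A @ P) + 2.0 * np.trace(P.T @ D)

def scf_polar(A, D, P0, tol=1e-10, max_iter=500):
    """Iterate P <- polar factor of H(P) = A P + D until the objective stalls."""
    P = P0
    history = [objective(A, D, P)]
    for _ in range(max_iter):
        H = A @ P + D              # half of the Euclidean gradient 2(AP + D)
        P = polar_factor(H)        # positive scaling of H does not change the factor
        history.append(objective(A, D, P))
        if history[-1] - history[-2] <= tol * max(1.0, abs(history[-1])):
            break
    return P, history

# Illustrative random data (an assumption, not taken from the paper).
rng = np.random.default_rng(0)
n, k = 30, 4
B = rng.standard_normal((n, n))
A = B @ B.T                        # symmetric positive semidefinite
D = rng.standard_normal((n, k))
P0, _ = np.linalg.qr(rng.standard_normal((n, k)))

P, history = scf_polar(A, D, P0)
print("iterations:", len(history) - 1)
print("orthonormality error:", np.linalg.norm(P.T @ P - np.eye(k)))
```

Because \(A\) is assumed positive semidefinite, \(f\) is convex, and each polar-factor update maximizes the linearization of \(f\) over matrices with orthonormal columns. That is why the objective values recorded in `history` should be nondecreasing in this simplified model.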




Notes
Computing the SVD of an \(n\times k\) matrix takes \(6nk^2+20k^3\) flops [18, p.493].
A MATLAB toolbox for OPTimization on MANifolds available online at https://www.manopt.org/.
A toolbox for STiefel manifold OPtimization available online at https://stmopt.gitee.io/.
All numerical demonstrations in this paper were done in MATLAB on a laptop with an Intel(R) Core(TM) i7-1165G7 CPU at 2.80 GHz and 32 GB of memory, except those in Sect. 5.2.2, which were performed on an EXXACT workstation (www.exxactcorp.com).
Up to this point, \(X_j\) as in (1.1) has been used for an orthonormal projection matrix. For the rest of this section, \(X_j\) will instead denote a data matrix, as is done conventionally. Hopefully, no confusion will arise.
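As a quick check of the SVD cost quoted in the first note above, the snippet below evaluates \(6nk^2+20k^3\) for a tall matrix. The function name `svd_flops` and the sizes are illustrative assumptions; the count is the estimate cited from [18], not a measurement.

```python
def svd_flops(n: int, k: int) -> int:
    """Flop estimate 6*n*k^2 + 20*k^3 for the SVD of an n-by-k matrix."""
    return 6 * n * k**2 + 20 * k**3

n, k = 1000, 10
total = svd_flops(n, k)                  # 600000 + 20000 = 620000
leading_share = 6 * n * k**2 / total     # fraction contributed by the n*k^2 term
print(total, leading_share)
```

For \(n\gg k\) the \(6nk^2\) term dominates, so the estimated cost grows linearly in \(n\) for fixed \(k\).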
References
Absil, P.A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton (2008)
Anderson, E., Bai, Z., Bischof, C., Demmel, J., Dongarra, J., Croz, J.D., Greenbaum, A., Hammarling, S., McKenney, A., Ostrouchov, S., Sorensen, D.: LAPACK Users’ Guide, 3rd edn. SIAM, Philadelphia (1999)
Bai, Z., Demmel, J., Dongarra, J., Ruhe, A., van der Vorst, H. (eds.): Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide. SIAM, Philadelphia (2000)
Bai, Z., Li, R.C., Lu, D.: Sharp estimation of convergence rate for self-consistent field iteration to solve eigenvector-dependent nonlinear eigenvalue problems. SIAM J. Matrix Anal. Appl. 43(1), 301–327 (2022)
Balogh, J., Csendes, T., Rapcsák, T.: Some global optimization problems on Stiefel manifolds. J. Glob. Optim. 30, 91–101 (2004)
Birtea, P., Caşu, I., Comănescu, D.: First order optimality conditions and steepest descent algorithm on orthogonal Stiefel manifolds. Optim. Lett. 13, 1773–1791 (2019)
Bolla, M., Michaletzky, G., Tusnády, G., Ziermann, M.: Extrema of sums of heterogeneous quadratic forms. Linear Algebra Appl. 269(1), 331–365 (1998). https://doi.org/10.1016/S0024-3795(97)00230-9
Borg, I., Lingoes, J.: Multidimensional Similarity Structure Analysis. Springer-Verlag, New York (1987)
Boumal, N., Mishra, B., Absil, P.A., Sepulchre, R.: Manopt, a Matlab toolbox for optimization on manifolds. J. Mach. Learn. Res. 15(42), 1455–1459 (2014)
Cai, Y., Zhang, L.H., Bai, Z., Li, R.C.: On an eigenvector-dependent nonlinear eigenvalue problem. SIAM J. Matrix Anal. Appl. 39(3), 1360–1382 (2018)
Chu, M.T., Trendafilov, N.T.: The orthogonally constrained regression revisited. J. Comput. Graph. Stat. 10(4), 746–771 (2001)
Cunningham, J.P., Ghahramani, Z.: Linear dimensionality reduction: survey, insights, and generalizations. J. Mach. Learn. Res. 16, 2859–2900 (2015)
Demmel, J.: Applied Numerical Linear Algebra. SIAM, Philadelphia (1997)
Edelman, A., Arias, T.A., Smith, S.T.: The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl. 20(2), 303–353 (1999)
Eldén, L., Park, H.: A Procrustes problem on the Stiefel manifold. Numer. Math. 82, 599–619 (1999)
Fan, K.: On a theorem of Weyl concerning eigenvalues of linear transformations. I. Proc. Natl. Acad. Sci. USA 35(11), 652–655 (1949)
Gao, B., Liu, X., Chen, X., Yuan, Y.X.: A new first-order algorithmic framework for optimization problems with orthogonality constraints. SIAM J. Optim. 28(1), 302–332 (2018). https://doi.org/10.1137/16M1098759
Golub, G.H., Van Loan, C.F.: Matrix Computations, 4th edn. Johns Hopkins University Press, Baltimore (2013)
Gower, J.C., Dijksterhuis, G.B.: Procrustes Problems. Oxford University Press, New York (2004)
Hardoon, D.R., Szedmak, S., Shawe-Taylor, J.: Canonical correlation analysis: an overview with application to learning methods. Neural Comput. 16, 2639–2664 (2004)
Horn, R.A., Johnson, C.R.: Matrix Analysis, 2nd edn. Cambridge University Press, New York (2013)
Hotelling, H.: Relations between two sets of variates. Biometrika 28(3–4), 321–377 (1936)
Hurley, J.R., Cattell, R.B.: The Procrustes program: producing direct rotation to test a hypothesized factor structure. Comput. Behav. Sci. 7, 258–262 (1962)
Imakura, A., Li, R.C., Zhang, S.L.: Locally optimal and heavy ball GMRES methods. Jpn. J. Ind. Appl. Math. 33, 471–499 (2016)
Kanzow, C., Qi, H.D.: A QP-free constrained Newton-type method for variational inequality problems. Math. Program. 85, 81–106 (1999)
Knyazev, A.V.: Toward the optimal preconditioned eigensolver: locally optimal block preconditioned conjugate gradient method. SIAM J. Sci. Comput. 23(2), 517–541 (2001)
Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), vol. 2, pp. 2169–2178. IEEE (2006)
Li, F.F., Fergus, R., Perona, P.: Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories. Comput. Vis. Image Underst. 106(1), 59–70 (2007)
Li, L., Zhang, Z.: Semi-supervised domain adaptation by covariance matching. IEEE Trans. Pattern Anal. Mach. Intell. 41(11), 2724–2739 (2019). https://doi.org/10.1109/TPAMI.2018.2866846
Li, R.C.: New perturbation bounds for the unitary polar factor. SIAM J. Matrix Anal. Appl. 16, 327–332 (1995)
Li, R.C.: Relative perturbation bounds for the unitary polar factor. BIT 37, 67–75 (1997)
Li, R.C.: Matrix perturbation theory. In: Hogben, L., Brualdi, R., Stewart, G.W. (eds.) Handbook of Linear Algebra, 2nd edn, Chapter 21. CRC Press, Boca Raton (2014)
Li, R.C.: Rayleigh quotient based optimization methods for eigenvalue problems. In: Bai, Z., Gao, W., Su, Y. (eds.) Matrix Functions and Matrix Equations, Series in Contemporary Applied Mathematics, vol. 19, pp. 76–108. World Scientific, Singapore (2015)
Li, Y., Yang, M., Zhang, Z.: A survey of multi-view representation learning. IEEE Trans. Knowl. Data Eng. 31(10), 1863–1883 (2018)
Liu, X.G., Wang, X.F., Wang, W.G.: Maximization of matrix trace function of product Stiefel manifolds. SIAM J. Matrix Anal. Appl. 36(4), 1489–1506 (2015)
Ma, X., Shen, C., Wang, L., Zhang, L.H., Li, R.C.: A self-consistent-field iteration for MAXBET with an application to multi-view feature extraction. Adv. Comput. Math. 48, 13 (2022)
Ma, X., Wang, L., Zhang, L.H., Shen, C., Li, R.C.: Multi-view partially shared subspace learning (2021). Submitted
Moré, J., Sorensen, D.: Computing a trust region step. SIAM J. Sci. Stat. Comput. 4(3), 553–572 (1983)
Nie, F., Zhang, R., Li, X.: A generalized power iteration method for solving quadratic problem on the Stiefel manifold. Sci. China Inf. Sci. 60, 1–10 (2017)
Nielsen, A.A.: Multiset canonical correlations analysis and multispectral, truly multitemporal remote sensing data. IEEE Trans. Image Process. 11(3), 293–305 (2002)
Nocedal, J., Wright, S.: Numerical Optimization, 2nd edn. Springer (2006)
Ojala, T., Pietikäinen, M., Mäenpää, T.: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 971–987 (2002)
Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42(3), 145–175 (2001)
Parlett, B.N.: The Symmetric Eigenvalue Problem. SIAM, Philadelphia (1998). This SIAM edition is an unabridged, corrected reproduction of the work first published by Prentice-Hall Inc., Englewood Cliffs, New Jersey (1980)
Polyak, B.T.: Introduction to Optimization. Optimization Software, New York (1987)
Rapcsák, T.: On minimization on Stiefel manifolds. Eur. J. Oper. Res. 143(2), 365–376 (2002)
Saad, Y.: Numerical Methods for Large Eigenvalue Problems. Manchester University Press, Manchester (1992)
Sharma, A., Kumar, A., Daume, H., Jacobs, D.W.: Generalized multiview analysis: a discriminative latent space. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2160–2167 (2012)
Stewart, G.W.: Matrix Algorithms, Eigensystems, vol. II. SIAM, Philadelphia (2001)
Sun, J.G.: Matrix Perturbation Analysis. Graduate Texts (Academia, Sinica), 2nd edn. Science Publisher, Beijing (2001). (in Chinese)
Takahashi, I.: A note on the conjugate gradient method. Inf. Process. Jpn. 5, 45–49 (1965)
Ten Berge, J.M.F.: Generalized approaches to the MAXBET problem and the MAXDIFF problem, with applications to canonical correlations. Psychometrika 53(4), 487–494 (1988)
Van de Geer, J.P.: Linear relations among \(k\) sets of variables. Psychometrika 49(1), 70–94 (1984)
Wang, L., Gao, B., Liu, X.: Multipliers correction methods for optimization problems over the Stiefel manifold. CSIAM Trans. Appl. Math. 2(3), 508–531 (2021). https://doi.org/10.4208/csiam-am.SO-2020-0008
Wang, L., Li, R.C.: A scalable algorithm for large-scale unsupervised multi-view partial least squares. IEEE Trans. Big Data (2020). https://doi.org/10.1109/TBDATA.2020.3014937
Wen, Z., Yin, W.: A feasible method for optimization with orthogonality constraints. Math. Program. 142(1–2), 397–434 (2013)
Wu, J., Rehg, J.M.: Where am I: place instance and category recognition using spatial PACT. In: 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2008)
Yang, M., Li, R.C.: Heavy ball flexible GMRES method for nonsymmetric linear systems. J. Comput. Math. (2021). To appear
Zhang, L.H.: Riemannian trust-region method for the maximal correlation problem. Numer. Funct. Anal. Optim. 33(3), 338–362 (2012)
Zhang, L.H., Li, R.C.: Maximization of the sum of the trace ratio on the Stiefel manifold, I: theory. Sci. China Math. 57(12), 2495–2508 (2014)
Zhang, L.H., Li, R.C.: Maximization of the sum of the trace ratio on the Stiefel manifold, II: computation. Sci. China Math. 58(7), 1549–1566 (2015)
Zhang, L.H., Wang, L., Bai, Z., Li, R.C.: A self-consistent-field iteration for orthogonal canonical correlation analysis. IEEE Trans. Pattern Anal. Mach. Intell. 44(2), 890–904 (2022). https://doi.org/10.1109/TPAMI.2020.3012541
Zhang, L.H., Yang, W.H., Shen, C., Ying, J.: An eigenvalue-based method for the unbalanced Procrustes problem. SIAM J. Matrix Anal. Appl. 41(3), 957–983 (2020)
Zhang, Z., Du, K.: Successive projection method for solving the unbalanced Procrustes problem. Sci. China Math. 49(7), 971–986 (2006)
Zhao, H., Wang, Z., Nie, F.: Orthogonal least squares regression for feature extraction. Neurocomputing 216, 200–207 (2016)
Zhou, Y., Li, R.C.: Bounding the spectrum of large Hermitian matrices. Linear Algebra Appl. 435, 480–493 (2011)
Acknowledgements
The authors wish to thank the two anonymous referees for their constructive suggestions that greatly improved the presentation of this paper. They are indebted to Prof. M. Overton of New York University for his numerous minor but important corrections across the manuscript. Wang was supported in part by NSF DMS-2009689; Zhang was supported in part by the National Natural Science Foundation of China NSFC-12071332; Li was supported in part by NSF DMS-1719620 and DMS-2009689.
About this article
Cite this article
Wang, L., Zhang, LH. & Li, RC. Maximizing sum of coupled traces with applications. Numer. Math. 152, 587–629 (2022). https://doi.org/10.1007/s00211-022-01322-y
DOI: https://doi.org/10.1007/s00211-022-01322-y