Abstract
Canonical correlation analysis (CCA) is a powerful tool for analyzing multi-dimensional paired data. However, on semi-supervised multi-modal data (also called multi-view (Hou et al., Pattern Recogn 43(3):720–730, 2010) or multi-represented (Kailing et al., PAKDD 2004, Sydney, Australia, pp 394–403) data; for convenience, we uniformly call it multi-modal data hereafter), which is widespread in real-world applications, CCA usually performs poorly because it ignores the available supervised information. Supervised extensions of CCA, in turn, tend to overfit the limited labeled training samples available in the semi-supervised scenario. Several semi-supervised extensions of CCA have been proposed recently, but they either exploit only the global structural information captured from the unlabeled data, or propagate label information by computing, in advance, affinities only between the labeled and unlabeled data points. In this paper, we propose a robust multi-modal semi-supervised feature extraction and fusion framework, termed dual structural consistency based multi-modal correlation propagation projections (SCMCPP). SCMCPP guarantees the consistency between the representation structure and the hypotaxis structure within each modality, and enforces the consistency of the hypotaxis structure across the two modalities. By iteratively propagating labels and learning affinities, the discriminative information of both the given and the estimated labels is used to improve the affinity construction and to infer the remaining unknown labels. Moreover, probabilistic within-class scatter matrices in each modality and a probabilistic correlation matrix between the two modalities are constructed to enhance the discriminative power of the extracted features.
Extensive experiments on several benchmark face databases demonstrate the effectiveness of our approach.
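The alternating scheme described above — propagate labels, then refine affinities using the current soft labels — can be sketched as follows. This is a minimal illustration, not the actual SCMCPP algorithm: the Gaussian affinity, the symmetric normalization, and the parameters `alpha` and `sigma` are all assumptions made for the sketch.

```python
import numpy as np

def rbf_affinity(Z, sigma=1.0):
    # Pairwise Gaussian affinity with a zeroed diagonal.
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def propagate_labels(X, Y0, n_iter=5, alpha=0.9):
    """Alternate label propagation and affinity refinement.

    X:  (N, d) feature matrix of one modality.
    Y0: (N, c) one-hot rows for labeled points, zero rows for unlabeled.
    """
    F = Y0.copy()
    for _ in range(n_iter):
        # Affinity is rebuilt from features plus current soft labels,
        # so estimated labels feed back into the graph construction.
        W = rbf_affinity(np.hstack([X, F]))
        D = W.sum(axis=1)
        S = W / np.sqrt(np.outer(D, D))          # symmetric normalization
        F = alpha * S @ F + (1.0 - alpha) * Y0   # propagate, clamp labeled points
        F = F / np.maximum(F.sum(axis=1, keepdims=True), 1e-12)  # rows -> probabilities
    return F
```

On a toy two-cluster problem with one labeled point per class, the unlabeled points inherit the label of their cluster after a few iterations.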
Notes
It means that we can fully believe that a data point belongs to a certain category. In this paper, it refers to data points whose probabilistic label value for one category, in both modalities, is much larger than the values for all other categories.
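For illustration, the dominance rule in this note might be checked as follows; the function name and the `margin` threshold are hypothetical and not part of the paper.

```python
import numpy as np

def is_hard_label(p_x, p_y, margin=0.5):
    """Hypothetical check: treat a point as hard-labeled when, in both
    modalities, the largest class probability exceeds the runner-up by
    `margin` and both maxima point to the same class."""
    def dominant(p):
        p = np.asarray(p, dtype=float)
        top2 = np.sort(p)[-2:]          # two largest probabilities
        return p.argmax(), top2[1] - top2[0]
    cx, mx = dominant(p_x)
    cy, my = dominant(p_y)
    return bool(cx == cy and mx >= margin and my >= margin)
```

A point with a clearly dominant, agreeing class in both modalities passes; one with a weak margin in either modality, or with disagreeing modalities, does not.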
This holds trivially in the first iteration; moreover, the data points whose labels are propagated in the two modalities after each iteration are pair-wise, as described below.
Since this method deals with semi-paired scenarios, and to distinguish it from SemiCCA in [30], we rename it SemiPCCA in this paper.
References
Andrew G, Arora R, Bilmes J, Livescu K (2013) Deep canonical correlation analysis. In: International Conference on Machine Learning (ICML), pp 1247–1255
Daubechies I (1990) The wavelet transform, time-frequency localization and signal analysis. IEEE Trans Inf Theory 36(5):961–1005
Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm with application to wavelet-based image deblurring. In: IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 693–696
Belhumeur PN, Hespanha JP, Kriegman DJ (1997) Eigenfaces vs Fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell 19(7):711–720
Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122
Cai D, He X, Han J (2007) Semi-supervised discriminant analysis. In: IEEE 11th international conference on computer vision 2007. ICCV 2007, pp 1–7
Camastra F, Vinciarelli A (2002) Estimating the intrinsic dimension of data with a fractal-based method. IEEE Trans Pattern Anal Mach Intell 24(10):1404–1407
Chapelle O, Schölkopf B, Zien A (eds) (2006) Semi-supervised learning. MIT Press, Cambridge
Chen X, Chen S, Xue H, Zhou X (2012) A unified dimensionality reduction framework for semi-paired and semi-supervised multi-view data. Pattern Recogn 45:2005–2018
Chibelushi CC, Deravi F, Mason JSD (2002) A review of speech-based bimodal recognition. IEEE Trans Multimedia 4(1):23–37
Chu DL, Liao LZ, Ng MK, Zhang XW (2013) Sparse canonical correlation analysis: new formulation and algorithm. IEEE Trans Pattern Anal Mach Intell 35(12):3050–3065
Elhamifar E, Vidal R (2013) Sparse subspace clustering: algorithm, theory, and applications. IEEE Trans Pattern Anal Mach Intell 35(11):2765–2781
Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugenics 7(2):179–188
Fu Y, Yan S, Huang TS (2008) Correlation metric for generalized feature extraction. IEEE Trans Pattern Anal Mach Intell 30(12):2229–2235
Guan N, Zhang X, Luo Z, Lan L (2012) Sparse representation based discriminative canonical correlation analysis for face recognition. In: 2012 I.E. 11th international conference on machine learning and applications (ICMLA), pp 51–56
Hardoon DR, Shawe-Tayler JR (2011) Sparse canonical correlation analysis. Mach Learn J 83(3):331–353
He X, Yan S, Hu Y, Niyogi P, Zhang HJ (2005) Face recognition using laplacianfaces. IEEE Trans Pattern Anal Mach Intell 27(3):328–340
Hong M, Luo Z (2013) On the linear convergence of the alternating direction method of multipliers. [Online]. Available: http://arxiv.org/pdf/1208.3922v3.pdf
Hotelling H (1936) Relations between two sets of variates. Biometrika 28(3/4):321–377
Hou C, Zhang C, Wu Y, Nie F (2010) Multiple view semi-supervised dimensionality reduction. Pattern Recogn 43(3):720–730
Jain AK, Duin RPW, Mao J (2000) Statistical pattern recognition: a review. IEEE Trans Pattern Anal Mach Intell 22(1):4–37
Ji HK, Shen XB, Sun QS, Ji ZX (2015) Sparse discrimination based multiset canonical correlation analysis for multi-feature fusion and recognition. In: Proceedings of the 26th British machine vision conference (BMVC). Swansea, Britain
Jolliffe IT (1986) Principal component analysis. Springer, New York
Kimura A, Kameoka H, Sugiyama M, Nakano T, Maeda E, Sakano H, Ishiguro K (2010) SemiCCA: efficient semi-supervised learning of canonical correlations. In: IEEE International Conference on Pattern Recognition (ICPR), Istanbul, pp 2933–2936
Lampert CH, Kromer O (2010) Weakly-paired maximum covariance analysis for multimodal dimensionality reduction and transfer learning. In: Proceedings of the 11th European conference on computer vision. Hersonissos, Greece, pp 566–579
Li CG, Lin ZC, Zhang HG, Guo J (2015) Learning semi-supervised representation towards a unified optimization framework for semi-supervised learning. In: Proceedings of the 15th international conference on computer vision (ICCV), Santiago, Chile
Lu J, Zhou X, Tan YP, Shang Y, Zhou J (2012) Cost-sensitive semi-supervised discriminant analysis for face recognition. IEEE Trans Inf Forensic Secur 7(3):944–953
Martinez AM, Benavente R (1998) The AR face database, CVC technical report #24
Melzer T, Reiter M, Bischof H (2003) Appearance models based on kernel canonical correlation analysis. Pattern Recogn 36(9):1961–1971
Peng Y, Zhang D (2008) Semi-supervised canonical correlation analysis algorithm. J Softw 19:2822–2832
Peng Y, Zhang D, Zhang J (2010) A new canonical correlation analysis algorithm with local discrimination. Neural Process Lett 31:1–15
Sargin M, Yemez Y, Erzin E, Tekalp A (2007) Audio-visual synchronization and fusion using canonical correlation analysis. IEEE Trans Multimedia 9(7):1396–1403
Shen XB, Sun QS (2014) A novel semi-supervised canonical correlation analysis and extensions for multi-view dimensionality reduction. J Vis Commun Image Represent 25:1894–1904
Sim T, Baker S, Bsat M (2003) The CMU pose, illumination, and expression database. IEEE Trans Pattern Anal Mach Intell 25(12):1615–1618
Slaney M, Covell M (2000) FaceSync: a linear operator for measuring synchronization of video facial images and audio tracks. In: Annual Conference on Neural Information Processing Systems (NIPS), pp 814–820
Song Y, Nie F, Zhang C, Xiang S (2008) A unified framework for semi-supervised dimensionality reduction. Pattern Recogn 41:2789–2799
Sugiyama M, Ide T, Nakajima S, Sese J (2010) Semi-supervised local fisher discriminant analysis for dimensionality reduction. Mach Learn 78:35–61
Sun TK, Chen SC (2007) Locality preserving CCA with applications to data visualization and pose estimation. Image Vis Comput 25(5):531–543
Sun TK, Chen SC, Yang JY, Shi PF (2008) A supervised combined feature extraction method for recognition. In: IEEE International Conference on Data Mining (ICDM), pp 1043–1048
Sun L, Ji S, Ye J (2011) Canonical correlation analysis for multilabel classification: a least-squares formulation, extensions, and analysis. IEEE Trans Pattern Anal Mach Intell 33(1):194–200
Sun QS, Liu ZD, Heng PA, Xia DS (2005) A theorem on the generalized canonical projective vectors. Pattern Recogn 38(3):449–452
Sun QS, Zeng SG, Liu Y, Heng PA, Xia DS (2005) A new method of feature fusion and its application in image recognition. Pattern Recogn 38(12):2437–2448
Sun QS, Zeng SG, Wang PA, Xia DS (2005) The theory of canonical correlation analysis with its applications to feature fusion. Chin J Comput 28(9):1524–1533
Ting Y, Mei T, Ngo CW (2015) Learning query and image similarities with ranking canonical correlation analysis. In: Proceedings of the 15th international conference on computer vision (ICCV). Santiago, Chile
Turk M, Pentland A (1991) Eigenfaces for recognition. J Cogn Neurosci 3(1):71–86
Waaijenborg S, de Witt Hamer PCV, Zwinderman AH (2008) Quantifying the association between gene expressions and DNA-markers by penalized canonical correlation analysis. Stat Appl Genet Mol Biol 7(1), Article 3
Wang WR, Arora R, Livescu K, Bilmes J (2015) Unsupervised learning of acoustic features via deep canonical correlation analysis. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 4590–4594
Wang WR, Arora R, Srebro N, Livescu K (2015) Stochastic optimization for deep CCA via nonlinear orthogonal iterations. 53rd annual Allerton Conference on communication, control, and computing
Warfield S (1996) Fast k-NN classification for multichannel image data. Pattern Recogn Lett 17(7):713–721
Witten DM, Tibshirani R (2009) Extensions of sparse canonical correlation analysis with applications to genomic data. Stat Appl Genet Mol Biol 8(1), Article 28
Witten DM, Tibshirani R, Hastie T (2009) A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics 10(3):515–534
Wright J, Yang A, Sastry S, Ma Y (2009) Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 31(2):210–227
Xu C, Tao DC, Xu C (2014) Large-margin multi-view Information bottleneck. IEEE Trans Pattern Anal Mach Intell 36(8):1559–1572
Xu C, Tao DC, Xu C (2015) Multi-view intact space learning. IEEE Trans Pattern Anal Mach Intell 37(12):2531–2544
Yang M, Van Gool L, Zhang L (2013) Sparse variation dictionary learning for face recognition with a single training sample per person. In: IEEE International Conference on Computer Vision (ICCV), pp 689–696
Yu J, Tao D, Rui Y, Cheng J (2013) Pairwise constraints based multiview features fusion for scene classification. Pattern Recogn 46:483–496
Zhang G, Jiang Z, Davis LS (2013) Online semi-supervised discriminative dictionary learning for sparse representation. Asian conference on computer vision (ACCV), pp 259–273
Zhang GQ, Sun HJ, Ji ZX, Sun QS (2015) Label propagation based on collaborative representation for face recognition. Neurocomputing 171:1193–1204
Zhang L, Yang M, Feng X (2011) Sparse representation or collaborative representation: Which helps face recognition? IEEE International Conference on Computer Vision (ICCV), Barcelona, pp 471–478
Zhang D, Zhou Z-H, Chen S (2007) Semi-supervised dimensionality reduction. In: SIAM International Conference on Data Mining (SDM), pp 629–634
Acknowledgments
This work is supported in part by Graduate Research and Innovation Foundation of Jiangsu Province, China under Grant KYLX15_0379, in part by the National Natural Science Foundation of China under Grants 61273251, 61401209, and 61402203, in part by the Natural Science Foundation of Jiangsu Province under Grant BK20140790, and in part by China Postdoctoral Science Foundation under Grants 2014 T70525 and 2013 M531364.
Author information
Authors and Affiliations
Corresponding authors
Appendix
The proof of Theorem 1
We can derive the formula from Eqs. (6) and (7) as
Since \( {\Gamma}_{ij}=\frac{1}{2}{\left\Vert {p}_i-{p}_j\right\Vert}^2={\Gamma}_{ji}\ge 0 \), it follows that
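As a quick numerical sanity check (purely illustrative, with random vectors standing in for the label vectors \( p_i \)), the symmetry and nonnegativity of \( \Gamma \) can be verified:

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.random((3, 5))                       # columns p_j stand in for label vectors
diff = P.T[:, None, :] - P.T[None, :, :]     # p_i - p_j for all pairs (i, j)
Gamma = 0.5 * (diff ** 2).sum(-1)            # Gamma_ij = 0.5 * ||p_i - p_j||^2
assert np.allclose(Gamma, Gamma.T)           # Gamma_ij = Gamma_ji
assert (Gamma >= 0).all()                    # Gamma_ij >= 0
```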
The proof of Theorem 2
Since \( {\overset{\sim }{S}}_w^{(X)} \) and \( {\overset{\sim }{S}}_w^{(Y)} \) are both non-singular square matrices, we have \( rank\left({\left({\overset{\sim }{S}}_w^{(X)}\right)}^{-1}{\overset{\sim }{L}}_{xy}{\left({\overset{\sim }{S}}_w^{(Y)}\right)}^{-1}{\overset{\sim }{L}}_{xy}^T\right)= rank\left({\left({\overset{\sim }{S}}_w^{(Y)}\right)}^{-1}{\overset{\sim }{L}}_{xy}^T{\left({\overset{\sim }{S}}_w^{(X)}\right)}^{-1}{\overset{\sim }{L}}_{xy}\right)= rank\left({\overset{\sim }{L}}_{xy}\right) \). Hence we obtain the following inequality
Furthermore, let \( {P}_{i\circ}^{(X)} \) denote the i-th row of \( {P}^{(X)} \), where \( i=1,2,\cdots, c \). Then we can obtain the following equality:
From the definition of the hard label and Eq. (16), for each \( {p}_j^{(X)} \) in \( {P}^{(X)}=\left[{p}_1^{(X)},{p}_2^{(X)},\cdots, {p}_N^{(X)}\right] \) where \( j=1,2,\cdots, N \), we have \( {\sum}_{i=1}^c{p}_{j,i}^{(X)}=1 \). So we can obtain the following equality:
Then
which means that all \( X{P}_{i\circ}^{(X)T} \) are linearly dependent. Therefore, \( rank\left(X{P}^{(X)T}\right)\le min\left\{v,c-1\right\} \). Similarly, we can get \( rank\left({P}^{(Y)}{Y}^T\right)\le min\left\{q,c-1\right\} \). Thus, we have \( rank\left({\overset{\sim }{L}}_{xy}\right)\le min\left\{v,q,c-1\right\} \).
Since c is typically much smaller than v and q, we have \( rank\left({\overset{\sim }{L}}_{xy}\right)\le c-1 \). From Eq. (26), α and β are obtained as the eigenvectors of \( {\left({\overset{\sim }{S}}_w^{(X)}\right)}^{-1}{\overset{\sim }{L}}_{xy}{\left({\overset{\sim }{S}}_w^{(Y)}\right)}^{-1}{\overset{\sim }{L}}_{xy}^T \) and \( {\left({\overset{\sim }{S}}_w^{(Y)}\right)}^{-1}{\overset{\sim }{L}}_{xy}^T{\left({\overset{\sim }{S}}_w^{(X)}\right)}^{-1}{\overset{\sim }{L}}_{xy} \), respectively. Therefore, SCMCPP yields at most \( c-1 \) pairs of projection directions.
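The rank bound can be checked numerically. The sketch below assumes, as the proof does via Eq. (16), that each column of the probabilistic label matrix sums to one, and additionally that the data matrix is centered (so \( X\mathbf{1}_N=0 \)); the dimensions and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
v, N, c = 10, 50, 4
X = rng.standard_normal((v, N))
X -= X.mean(axis=1, keepdims=True)       # center the data: X @ 1_N = 0
P = rng.random((c, N))
P /= P.sum(axis=0, keepdims=True)        # each probabilistic label column sums to 1
M = X @ P.T                              # plays the role of X P^(X)T in the proof
# Columns of M sum to X @ 1_N = 0, so they are linearly dependent
# and the rank is at most c - 1:
assert np.linalg.matrix_rank(M) <= c - 1
```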
Cite this article
Ji, HK., Sun, QS., Yuan, YH. et al. Dual structural consistency based multi-modal correlation propagation projections for data representation. Multimed Tools Appl 76, 20909–20933 (2017). https://doi.org/10.1007/s11042-016-3993-y