Dual structural consistency based multi-modal correlation propagation projections for data representation

Multimedia Tools and Applications

Abstract

Canonical correlation analysis (CCA) is a powerful tool for analyzing multi-dimensional paired data. However, when facing semi-supervised multi-modal data (also called multi-view (Hou et al., Pattern Recogn 43(3):720–730, 2010) or multi-represented (Kailing et al., Clustering multi-represented objects with noise. In: Proceedings of the eighth Pacific-Asia conference on knowledge discovery and data mining (PAKDD), Sydney, Australia, pp 394–403) data; for convenience, we uniformly call them multi-modal data hereafter), which widely exist in real-world applications, CCA usually performs poorly because it ignores the available supervised information. Meanwhile, owing to the limited number of labeled training samples in the semi-supervised scenario, supervised extensions of CCA suffer from overfitting. Several semi-supervised extensions of CCA have been proposed recently. Nevertheless, they either exploit only the global structural information captured from the unlabeled data, or propagate label information by discovering, in advance, the affinities between the labeled and unlabeled data points alone. In this paper, we propose a robust multi-modal semi-supervised feature extraction and fusion framework, termed dual structural consistency based multi-modal correlation propagation projections (SCMCPP). SCMCPP guarantees the consistency between the representation structure and the hypotaxis structure within each modality, and ensures the consistency of the hypotaxis structure between the two modalities. By iteratively propagating labels and learning affinities, the discriminative information of both given and estimated labels is utilized to improve the affinity construction and to infer the remaining unknown labels. Moreover, probabilistic within-class scatter matrices in each modality and a probabilistic correlation matrix between the two modalities are constructed to enhance the discriminative power of the extracted features.
Extensive experiments on several benchmark face databases demonstrate the effectiveness of our approach.


Notes

  1. It means that we can fully believe that a data point belongs to a certain category. In this paper, it refers to the data points whose probabilistic label value for one category is, in both modalities, much larger than the values for all other categories.

  2. This is easy to see in the first iteration; moreover, after each iteration the label-propagated data points of the two modalities remain pair-wise, as described below.

  3. Because this method deals with semi-paired scenarios, and to distinguish it from the SemiCCA in [30], we rename it SemiPCCA in this paper.

References

  1. Andrew G, Arora R, Bilmes J, Livescu K (2013) Deep canonical correlation analysis. In: International Conference on Machine Learning (ICML), pp 1247–1255

  2. Daubechies I (1990) The wavelet transform, time-frequency localization and signal analysis. IEEE Trans Inf Theory 36(5):961–1005

  3. Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm with application to wavelet-based image deblurring. In: IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 693–696

  4. Belhumeur PN, Hespanha JP, Kriegman DJ (1997) Eigenfaces vs Fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell 19(7):711–720

  5. Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122

  6. Cai D, He X, Han J (2007) Semi-supervised discriminant analysis. In: IEEE 11th international conference on computer vision 2007. ICCV 2007, pp 1–7

  7. Camastra F, Vinciarelli A (2002) Estimating the intrinsic dimension of data with a fractal-based method. IEEE Trans Pattern Anal Mach Intell 24(10):1404–1407

  8. Chapelle O (2006) Semi-supervised learning. MIT Press, Cambridge

  9. Chen X, Chen S, Xue H, Zhou X (2012) A unified dimensionality reduction framework for semi-paired and semi-supervised multi-view data. Pattern Recogn 45:2005–2018

  10. Chibelushi CC, Deravi F, Mason JSD (2002) A review of speech-based bimodal recognition. IEEE Trans Multimedia 4(1):23–37

  11. Chu DL, Liao LZ, Ng MK, Zhang XW (2013) Sparse canonical correlation analysis: new formulation and algorithm. IEEE Trans Pattern Anal Mach Intell 35(12):3050–3065

  12. Elhamifar E, Vidal R (2013) Sparse subspace clustering: algorithm, theory, and applications. IEEE Trans Pattern Anal Mach Intell 35(11):2765–2781

  13. Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugenics 7(2):179–188

  14. Fu Y, Yan S, Huang TS (2008) Correlation metric for generalized feature extraction. IEEE Trans Pattern Anal Mach Intell 30(12):2229–2235

  15. Guan N, Zhang X, Luo Z, Lan L (2012) Sparse representation based discriminative canonical correlation analysis for face recognition. In: 2012 IEEE 11th international conference on machine learning and applications (ICMLA), pp 51–56

  16. Hardoon DR, Shawe-Tayler JR (2011) Sparse canonical correlation analysis. Mach Learn J 83(3):331–353

  17. He X, Yan S, Hu Y, Niyogi P, Zhang HJ (2005) Face recognition using laplacianfaces. IEEE Trans Pattern Anal Mach Intell 27(3):328–340

  18. Hong M, Luo Z (2013) On the linear convergence of the alternating direction method of multipliers. [Online]. Available: http://arxiv.org/pdf/1208.3922v3.pdf

  19. Hotelling H (1936) Relations between two sets of variates. Biometrika 28(3/4):321–377

  20. Hou C, Zhang C, Wu Y, Nie F (2010) Multiple view semi-supervised dimensionality reduction. Pattern Recogn 43(3):720–730

  21. Jain AK, Duin RPW, Mao J (2000) Statistical pattern recognition: a review. IEEE Trans Pattern Anal Mach Intell 22(1):4–37

  22. Ji HK, Shen XB, Sun QS, Ji ZX (2015) Sparse discrimination based multiset canonical correlation analysis for multi-feature fusion and recognition. In: Proceedings of the 26th British machine vision conference (BMVC). Swansea, Britain

  23. Jolliffe IT (1986) Principal component analysis. Springer, New York

  24. Kimura A, Kameoka H, Sugiyama M, Nakano T, Maeda E, Sakano H, Ishiguro K (2010) SemiCCA: efficient semi-supervised learning of canonical correlations. In: IEEE international conference on pattern recognition (ICPR), Istanbul, pp 2933–2936

  25. Lampert CH, Kromer O (2010) Weakly-paired maximum covariance analysis for multimodal dimensionality reduction and transfer learning. In: Proceedings of the 11th European conference on computer vision. Hersonissos, Greece, pp 566–579

  26. Li CG, Lin ZC, Zhang HG, Guo J (2015) Learning semi-supervised representation towards a unified optimization framework for semi-supervised learning. In: Proceedings of the 15th international conference on computer vision (ICCV), Santiago, Chile

  27. Lu J, Zhou X, Tan YP, Shang Y, Zhou J (2012) Cost-sensitive semi-supervised discriminant analysis for face recognition. IEEE Trans Inf Forensic Secur 7(3):944–953

  28. Martinez AM, Benavente R (1998) The AR face database, CVC technical report #24

  29. Melzer T, Reiter M, Bischof H (2003) Appearance models based on kernel canonical correlation analysis. Pattern Recogn 36(9):1961–1971

  30. Peng Y, Zhang D (2008) Semi-supervised canonical correlation analysis algorithm. J Softw 19:2822–2832

  31. Peng Y, Zhang D, Zhang J (2010) A new canonical correlation analysis algorithm with local discrimination. Neural Process Lett 31:1–15

  32. Sargin M, Yemez Y, Erzin E, Tekalp A (2007) Audio-visual synchronization and fusion using canonical correlation analysis. IEEE Trans Multimedia 9(7):1396–1403

  33. Shen XB, Sun QS (2014) A novel semi-supervised canonical correlation analysis and extensions for multi-view dimensionality reduction. J Vis Commun Image Represent 25:1894–1904

  34. Sim T, Baker S, Bsat M (2003) The CMU pose, illumination, and expression database. IEEE Trans Pattern Anal Mach Intell 25(12):1615–1618

  35. Slaney M, Covell M (2000) FaceSync: a linear operator for measuring synchronization of video facial images and audio tracks. In: Annual Conference on Neural Information Processing Systems (NIPS), pp 814–820

  36. Song Y, Nie F, Zhang C, Xiang S (2008) A unified framework for semi-supervised dimensionality reduction. Pattern Recogn 41:2789–2799

  37. Sugiyama M, Ide T, Nakajima S, Sese J (2010) Semi-supervised local fisher discriminant analysis for dimensionality reduction. Mach Learn 78:35–61

  38. Sun TK, Chen SC (2007) Locality preserving CCA with applications to data visualization and pose estimation. Image Vis Comput 25(5):531–543

  39. Sun TK, Chen SC, Yang JY, Shi PF (2008) A supervised combined feature extraction method for recognition. In: IEEE international conference on data mining (ICDM), pp 1043–1048

  40. Sun L, Ji S, Ye J (2011) Canonical correlation analysis for multilabel classification: a least-squares formulation, extensions, and analysis. IEEE Trans Pattern Anal Mach Intell 33(1):194–200

  41. Sun QS, Liu ZD, Heng PA, Xia DS (2005) A theorem on the generalized canonical projective vectors. Pattern Recogn 38(3):449–452

  42. Sun QS, Zeng SG, Liu Y, Heng PA, Xia DS (2005) A new method of feature fusion and its application in image recognition. Pattern Recogn 38(12):2437–2448

  43. Sun QS, Zeng SG, Wang PA, Xia DS (2005) The theory of canonical correlation analysis with its applications to feature fusion. Chin J Comput 28(9):1524–1533

  44. Ting Y, Mei T, Ngo CW (2015) Learning query and image similarities with ranking canonical correlation analysis. In: Proceedings of the 15th international conference on computer vision (ICCV). Santiago, Chile

  45. Turk M, Pentland A (1991) Eigenfaces for recognition. J Cogn Neurosci 3(1):71–86

  46. Waaijenborg S, de Witt Hamer PCV, Zwinderman AH (2008) Quantifying the association between gene expressions and dna-markers by penalized canonical correlation analysis. Stat Appl Genet Mol Biol 7(1), Article 3

  47. Wang WR, Arora R, Livescu K, Bilmes J (2015) Unsupervised learning of acoustic features via deep canonical correlation analysis. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 4590–4594

  48. Wang WR, Arora R, Srebro N, Livescu K (2015) Stochastic optimization for deep CCA via nonlinear orthogonal iterations. 53rd annual Allerton Conference on communication, control, and computing

  49. Warfield S (1996) Fast k-NN classification for multichannel image data. Pattern Recogn Lett 17(7):713–721

  50. Witten DM, Tibshirani R (2009) Extensions of sparse canonical correlation analysis with applications to genomic data. Stat Appl Genet Mol Biol 8(1), Article 28

  51. Witten DM, Tibshirani R, Hastie T (2009) A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics 10(3):515–534

  52. Wright J, Yang A, Sastry S, Ma Y (2009) Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 31(2):210–227

  53. Xu C, Tao DC, Xu C (2014) Large-margin multi-view Information bottleneck. IEEE Trans Pattern Anal Mach Intell 36(8):1559–1572

  54. Xu C, Tao DC, Xu C (2015) Multi-view intact space learning. IEEE Trans Pattern Anal Mach Intell 37(12):2531–2544

  55. Yang M, Van Gool L, Zhang L (2013) Sparse variation dictionary learning for face recognition with a single training sample per person. In: 14th IEEE international conference on computer vision (ICCV), pp 689–696

  56. Yu J, Tao D, Rui Y, Cheng J (2013) Pairwise constraints based multiview features fusion for scene classification. Pattern Recogn 46:483–496

  57. Zhang G, Jiang Z, Davis LS (2013) Online semi-supervised discriminative dictionary learning for sparse representation. Asian conference on computer vision (ACCV), pp 259–273

  58. Zhang GQ, Sun HJ, Ji ZX, Sun QS (2015) Label propagation based on collaborative representation for face recognition. Neurocomputing 171:1193–1204

  59. Zhang L, Yang M, Feng X (2011) Sparse representation or collaborative representation: which helps face recognition? In: IEEE international conference on computer vision (ICCV), Barcelona, pp 471–478

  60. Zhang D, Zhou Z-H, Chen S (2007) Semi-supervised dimensionality reduction. In: SIAM International Conference on Data Mining (SDM), pp 629–634

Acknowledgments

This work is supported in part by Graduate Research and Innovation Foundation of Jiangsu Province, China under Grant KYLX15_0379, in part by the National Natural Science Foundation of China under Grants 61273251, 61401209, and 61402203, in part by the Natural Science Foundation of Jiangsu Province under Grant BK20140790, and in part by China Postdoctoral Science Foundation under Grants 2014 T70525 and 2013 M531364.

Author information

Correspondence to Hong-Kun Ji or Quan-Sen Sun.

Appendix

The proof of Theorem 1

From Eqs. (6) and (7) we can derive

$$ {\left\Vert \Gamma \odot A\right\Vert}_1=\sum_{i,j}\left|{\Gamma}_{ij}\cdot {A}_{ij}\right|=\sum_{i,j}\left|{\Gamma}_{ij}\cdot \frac{1}{2}\left(\left|{W}_{ij}\right|+\left|{W}_{ji}\right|\right)\right|. $$

Since \( {\Gamma}_{ij}=\frac{1}{2}{\left\Vert {p}_i-{p}_j\right\Vert}^2={\Gamma}_{ji}\ge 0 \), swapping the indices \( i \) and \( j \) in the second term and using the symmetry of \( \Gamma \) gives

$$ \begin{aligned} {\left\Vert \Gamma \odot A\right\Vert}_1 &= \sum_{i,j}\frac{1}{2}\left({\Gamma}_{ij}\left|{W}_{ij}\right|+{\Gamma}_{ji}\left|{W}_{ji}\right|\right) \\ &= \sum_{i,j}{\Gamma}_{ij}\left|{W}_{ij}\right| \\ &= \sum_{i,j}\left|{\Gamma}_{ij}\cdot {W}_{ij}\right| \\ &= {\left\Vert \Gamma \odot W\right\Vert}_1. \end{aligned} $$
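As an informal numerical sanity check (not part of the original proof), the identity \( {\left\Vert \Gamma \odot A\right\Vert}_1={\left\Vert \Gamma \odot W\right\Vert}_1 \) can be confirmed for a random, possibly asymmetric \( W \); the construction below mirrors the definitions above with illustrative variable names:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
W = rng.standard_normal((n, n))             # arbitrary (asymmetric) weight matrix
A = 0.5 * (np.abs(W) + np.abs(W).T)         # symmetrized affinity: A_ij = (|W_ij| + |W_ji|) / 2
P = rng.random((n, 3))                      # rows p_i used to build Gamma
# Gamma_ij = 0.5 * ||p_i - p_j||^2, symmetric and nonnegative by construction
Gamma = 0.5 * np.sum((P[:, None, :] - P[None, :, :]) ** 2, axis=2)

lhs = np.sum(np.abs(Gamma * A))             # ||Gamma ⊙ A||_1
rhs = np.sum(np.abs(Gamma * W))             # ||Gamma ⊙ W||_1
assert np.isclose(lhs, rhs)                 # the two weighted L1 norms coincide
```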

The proof of Theorem 2

Owing to the fact that \( {\tilde{S}}_w^{(X)} \) and \( {\tilde{S}}_w^{(Y)} \) are both non-singular square matrices, we have \( \operatorname{rank}\left({\left({\tilde{S}}_w^{(X)}\right)}^{-1}{\tilde{L}}_{xy}{\left({\tilde{S}}_w^{(Y)}\right)}^{-1}{\tilde{L}}_{xy}^T\right)=\operatorname{rank}\left({\left({\tilde{S}}_w^{(Y)}\right)}^{-1}{\tilde{L}}_{xy}^T{\left({\tilde{S}}_w^{(X)}\right)}^{-1}{\tilde{L}}_{xy}\right)=\operatorname{rank}\left({\tilde{L}}_{xy}\right) \), and we obtain the inequality

$$ \operatorname{rank}\left({\tilde{L}}_{xy}\right)\le \min\left\{\operatorname{rank}\left(X{P}^{(X)T}\right),\ \operatorname{rank}\left({P}^{(Y)}{Y}^T\right)\right\}. $$

Furthermore, let \( {P}_{i\circ}^{(X)} \) denote the i-th row of \( {P}^{(X)} \), where \( i=1,2,\cdots, c \). Then we can obtain the following equality:

$$ X{P}^{(X)T}=X\cdot \left[{P}_{1\circ}^{(X)T},{P}_{2\circ}^{(X)T},\cdots, {P}_{c\circ}^{(X)T}\right]=\left[X{P}_{1\circ}^{(X)T},X{P}_{2\circ}^{(X)T},\cdots, X{P}_{c\circ}^{(X)T}\right]. $$

From the definition of the hard label and Eq. (16), for each \( {p}_j^{(X)} \) in \( {P}^{(X)}=\left[{p}_1^{(X)},{p}_2^{(X)},\cdots, {p}_N^{(X)}\right] \) where \( j=1,2,\cdots, N \), we have \( {\sum}_{i=1}^c{p}_{j,i}^{(X)}=1 \). So we can obtain the following equality:

$$ {P}_{1\circ}^{(X)T}+{P}_{2\circ}^{(X)T}+\cdots +{P}_{c\circ}^{(X)T}=\left[\sum_{i=1}^c{p}_{1,i}^{(X)};\ \sum_{i=1}^c{p}_{2,i}^{(X)};\ \cdots;\ \sum_{i=1}^c{p}_{N,i}^{(X)}\right]={1}_{N\times 1}. $$

Then

$$ X{P}_{1\circ}^{(X)T}+X{P}_{2\circ}^{(X)T}+\cdots +X{P}_{c\circ}^{(X)T}=X\cdot {1}_{N\times 1}=\sum_{j=1}^N{x}_j={0}_{v\times 1}, $$

which means that the columns \( X{P}_{i\circ}^{(X)T} \) are linearly dependent. Therefore, \( \operatorname{rank}\left(X{P}^{(X)T}\right)\le \min\left\{v,c-1\right\} \). Similarly, we can get \( \operatorname{rank}\left({P}^{(Y)}{Y}^T\right)\le \min\left\{q,c-1\right\} \). Thus, we have \( \operatorname{rank}\left({\tilde{L}}_{xy}\right)\le \min\left\{v,q,c-1\right\} \).

Since \( c \) is always much smaller than \( v \) and \( q \), we have \( \operatorname{rank}\left({\tilde{L}}_{xy}\right)\le c-1 \). From Eq. (26), \( \alpha \) and \( \beta \) are calculated as the eigenvectors of \( {\left({\tilde{S}}_w^{(X)}\right)}^{-1}{\tilde{L}}_{xy}{\left({\tilde{S}}_w^{(Y)}\right)}^{-1}{\tilde{L}}_{xy}^T \) and \( {\left({\tilde{S}}_w^{(Y)}\right)}^{-1}{\tilde{L}}_{xy}^T{\left({\tilde{S}}_w^{(X)}\right)}^{-1}{\tilde{L}}_{xy} \), respectively. Therefore, we infer that SCMCPP yields at most \( c-1 \) pairs of projection directions.
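The rank argument can likewise be illustrated numerically. The sketch below (with illustrative names, not taken from the paper) builds a centered data matrix \( X \) and a probabilistic label matrix \( {P}^{(X)} \) whose columns each sum to one, and checks that the columns of \( X{P}^{(X)T} \) sum to the zero vector, which forces \( \operatorname{rank}\left(X{P}^{(X)T}\right)\le c-1 \):

```python
import numpy as np

rng = np.random.default_rng(1)
v, N, c = 8, 20, 4                           # data dim, sample count, class count
X = rng.standard_normal((v, N))
X -= X.mean(axis=1, keepdims=True)           # center the data: sum_j x_j = 0

P = rng.random((c, N))
P /= P.sum(axis=0, keepdims=True)            # each column is a probabilistic label summing to 1

M = X @ P.T                                  # the v-by-c matrix X P^(X)T
# Columns of M sum to X @ 1_N = 0, so they are linearly dependent
# and the rank is bounded by c - 1.
assert np.allclose(M.sum(axis=1), 0)
assert np.linalg.matrix_rank(M) <= c - 1
```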


Cite this article

Ji, HK., Sun, QS., Yuan, YH. et al. Dual structural consistency based multi-modal correlation propagation projections for data representation. Multimed Tools Appl 76, 20909–20933 (2017). https://doi.org/10.1007/s11042-016-3993-y
