Abstract
Canonical correlation analysis (CCA) is a widely used technique for analyzing two datasets (two views of the same objects). However, CCA requires the samples of the two views to be fully paired. In practice, we often face the semi-paired scenario, where only a limited number of paired samples is available but unpaired samples are plentiful. In such a scenario, CCA is prone to overfitting and thus performs poorly, because by definition it can use only the paired samples. To overcome this shortcoming, several semi-paired variants of CCA have been proposed. However, these methods use the unpaired samples only in a single-view learning manner, capturing the structural information of each individual view in order to regularize CCA. Intuitively, using unpaired samples in a two-view learning manner should be more natural and more attractive, since CCA itself is a two-view learning method. Therefore, a novel semi-paired variant of CCA named Neighborhood Correlation Analysis (NeCA), which uses unpaired samples in a two-view learning manner, is developed by incorporating between-view neighborhood relationships into CCA. These relationships are obtained by combining the within-view neighborhood relationships of all data in each view (both paired and unpaired) with the between-view pairing information. Thus, NeCA can take fuller advantage of the unpaired samples and effectively mitigate the overfitting caused by the limited paired data. Promising experimental results on several popular multi-view datasets show its feasibility and effectiveness.
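To make the pairing requirement concrete, the following is a minimal sketch of classical CCA (not NeCA) in NumPy. The function name `cca_first_pair`, the small ridge term `reg`, and the synthetic data are assumptions introduced here for illustration and do not come from the paper. The cross-covariance is built row by row from matched samples, so a row of one view without a partner in the other view cannot contribute to it.

```python
import numpy as np

def cca_first_pair(X, Y, reg=1e-6):
    """Return (wx, wy, rho) for the first canonical pair of paired data.

    X is (n, p) and Y is (n, q); row i of X and row i of Y must describe
    the same object. `reg` is a small ridge term (an assumption of this
    sketch) that keeps the covariance matrices well conditioned.
    """
    if X.shape[0] != Y.shape[0]:
        raise ValueError("CCA needs paired samples: equal row counts required")
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n  # only matched rows enter this term
    # Whiten both views with Cholesky factors: T = Lx^{-1} Cxy Ly^{-T}
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    T = np.linalg.solve(Lx, Cxy)
    T = np.linalg.solve(Ly, T.T).T
    # The largest singular value of T is the first canonical correlation
    U, s, Vt = np.linalg.svd(T)
    wx = np.linalg.solve(Lx.T, U[:, 0])  # map back: wx = Lx^{-T} u
    wy = np.linalg.solve(Ly.T, Vt[0])    # map back: wy = Ly^{-T} v
    return wx, wy, s[0]

# Synthetic two-view data sharing one latent signal z (illustrative only)
rng = np.random.default_rng(0)
z = rng.standard_normal(100)
X = np.column_stack([z + 0.1 * rng.standard_normal(100),
                     rng.standard_normal(100)])
Y = np.column_stack([rng.standard_normal(100),
                     z + 0.1 * rng.standard_normal(100)])
wx, wy, rho = cca_first_pair(X, Y)
print("first canonical correlation:", rho)
```

When `n` is small relative to the feature dimensions, the sample covariances above are poorly estimated, which is the overfitting risk the abstract describes for the semi-paired setting.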
Cite this article
Zhou, X., Chen, X. & Chen, S. Neighborhood Correlation Analysis for Semi-paired Two-View Data. Neural Process Lett 37, 335–354 (2013). https://doi.org/10.1007/s11063-012-9251-z
DOI: https://doi.org/10.1007/s11063-012-9251-z