Relatively-Paired Space Analysis: Learning a Latent Common Space From Relatively-Paired Observations

Kuang, Zhanghui; Wong, Kwan-Yee K.

doi:10.1007/s11263-014-0783-8


Published: 12 November 2014

Volume 113, pages 176–192, (2015)

International Journal of Computer Vision



Abstract

Discovering a latent common space between different modalities plays an important role in cross-modality pattern recognition. Existing techniques often require absolutely-paired observations as training data, and are incapable of capturing more general semantic relationships between cross-modality observations. This greatly limits their applications. In this paper, we propose a general framework for learning a latent common space from relatively-paired observations (i.e., two observations from different modalities are more-likely-paired than another two). Relative-pairing information is encoded using relative proximities of observations in the latent common space. By building a discriminative model and maximizing a distance margin, a projection function that maps observations into the latent common space is learned for each modality. Cross-modality pattern recognition can then be carried out in the latent common space. To speed up the learning procedure for large-scale training data, the problem is reformulated into learning a structural model, which is efficiently solved by the cutting plane algorithm. To evaluate the performance of the proposed framework, it has been applied to feature fusion, cross-pose face recognition, text-image retrieval and attribute-image retrieval. Experimental results demonstrate that the proposed framework outperforms other state-of-the-art approaches.
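
The idea of encoding relative pairing as relative proximity with a distance margin can be illustrated with a small sketch. This is not the formulation from the paper, which is not shown in this preview. It assumes linear projections, squared Euclidean distance, and a generic hinge penalty; the names `project`, `sq_dist`, `relative_hinge_loss`, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]
# (i, j, k, l) means: the pair (X[i], Y[j]) is more likely paired
# than the pair (X[k], Y[l]). This encoding is an assumption.
Constraint = Tuple[int, int, int, int]


def project(W: Matrix, v: Vector) -> List[float]:
    """Map v into the common space with the linear projection W (assumed linear)."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two points in the common space."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def relative_hinge_loss(
    Wx: Matrix,
    Wy: Matrix,
    X: Sequence[Vector],
    Y: Sequence[Vector],
    constraints: Sequence[Constraint],
    margin: float = 1.0,
) -> float:
    """Mean hinge penalty over relative-pairing constraints.

    A constraint costs nothing when the more-likely pair is closer than
    the less-likely pair by at least `margin` in the common space.
    """
    if not constraints:
        return 0.0
    zx = [project(Wx, x) for x in X]
    zy = [project(Wy, y) for y in Y]
    total = 0.0
    for i, j, k, l in constraints:
        closer = sq_dist(zx[i], zy[j])
        farther = sq_dist(zx[k], zy[l])
        total += max(0.0, margin + closer - farther)
    return total / len(constraints)


# Toy data: modality X is 2-D, modality Y is 3-D, common space is 2-D.
X = [[1.0, 0.0], [0.0, 1.0]]
Y = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
constraints = [(0, 0, 0, 1), (1, 1, 1, 0)]

Wx = [[1.0, 0.0], [0.0, 1.0]]
Wy_good = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # aligns Y[j] with X[j]
Wy_bad = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]   # swaps the two Y points

loss_good = relative_hinge_loss(Wx, Wy_good, X, Y, constraints)
loss_bad = relative_hinge_loss(Wx, Wy_bad, X, Y, constraints)
print(loss_good, loss_bad)
```

In this toy setting the aligned projection satisfies both constraints with the required margin, while the swapped projection violates both. A learning procedure would search for projections that drive such a penalty down, typically with an added regularizer.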


Notes

The number of variables is very small.
http://archive.ics.uci.edu/ml/datasets/Multiple+Features.
http://vasc.ri.cmu.edu/idb/html/face/.
http://vipl.ict.ac.cn/members/mnkan.



Author information

Authors and Affiliations

Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong
Zhanghui Kuang & Kwan-Yee K. Wong


Corresponding author

Correspondence to Zhanghui Kuang.

Additional information

Communicated by Tilo Burghardt, Majid Mirmehdi, Walterio Mayol and Dima Damen.


About this article

Cite this article

Kuang, Z., Wong, KY.K. Relatively-Paired Space Analysis: Learning a Latent Common Space From Relatively-Paired Observations. Int J Comput Vis 113, 176–192 (2015). https://doi.org/10.1007/s11263-014-0783-8


Received: 29 April 2014
Accepted: 16 October 2014
Published: 12 November 2014
Issue Date: July 2015
DOI: https://doi.org/10.1007/s11263-014-0783-8
