Relatively-Paired Space Analysis: Learning a Latent Common Space From Relatively-Paired Observations

International Journal of Computer Vision

Abstract

Discovering a latent common space between different modalities plays an important role in cross-modality pattern recognition. Existing techniques often require absolutely-paired observations as training data and cannot capture more general semantic relationships between cross-modality observations, which greatly limits their applicability. In this paper, we propose a general framework for learning a latent common space from relatively-paired observations (i.e., two observations from different modalities are more likely to be paired than another two). Relative-pairing information is encoded using the relative proximities of observations in the latent common space. By building a discriminative model and maximizing a distance margin, a projection function that maps observations into the latent common space is learned for each modality. Cross-modality pattern recognition can then be carried out in the latent common space. To speed up learning on large-scale training data, the problem is reformulated as learning a structural model, which is solved efficiently by the cutting plane algorithm. To evaluate the proposed framework, we apply it to feature fusion, cross-pose face recognition, text-image retrieval, and attribute-image retrieval. Experimental results demonstrate that the proposed framework outperforms other state-of-the-art approaches.
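As an illustration of how relative-pairing constraints can be encoded through relative proximities, the sketch below evaluates a margin-based hinge loss over such constraints. It is a minimal sketch under assumptions that the abstract does not state: the per-modality projections are taken to be linear maps, proximity is measured by squared Euclidean distance, and the function name, argument names, and constraint format are hypothetical. It only scores a given pair of projections; it does not reproduce the structural reformulation or the cutting plane solver.

```python
import numpy as np

def relative_pair_hinge_loss(Wx, Wy, X, Y, constraints, margin=1.0):
    """Average hinge loss over relative-pairing constraints (illustrative only).

    Wx, Wy      : assumed linear projections, shapes (d, dx) and (d, dy).
    X, Y        : observations from the two modalities, one row per sample.
    constraints : iterable of (i, j, k, l), read as "the pair (X[i], Y[j]) is
                  more likely paired than the pair (X[k], Y[l])".
    """
    Px = X @ Wx.T  # modality-1 samples mapped into the latent common space
    Py = Y @ Wy.T  # modality-2 samples mapped into the latent common space
    losses = []
    for i, j, k, l in constraints:
        d_close = np.sum((Px[i] - Py[j]) ** 2)  # should be the smaller distance
        d_far = np.sum((Px[k] - Py[l]) ** 2)    # should exceed d_close by the margin
        losses.append(max(0.0, margin + d_close - d_far))
    return float(np.mean(losses))

# Toy data: identity projections on 2-D observations.
W = np.eye(2)
X = np.array([[0.0, 0.0], [1.0, 0.0]])
Y = np.array([[0.0, 0.0], [5.0, 0.0]])

satisfied = relative_pair_hinge_loss(W, W, X, Y, [(0, 0, 1, 1)])
violated = relative_pair_hinge_loss(W, W, X, Y, [(1, 1, 0, 0)])
print(satisfied, violated)
```

In this toy setup the first constraint is satisfied with room to spare, so its loss is zero, while the reversed constraint is penalized. A learner would adjust `Wx` and `Wy` to reduce this loss, typically together with a regularization term.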

Notes

  1. The number of variables is very small.

  2. http://archive.ics.uci.edu/ml/datasets/Multiple+Features.

  3. http://vasc.ri.cmu.edu/idb/html/face/.

  4. http://vipl.ict.ac.cn/members/mnkan.

Author information

Corresponding author

Correspondence to Zhanghui Kuang.

Additional information

Communicated by Tilo Burghardt, Majid Mirmehdi, Walterio Mayol, and Dima Damen.

About this article

Cite this article

Kuang, Z., Wong, KY.K. Relatively-Paired Space Analysis: Learning a Latent Common Space From Relatively-Paired Observations. Int J Comput Vis 113, 176–192 (2015). https://doi.org/10.1007/s11263-014-0783-8

  • DOI: https://doi.org/10.1007/s11263-014-0783-8
