Multiview Metric Learning with Global Consistency and Local Smoothness

Authors:

Wen Gao

ACM Transactions on Intelligent Systems and Technology (TIST), Volume 3, Issue 3

Article No.: 53, Pages 1 - 22

https://doi.org/10.1145/2168752.2168767

Published: 01 May 2012

Abstract

In many real-world applications, the same object may have different observations (or descriptions) in multiple observation spaces (views), which are highly related but can look quite different from one another. Conventional metric-learning methods compute distance metrics satisfactorily for data in a single-view observation space, but they handle data sampled from multiview observation spaces poorly, especially when the data have highly nonlinear structure. To tackle this problem, we propose a new method called Multiview Metric Learning with Global consistency and Local smoothness (MVML-GL), which operates in a semisupervised learning setting and jointly considers global consistency and local smoothness. The basic idea is to reveal the shared latent feature space of the multiview observations by imposing global consistency constraints and preserving local geometric structures. The framework consists of two main steps. In the first step, we seek a globally consistent shared latent feature space, which not only preserves the local geometric structure in each space but also places labeled corresponding instances as close together as possible. In the second step, the explicit mapping functions between the input spaces and the shared latent space are learned via regularized locally linear regression. Both steps can be solved by convex optimization in closed form. Experimental results on manifold alignment with real-world datasets of pose and facial expression demonstrate the effectiveness of the proposed method.
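
The two-step structure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' MVML-GL formulation, which the abstract does not spell out. Step 1 is stood in for by a graph-Laplacian joint embedding in the spirit of semisupervised manifold alignment: each view contributes a k-nearest-neighbor Laplacian (local smoothness), and labeled correspondences are pulled together by a quadratic penalty (global consistency). Step 2 is stood in for by a ridge regression fitted on the query's nearest neighbors. The function names and the parameters `k`, `mu`, and `lam` are assumptions introduced only for illustration.

```python
import numpy as np

def knn_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetrized k-nearest-neighbor graph."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]   # column 0 is the point itself
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                     # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def shared_embedding(X1, X2, pairs, dim=2, mu=10.0, k=5):
    """Step 1 (illustrative): closed-form joint embedding of two views.

    Minimizes tr(Y^T M Y) subject to Y^T Y = I, where M stacks the two
    per-view Laplacians and adds mu * ||y1_i - y2_j||^2 for each labeled
    correspondence (i, j). The solution is given by eigenvectors of M.
    """
    n1, n2 = X1.shape[0], X2.shape[0]
    M = np.zeros((n1 + n2, n1 + n2))
    M[:n1, :n1] = knn_laplacian(X1, k)
    M[n1:, n1:] = knn_laplacian(X2, k)
    for i, j in pairs:
        a, b = i, n1 + j
        M[a, a] += mu
        M[b, b] += mu
        M[a, b] -= mu
        M[b, a] -= mu
    _, vecs = np.linalg.eigh(M)                # eigenvalues in ascending order
    Y = vecs[:, 1:dim + 1]                     # skip the constant eigenvector
    return Y[:n1], Y[n1:]

def local_ridge_predict(X, Y, x_query, k=10, lam=1e-2):
    """Step 2 (illustrative): map a new point into the latent space.

    Fits an affine ridge regression on the k training points nearest to
    x_query and evaluates it at x_query; the fit is closed form.
    """
    nb = np.argsort(((X - x_query) ** 2).sum(axis=1))[:k]
    Xn = np.hstack([X[nb], np.ones((k, 1))])   # append a bias column
    A = Xn.T @ Xn + lam * np.eye(Xn.shape[1])
    W = np.linalg.solve(A, Xn.T @ Y[nb])
    return np.append(x_query, 1.0) @ W
```

In this sketch, `shared_embedding(X1, X2, pairs)` returns latent coordinates for the training points of both views, and `local_ridge_predict(X1, Y1, x_new)` then places an unseen view-1 point in the same space. Euclidean distance in that space plays the role of the learned cross-view metric. Skipping only the first eigenvector assumes the combined graph is connected.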

Index Terms

1. Computing methodologies
  1. Machine learning

Published In

ACM Transactions on Intelligent Systems and Technology Volume 3, Issue 3

May 2012

384 pages

ISSN: 2157-6904

EISSN: 2157-6912

DOI: 10.1145/2168752

Copyright © 2012 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 May 2012

Received: 01 November 2011

Accepted: 01 May 2011

Revised: 01 March 2011

Qualifiers

Research-article
Research
Refereed
