Abstract
Finding the right distance metric is one of the main challenges in machine learning and computer vision; pairing the learned metric with a good classifier is equally important. Traditional machine learning assumes that training and test data are drawn from the same distribution, but this assumption rarely holds for real-world data sets, so conventional methods perform poorly when the training (source) and test (target) domain distributions differ. In this paper, we present an efficient method to overcome this issue. A projection matrix is learned through a cross-domain metric that maps the source and target samples into a new feature space, where the distance between the two domains is reduced via marginal and conditional adaptation terms. The metric is further strengthened by marginalized de-noising and low-rank strategies. In parallel with learning the projection matrix, an adaptive decision function is learned by minimizing the empirical risk while maximizing the consistency of the classifier with the manifold structure of the data. A distribution adaptation term is also incorporated into the classifier's learning procedure to further reduce the distance between the two domains. The proposed technique is evaluated on several image categorization datasets, and the experimental results show that it compares favorably with state-of-the-art methods.
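To make the marginal-adaptation idea concrete, the sketch below learns a projection that reduces the distance between the means of two domains in a shared subspace while preserving data variance, in the spirit of transfer component analysis. This is a minimal illustrative simplification, not the paper's exact formulation (it omits the conditional adaptation, de-noising, low-rank, and classifier terms); the function name `mmd_projection` and the regularizer `mu` are assumptions.

```python
import numpy as np
from scipy.linalg import eigh

def mmd_projection(Xs, Xt, dim=2, mu=0.1):
    """Learn a projection W that shrinks the marginal-distribution gap
    (empirical MMD between domain means) of source Xs and target Xt,
    while keeping as much data variance as possible (TCA-style)."""
    X = np.vstack([Xs, Xt])                      # stacked samples, (ns+nt, d)
    ns, nt = len(Xs), len(Xt)
    n = ns + nt
    # Coefficient vector whose outer product encodes the mean discrepancy:
    # X.T @ e equals (source mean - target mean)
    e = np.vstack([np.ones((ns, 1)) / ns, -np.ones((nt, 1)) / nt])
    M = e @ e.T
    # Centering matrix H, used to measure retained variance
    H = np.eye(n) - np.ones((n, n)) / n
    # Maximize variance (B) relative to domain discrepancy plus ridge (A)
    A = X.T @ M @ X + mu * np.eye(X.shape[1])
    B = X.T @ H @ X
    vals, vecs = eigh(B, A)                      # generalized eigenproblem
    W = vecs[:, np.argsort(-vals)[:dim]]         # top `dim` directions
    return W

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, (50, 5))               # source domain
Xt = rng.normal(2.0, 1.0, (50, 5))               # shifted target domain
W = mmd_projection(Xs, Xt, dim=2)
gap_before = np.linalg.norm(Xs.mean(0) - Xt.mean(0))
gap_after = np.linalg.norm((Xs @ W).mean(0) - (Xt @ W).mean(0))
```

After projection, the gap between the empirical domain means is much smaller than in the original space, which is the effect the marginal adaptation term in the paper's objective is designed to achieve.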
Cite this article
Azarbarzin, S., Afsari, F. Joint Robust Transfer Metric and Adaptive Transfer Function Learning. Neural Process Lett 51, 1411–1443 (2020). https://doi.org/10.1007/s11063-019-10152-3