
Joint Robust Transfer Metric and Adaptive Transfer Function Learning

Neural Processing Letters

Abstract

Finding the right distance metric is a central challenge in machine learning and computer vision, and adapting a good classifier to the learned metric is equally important. Traditional machine learning assumes that training and test data are drawn from the same distribution, an assumption that fails for many real-world data sets; methods built on it therefore perform poorly when the distributions of the training (source) and test (target) domains differ. In this paper, we present an efficient method to overcome this issue. A projection matrix is learned through a cross-domain metric that maps source and target samples into a new feature space in which the distance between the two domains is reduced via marginal and conditional adaptation terms, and the metric is made more robust through marginalized de-noising and low-rank strategies. In parallel with learning the projection matrix, an adaptive decision function is learned by minimizing the empirical risk while maximizing the consistency of the classifier with the manifold structure of the data; a distribution adaptation term is also incorporated into the classifier's learning procedure to further reduce the distance between the two domains. The proposed technique is evaluated on several image categorization datasets, and the experimental results show that it compares favorably with state-of-the-art methods.
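To make the marginal-adaptation idea concrete, the following is a minimal illustrative sketch (not the paper's actual formulation): source and target samples are mapped through a shared projection matrix, and the distance between the two domains is measured as the squared difference of their projected means, i.e. an empirical maximum mean discrepancy with a linear kernel. All names here (`mmd_linear`, `A`, `d`, `k`) are hypothetical.

```python
import numpy as np

def mmd_linear(Xs, Xt, A):
    """Squared distance between the projected means of two domains.

    Xs, Xt: (n_s, d) and (n_t, d) sample matrices for source and target.
    A: (d, k) projection matrix shared by both domains.
    """
    mu_s = (Xs @ A).mean(axis=0)   # mean of projected source samples
    mu_t = (Xt @ A).mean(axis=0)   # mean of projected target samples
    return float(np.sum((mu_s - mu_t) ** 2))

rng = np.random.default_rng(0)
d, k = 5, 2                                 # original / projected dimensionality
Xs = rng.normal(0.0, 1.0, size=(100, d))    # source domain
Xt = rng.normal(0.5, 1.0, size=(120, d))    # target domain with shifted mean
A = rng.normal(size=(d, k))                 # a candidate projection

# A smaller value indicates better-aligned domains; a metric-learning method
# of this kind would optimize A (with further terms) to drive this down.
print(mmd_linear(Xs, Xt, A))
```

The conditional adaptation term described in the abstract extends the same idea class-by-class, comparing the projected means of same-class samples (using pseudo-labels on the unlabeled target domain).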





Author information

Correspondence to Fatemeh Afsari.


Cite this article

Azarbarzin, S., Afsari, F. Joint Robust Transfer Metric and Adaptive Transfer Function Learning. Neural Process Lett 51, 1411–1443 (2020). https://doi.org/10.1007/s11063-019-10152-3
