Abstract
In this paper, we propose a general framework for transfer learning, referred to as transfer sparse subspace learning (TSSL). The framework accommodates different assumptions on the divergence measure between data distributions, such as maximum mean discrepancy, Bregman divergence, and Kullback-Leibler (K-L) divergence. We introduce an effective sparse regularization into the proposed transfer subspace learning framework, which significantly reduces time and space costs and, more importantly, avoids or at least mitigates over-fitting. We derive solutions for the different distribution distance estimation criteria and provide a convergence analysis. Comprehensive experiments on text data sets and face image data sets demonstrate that TSSL-based methods outperform existing transfer learning methods.
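(For orientation, a hedged illustration rather than the paper's exact formulation: the empirical maximum mean discrepancy between source samples $X_s = \{x_i^s\}_{i=1}^{n_s}$ and target samples $X_t = \{x_j^t\}_{j=1}^{n_t}$, measured in an RKHS $\mathcal{H}$ with feature map $\phi$, is commonly written as

\[ \mathrm{MMD}(X_s, X_t) \;=\; \Bigl\| \frac{1}{n_s}\sum_{i=1}^{n_s}\phi(x_i^s) \;-\; \frac{1}{n_t}\sum_{j=1}^{n_t}\phi(x_j^t) \Bigr\|_{\mathcal{H}}, \]

and a common sparse regularizer on a projection matrix $W \in \mathbb{R}^{d \times m}$ is the $\ell_{2,1}$-norm

\[ \|W\|_{2,1} \;=\; \sum_{i=1}^{d} \Bigl( \sum_{j=1}^{m} W_{ij}^2 \Bigr)^{1/2}, \]

which drives entire rows of $W$ to zero and thereby performs feature selection. Whether TSSL adopts exactly these forms is specified in the body of the paper.)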




Acknowledgments
We would like to thank Sinno Jialin Pan and Si Si for providing the code of transfer component analysis and transfer subspace learning. We would also like to express our appreciation to the editors and reviewers for their contributions to improving the quality of our paper. We gratefully acknowledge the support of the National Natural Science Foundation of China under Grants No. 60975038 and No. 61005003.
Cite this article
Yang, S., Lin, M., Hou, C. et al. A general framework for transfer sparse subspace learning. Neural Comput & Applic 21, 1801–1817 (2012). https://doi.org/10.1007/s00521-012-1084-1