Abstract
Real-world applications often involve data that are high-dimensional, sparse, noisy, and not independent and identically distributed (non-i.i.d.). In this paper, we handle such complex data in a transfer learning framework and propose a robust non-negative matrix factorization model with joint sparse and graph regularization for transfer learning. First, we apply a robust non-negative matrix factorization model with sparse regularization (RSNMF) to the source domain data to learn a meaningful matrix that captures information common to the source and target domains. Second, we treat this learned matrix as a bridge and transfer it to the target domain, where the target domain data are reconstructed by our robust non-negative matrix factorization model with joint sparse and graph regularization (RSGNMF). Third, we apply feature selection to the new sparse representation of the target data. Fourth, we provide novel, efficient iterative algorithms for the RSNMF and RSGNMF models, and give rigorous convergence and correctness analyses for each. Finally, experimental results on both text and image data sets demonstrate that our REGTL model outperforms existing state-of-the-art methods.
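To make the first two steps of the pipeline concrete, the following is a minimal sketch, not the RSNMF or RSGNMF algorithms themselves. It assumes plain Frobenius-norm NMF with the standard Lee-Seung multiplicative updates on the source data, and a graph-regularized update of the GNMF type on the target data with the transferred basis held fixed; the robust loss and the sparse regularizer of the paper are omitted. All function names, the binary nearest-neighbour graph, and the parameters `k`, `p`, and `lam` are illustrative assumptions.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero in the multiplicative updates


def nmf_source(X_s, k, n_iter=200, seed=0):
    """Plain NMF X_s ~ U @ V.T (assumed stand-in for the source-domain step)."""
    rng = np.random.default_rng(seed)
    m, n = X_s.shape
    U = rng.random((m, k))
    V = rng.random((n, k))
    for _ in range(n_iter):
        U *= (X_s @ V) / (U @ (V.T @ V) + EPS)
        V *= (X_s.T @ U) / (V @ (U.T @ U) + EPS)
    return U, V


def knn_graph(X, p=3):
    """Symmetric binary p-nearest-neighbour graph over the columns of X."""
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-matches
    idx = np.argsort(d2, axis=1)[:, :p]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), p), idx.ravel()] = 1.0
    return np.maximum(W, W.T)


def objective(X, U, V, W, lam):
    """||X - U V^T||_F^2 + lam * Tr(V^T L V), with L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.norm(X - U @ V.T) ** 2 + lam * np.trace(V.T @ L @ V)


def gnmf_target(X_t, U, W, lam=0.1, n_iter=200, seed=1):
    """Learn target coefficients V_t with the transferred basis U held fixed."""
    rng = np.random.default_rng(seed)
    V = rng.random((X_t.shape[1], U.shape[1]))
    D = np.diag(W.sum(axis=1))
    UtU = U.T @ U
    XtU = X_t.T @ U
    for _ in range(n_iter):
        V *= (XtU + lam * (W @ V)) / (V @ UtU + lam * (D @ V) + EPS)
    return V


# Usage on synthetic data: both domains share one non-negative basis.
rng = np.random.default_rng(42)
U_true = rng.random((20, 4))
X_source = U_true @ rng.random((4, 30))
X_target = U_true @ rng.random((4, 25))

U_shared, _ = nmf_source(X_source, k=4)       # step 1: learn the bridge matrix
W_target = knn_graph(X_target, p=3)           # neighbourhood graph on target
V_target = gnmf_target(X_target, U_shared, W_target, lam=0.1)  # step 2
```

Holding `U_shared` fixed in the second step is what makes it act as the bridge between domains in this sketch: only the target coefficients are fitted, and the graph term encourages neighbouring target samples to receive similar representations.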
Acknowledgments
We thank the editors and reviewers for their contributions to improving the quality of our paper. We gratefully acknowledge the support of the National Natural Science Foundation of China under Grant Nos. 61075004, 91120301, and 61005003. We also acknowledge the support of the Hunan Provincial Innovation Foundation for Postgraduate.
Cite this article
Yang, S., Hou, C., Zhang, C. et al. Robust non-negative matrix factorization via joint sparse and graph regularization for transfer learning. Neural Comput & Applic 23, 541–559 (2013). https://doi.org/10.1007/s00521-013-1371-5
DOI: https://doi.org/10.1007/s00521-013-1371-5