A general framework for scalable transductive transfer learning

  • Regular paper
  • Published in: Knowledge and Information Systems

Abstract

Transductive transfer learning is a special type of transfer learning problem in which abundant labeled examples are available in the source domain but only unlabeled examples are available in the target domain. It arises naturally in applications such as spam filtering and microblog mining. In this paper, we propose a general framework that solves the problem by mapping the input features of both the source domain and the target domain into a shared latent space while simultaneously minimizing the feature reconstruction loss and the prediction loss. We develop one specific instance of the framework, the latent large-margin transductive transfer learning algorithm, and derive a theoretical bound on its classification loss via Rademacher complexity. We also provide a unified view of several popular transfer learning algorithms under our framework. Experimental results on one synthetic dataset and three application datasets demonstrate the advantages of the proposed algorithm over state-of-the-art alternatives.
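To make the shared-latent-space idea concrete, here is a minimal Python sketch of the kind of joint objective the abstract describes: a shared latent basis reconstructs the features of both domains, while a large-margin (hinge) loss is minimized over the labeled source examples in the latent space. This is an illustration, not the authors' implementation; the function and parameter names (fit_latent_transfer, latent_dim, alpha, lam, lr, n_iters) and the alternating-minimization scheme are assumptions for the sketch.

```python
# Illustrative sketch (NOT the paper's algorithm): jointly minimize
#   ||Xs - Zs D||^2 + ||Xt - Zt D||^2           (feature reconstruction loss)
#   + alpha * hinge(ys, Zs w) + lam * ||w||^2   (large-margin prediction loss)
# by alternating closed-form updates for the codes and basis with subgradient
# steps for the classifier. All hyperparameters are placeholder choices.
import numpy as np

rng = np.random.default_rng(0)

def fit_latent_transfer(Xs, ys, Xt, latent_dim=10, alpha=1.0, lam=0.1,
                        lr=1e-3, n_iters=200):
    n_s, d = Xs.shape
    D = rng.standard_normal((latent_dim, d)) * 0.01  # shared latent basis
    w = np.zeros(latent_dim)                         # linear classifier
    for _ in range(n_iters):
        # Latent codes: ridge regression of each domain onto the basis.
        G = D @ D.T + lam * np.eye(latent_dim)
        Zs = np.linalg.solve(G, D @ Xs.T).T
        Zt = np.linalg.solve(G, D @ Xt.T).T
        # Basis: regularized least-squares fit to both domains jointly,
        # so the latent space is shared between source and target.
        Z = np.vstack([Zs, Zt])
        X = np.vstack([Xs, Xt])
        D = np.linalg.solve(Z.T @ Z + lam * np.eye(latent_dim), Z.T @ X)
        # Classifier: subgradient step on the regularized hinge loss,
        # using only the labeled source examples (target is unlabeled).
        margins = ys * (Zs @ w)
        viol = margins < 1
        grad = lam * w - alpha * (ys[viol, None] * Zs[viol]).sum(0) / n_s
        w -= lr * grad
    return D, w

def predict(X, D, w, lam=0.1):
    # Encode points in the shared latent space, then classify.
    G = D @ D.T + lam * np.eye(D.shape[0])
    Z = np.linalg.solve(G, D @ X.T).T
    return np.sign(Z @ w)
```

Because the target labels are unavailable, the target examples shape the model only through the shared reconstruction term in this sketch; the paper's actual algorithm couples the two losses differently and is the object of the Rademacher-complexity analysis mentioned above.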

Acknowledgments

The work in this paper is sponsored by the US Defense Advanced Research Projects Agency (DARPA) under the Anomaly Detection at Multiple Scales (ADAMS) program, Agreement Number W911NF-11-C-0200. The views and conclusions contained in this document are those of the author(s) and should not be interpreted as representing the official policies, either expressed or implied, of the US Defense Advanced Research Projects Agency or the US Government. The US Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Author information

Corresponding author

Correspondence to Mohammad Taha Bahadori.

About this article

Cite this article

Bahadori, M.T., Liu, Y. & Zhang, D. A general framework for scalable transductive transfer learning. Knowl Inf Syst 38, 61–83 (2014). https://doi.org/10.1007/s10115-013-0647-5
