Abstract
Transfer learning is a new machine learning and data mining framework that allows the training and test data to come from different distributions or feature spaces. Many novel applications of machine learning and data mining require transfer learning. While much has been done on transfer learning in text classification and reinforcement learning, there has been a lack of documented success stories of novel applications of transfer learning in other areas. In this invited article, I will argue that transfer learning is in fact quite ubiquitous in many real-world applications. I will illustrate this point through an overview of a broad spectrum of applications of transfer learning, ranging from collaborative filtering to sensor-based location estimation and logical action model learning for AI planning. I will also discuss some potential future directions of transfer learning.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, Q. (2009). Transfer Learning beyond Text Classification. In: Zhou, Z.-H., Washio, T. (eds) Advances in Machine Learning. ACML 2009. Lecture Notes in Computer Science, vol 5828. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05224-8_3
DOI: https://doi.org/10.1007/978-3-642-05224-8_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05223-1
Online ISBN: 978-3-642-05224-8
eBook Packages: Computer Science (R0)