Abstract
Transfer learning addresses the practical problem of learning a target task when a large amount of auxiliary data from source domains is available. Despite numerous studies on this topic, few methods rest on a solid theoretical framework and are parameter-free. In this paper, we propose an Extended Minimum Description Length Principle (EMDLP) for feature-based inductive transfer learning, in which both the source and the target data sets contain class labels and relevant features are transferred from the source domain to the target one. Unlike conventional methods, our encoding measure has a theoretical foundation and requires no parameter. To obtain features useful for the target task, we design an enhanced encoding length that adopts a code book storing useful information obtained from the source task. With the code book building the connection between the source and the target tasks, our EMDLP evaluates the result of transfer learning by the sum of the code lengths of five components: the two hypotheses, the two data sets encoded with the help of the hypotheses, and the set of transferred features. The proposed method inherits the desirable property of the MDLP that it carefully evaluates hypotheses and balances their simplicity against their goodness of fit to the data. Extensive experiments on both synthetic and real data sets show that the proposed method achieves higher classification accuracy and is robust against noise.
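To make the five-component objective concrete, the following schematic formula (our own notation for illustration, not taken verbatim from the paper) summarizes the total description length that the EMDLP evaluates, where h_S and h_T denote the source and target hypotheses, D_S and D_T the labeled source and target data sets, and F the set of transferred features:

\[
L_{\mathrm{total}} \;=\; L(h_S) + L(h_T) + L(D_S \mid h_S) + L(D_T \mid h_T) + L(F).
\]

Under the MDL principle, a smaller total code length indicates a better balance between the simplicity of the hypotheses and their fit to the data, so the transfer result that minimizes this sum is preferred.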
Additional information
This work was partially supported by the Grant-in-Aid for Scientific Research (B) 21300053 from the Japanese Ministry of Education, Culture, Sports, Science and Technology, and by the Strategic International Cooperative Program funded by the Japan Science and Technology Agency (JST).
About this article
Cite this article
Shao, H., Tong, B. & Suzuki, E. Extended MDL principle for feature-based inductive transfer learning. Knowl Inf Syst 35, 365–389 (2013). https://doi.org/10.1007/s10115-012-0505-x