Abstract
A major assumption in traditional machine learning is that the training and testing data lie in the same feature space and follow the same distribution. In many real-world applications, however, this assumption does not hold. In recent years, transfer learning has emerged as a learning paradigm for coping with this challenge: it exploits previously learnt knowledge, leveraging information from an old source domain to help learning in a new target domain. In this work, we integrate a knowledge-leverage-based transfer learning mechanism with a rank-based Reduce Error ensemble selection approach to accomplish the transfer learning task; we call the resulting algorithm RankRE-TL. Ensemble selection is important for improving both the efficiency and the predictive accuracy of an ensemble system. It aims to select a proper subset of the whole ensemble, and this subset usually outperforms the whole ensemble. We therefore modify the Reduce Error (RE) pruning technique and design a new Rank-based Reduce Error ensemble selection method (RankRE) for the transfer learning task. The idea behind RankRE is to find the candidate classifier that is expected to improve the classification performance of the extended subensemble the most. In RankRE-TL, the initial Support Vector Machine (SVM) ensemble is learnt through dynamic regrouping of the training dataset. In addition, a new method of constructing the validation set is designed for RankRE-TL, which differs from the method used in the conventional ensemble selection paradigm.
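To make the selection idea concrete, the following is a minimal sketch of classic Reduce Error style greedy forward selection, not the RankRE method itself: the abstract does not specify the rank-based modification or the validation-set construction, so neither is modelled here. The sketch assumes that each classifier's predictions on a validation set are already available, that the subensemble is combined by majority voting, and that ties are broken toward the smallest label. All function names and the toy data are hypothetical.

```python
from collections import Counter


def majority_vote(predictions, members):
    """Majority-vote label for each validation sample over the given members."""
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        counts = Counter(predictions[m][i] for m in members)
        # Assumption: ties go to the smallest label so results are deterministic.
        voted.append(max(sorted(counts), key=lambda label: counts[label]))
    return voted


def ensemble_error(predictions, members, y_true):
    """Fraction of validation samples misclassified by the voted subensemble."""
    voted = majority_vote(predictions, members)
    return sum(v != t for v, t in zip(voted, y_true)) / len(y_true)


def reduce_error_selection(predictions, y_true, subensemble_size):
    """Greedily add the classifier that minimises the validation error
    of the extended subensemble, until the requested size is reached."""
    selected = []
    remaining = list(range(len(predictions)))
    while remaining and len(selected) < subensemble_size:
        # On equal error, min() keeps the candidate with the lowest index.
        best = min(
            remaining,
            key=lambda c: ensemble_error(predictions, selected + [c], y_true),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: four classifiers, six validation samples.
y_val = [0, 1, 0, 1, 0, 1]
val_predictions = [
    [0, 1, 0, 1, 1, 0],  # classifier 0
    [0, 1, 0, 0, 0, 1],  # classifier 1
    [1, 1, 0, 1, 0, 1],  # classifier 2
    [1, 0, 1, 0, 1, 0],  # classifier 3
]
chosen = reduce_error_selection(val_predictions, y_val, subensemble_size=3)
print("selected classifiers:", chosen)
print("subensemble error:", ensemble_error(val_predictions, chosen, y_val))
print("full ensemble error:", ensemble_error(val_predictions, [0, 1, 2, 3], y_val))
```

On this toy data the selected three-member subensemble makes fewer validation errors than the full four-member ensemble, which illustrates why a proper subset can outperform the whole.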
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grant no. 61473150.
Cite this article
Li, M., Dai, Q. A novel knowledge-leverage-based transfer learning algorithm. Appl Intell 48, 2355–2372 (2018). https://doi.org/10.1007/s10489-017-1084-z
DOI: https://doi.org/10.1007/s10489-017-1084-z