Abstract
Transfer learning is widely used in machine learning when training data is limited. However, class noise accumulated over learning iterations can lead to negative transfer, which degrades performance as more training data is used. In this paper, we propose a novel method to identify noisy samples for noise reduction. More importantly, the method can detect the point at which negative transfer happens, so that transfer learning can terminate near the peak performance point. In this method, we use the sum of Rademacher distributed variables to estimate the class noise rate of the transferred data. Transferred samples with a high probability of being mislabeled are removed to reduce noise accumulation. This negative-sample reduction process can be repeated several times during transfer learning until the point where negative transfer occurs is found. Because our method detects this point, it not only delays the onset of negative transfer but can also stop the transfer learning algorithm at the right place for the best performance gain. Evaluation on a cross-lingual/cross-domain opinion analysis dataset shows that our algorithm achieves state-of-the-art results. Furthermore, our system shows a monotonic upward trend in performance as more training data is used, escaping the performance degradation that afflicts most transfer learning methods once the training data reaches a certain size.
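Since the abstract describes the estimator only at a high level, the following is a minimal Python sketch of the underlying idea rather than the paper's actual implementation: agreement between a committee of classifiers and the pseudo labels of transferred samples is coded as ±1 (Rademacher-style) variables, and a sample is flagged as noisy when its agreement sum falls below a Hoeffding-style tail threshold on such sums. The names `filter_noisy_transfers`, `committee_preds`, and `alpha` are illustrative assumptions, not identifiers from the paper.

```python
import numpy as np


def filter_noisy_transfers(pseudo_labels, committee_preds, alpha=0.05):
    """Flag transferred samples whose pseudo labels are likely wrong.

    Each of the k committee members' agreement with a sample's pseudo
    label is coded as a +/-1 (Rademacher-style) variable. For a sum S
    of k such variables, Hoeffding's inequality gives
        P(S <= E[S] - t) <= exp(-t**2 / (2 * k)),
    so a sample whose agreement sum falls more than t below the
    expected sum is flagged as noise at significance level alpha.
    """
    k, n = committee_preds.shape
    # +1 when a committee member agrees with the pseudo label, -1 otherwise
    signs = np.where(committee_preds == pseudo_labels[None, :], 1, -1)
    sums = signs.sum(axis=0)
    # deviation a clean sample exceeds with probability at most alpha
    t = np.sqrt(2.0 * k * np.log(1.0 / alpha))
    # empirical mean agreement stands in for E[S] under clean labels
    return sums >= sums.mean() - t


# usage: prune likely-mislabeled transfers, then estimate the noise rate
rng = np.random.default_rng(0)
pseudo = rng.integers(0, 2, size=200)                  # pseudo labels
committee = np.stack([np.where(rng.random(200) < 0.9,  # 7 voters with
                               pseudo, 1 - pseudo)     # ~10% disagreement
                      for _ in range(7)])
mask = filter_noisy_transfers(pseudo, committee)
print(f"kept {mask.sum()}/{mask.size} samples, "
      f"estimated noise rate ~ {1.0 - mask.mean():.2f}")
```

In the iterative setting the abstract describes, a filter like this would run after each round of transfer; a sustained rise in the estimated noise rate would then serve as the signal that negative transfer has begun and that the algorithm should stop.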




Acknowledgements
This work was supported by the National Natural Science Foundation of China (61370165, U1636103, 61632011), the National 863 Program of China (2015AA015405), Shenzhen Foundational Research Funding (JCYJ20150625142543470), and the Guangdong Provincial Engineering Technology Research Center for Data Science (2016KF09).
Cite this article
Gui, L., Xu, R., Lu, Q. et al. Negative transfer detection in transductive transfer learning. Int. J. Mach. Learn. & Cyber. 9, 185–197 (2018). https://doi.org/10.1007/s13042-016-0634-8