Abstract
Obtaining high-quality labelled data for training a classifier in a new application domain is often costly. Transfer Learning (a.k.a. “Inductive Transfer”) tries to alleviate these costs by transferring, to the “target” domain of interest, knowledge available from a different “source” domain. In transfer learning, the lack of labelled information from the target domain is compensated for by the availability, at training time, of a set of unlabelled examples from the target distribution. Transductive Transfer Learning denotes the transfer learning setting in which the only set of target documents that we are interested in classifying is known and available at training time. Although this definition is in line with Vapnik’s original definition of “transduction”, current terminology in the field is confused. In this article, we discuss how the term “transduction” has been misused in the transfer learning literature, and propose a clarification consistent with the original characterization of this term given by Vapnik. We go on to observe that this misuse of terminology has brought about misleading experimental comparisons, in which inductive transfer learning methods have been incorrectly compared with transductive transfer learning methods. We then give empirical evidence that the difference in performance between the inductive version and the transductive version of a transfer learning method can indeed be statistically significant (i.e., that knowing at training time the only data one needs to classify indeed gives an advantage). Our clarification allows a reassessment of the field, and of the relative merits of the major state-of-the-art algorithms for transfer learning in text classification.
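The distinction between the two settings can be made concrete with a minimal sketch of how the unlabelled target documents might be partitioned in each case. This is an illustration only, not the experimental protocol of the article: the function name `split_target`, the `holdout_fraction` parameter, and the random split are all hypothetical choices made for the example.

```python
import random
from typing import List, Sequence, Tuple


def split_target(
    target_docs: Sequence[str],
    protocol: str,
    holdout_fraction: float = 0.5,  # hypothetical parameter, not from the article
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Return (unlabelled docs seen at training time, docs to classify).

    transductive: the documents to classify are exactly the unlabelled
                  target documents available at training time.
    inductive:    the learner sees some unlabelled target documents at
                  training time, but is evaluated on other, unseen ones.
    """
    docs = list(target_docs)
    if protocol == "transductive":
        # Same set on both sides: nothing else will ever be classified.
        return docs, list(docs)
    if protocol == "inductive":
        # Assumed here: a random split into disjoint seen/unseen parts.
        random.Random(seed).shuffle(docs)
        cut = int(len(docs) * holdout_fraction)
        return docs[:cut], docs[cut:]
    raise ValueError(f"unknown protocol: {protocol!r}")


if __name__ == "__main__":
    target = [f"doc{i}" for i in range(6)]
    seen, to_classify = split_target(target, "transductive")
    print(sorted(seen) == sorted(to_classify))
    seen, to_classify = split_target(target, "inductive")
    print(set(seen) & set(to_classify))
```

Under this sketch, comparing a method run with the first split against a method run with the second would be the kind of mismatched comparison the abstract warns about, since only the first method has seen the documents it is asked to classify.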
Index Terms
- Lost in Transduction: Transductive Transfer Learning in Text Classification