Lost in Transduction: Transductive Transfer Learning in Text Classification

Authors:
Alejandro Moreo
Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy

Andrea Esuli
Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy

Fabrizio Sebastiani
Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy

ACM Transactions on Knowledge Discovery from Data, Volume 16, Issue 1, Article No. 13, pp. 1–21. https://doi.org/10.1145/3453146

Published: 20 July 2021

Abstract

Obtaining high-quality labelled data for training a classifier in a new application domain is often costly. Transfer Learning (a.k.a. “Inductive Transfer”) tries to alleviate these costs by transferring, to the “target” domain of interest, knowledge available from a different “source” domain. In transfer learning, the lack of labelled information from the target domain is compensated by the availability, at training time, of a set of unlabelled examples from the target distribution. Transductive Transfer Learning denotes the transfer learning setting in which the only set of target documents that we are interested in classifying is known and available at training time. Although this definition is indeed in line with Vapnik’s original definition of “transduction”, current terminology in the field is confused. In this article, we discuss how the term “transduction” has been misused in the transfer learning literature, and propose a clarification consistent with the original characterization of this term given by Vapnik. We go on to observe that this misuse of terminology has brought about misleading experimental comparisons, in which inductive transfer learning methods have been incorrectly compared with transductive transfer learning methods. We then give empirical evidence that the difference in performance between the inductive version and the transductive version of a transfer learning method can indeed be statistically significant (i.e., that knowing at training time the only data one needs to classify indeed gives an advantage). Our clarification allows a reassessment of the field, and of the relative merits of the major, state-of-the-art algorithms for transfer learning in text classification.
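
The distinction drawn in the abstract can be illustrated with a minimal sketch of the two data protocols. This is not the authors' experimental code; the function name `split_protocols`, the dictionary keys, and the half-and-half split of the target documents are assumptions made purely for illustration. In the inductive protocol, the unlabelled target documents seen at training time are disjoint from the documents to be classified; in the transductive protocol, they are the same set.

```python
import random


def split_protocols(source_labelled, target_docs, seed=0):
    """Return (inductive, transductive) views of the same data.

    Hypothetical helper: each view lists what may be used at training
    time and which target documents must be classified afterwards.
    """
    rng = random.Random(seed)
    shuffled = list(target_docs)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2  # assumed split point, for illustration only

    # Inductive: the documents to classify are NOT seen at training time.
    inductive = {
        "train_labelled": source_labelled,
        "train_unlabelled": shuffled[:half],
        "to_classify": shuffled[half:],
    }

    # Transductive: the only documents to classify are exactly the
    # unlabelled target documents available at training time.
    transductive = {
        "train_labelled": source_labelled,
        "train_unlabelled": shuffled,
        "to_classify": shuffled,
    }
    return inductive, transductive
```

Under this sketch, comparing a method evaluated on the transductive view with one evaluated on the inductive view would not be like-for-like, since only the former has seen the documents it is asked to classify.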

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Learning paradigms
      1. Multi-task learning
        Transfer learning
      2. Supervised learning
        Supervised learning by classification
2. Information systems
  1. Information systems applications
    1. Data mining

Published in
ACM Transactions on Knowledge Discovery from Data Volume 16, Issue 1
February 2022
475 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3472794
Editor:
Charu Aggarwal
IBM T. J. Watson Research, USA
Issue’s Table of Contents
Copyright © 2021 Copyright held by the owner/author(s). Publication rights licensed to ACM.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 20 July 2021
- Revised: 1 February 2021
- Accepted: 1 February 2021
- Received: 1 June 2020

Author Tags
Transduction
induction
transfer learning
text classification
distributional hypothesis
Qualifiers
- research-article
- Refereed