Abstract
Transfer learning has attracted considerable attention over the past decade. A crucial research issue in transfer learning is how to find a good representation for instances of different domains such that the divergence between domains is reduced under the new representation. Recently, deep learning has been applied to learn more robust, higher-level features for transfer learning. In this article, we adapt the autoencoder technique to transfer learning and propose a supervised representation learning method based on a double encoding-layer autoencoder. The proposed framework consists of two encoding layers: one for embedding and the other for label encoding. In the embedding layer, the distribution distance between the embedded instances of the source and target domains is minimized in terms of KL-divergence. In the label encoding layer, label information of the source domain is encoded using a softmax regression model. Moreover, to empirically explore why the proposed framework works well for transfer learning, we propose a new, effective autoencoder-based measure of the distribution distance between domains. Experimental results show that the proposed measure better reflects the degree of transfer difficulty and correlates more strongly with the performance of supervised learning algorithms (e.g., logistic regression) than previous measures such as KL-divergence and Maximum Mean Discrepancy. Accordingly, our model incorporates both distribution distance measures to minimize the difference between the source and target domains in the embedding representations. Extensive experiments conducted on three real-world image datasets and one text dataset demonstrate the effectiveness of our proposed method compared with several state-of-the-art baseline methods.
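The objective outlined above, combining reconstruction through a shared embedding layer, KL-divergence between the embedded source and target distributions, and softmax regression on source labels, can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the tied-weight decoder, the normalization of mean hidden activations into probability vectors, the symmetrized KL term, and the trade-off weights `alpha` and `beta` are all assumptions made for the sake of a self-contained example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    # KL(P || Q) between two discrete probability vectors.
    p, q = p + eps, q + eps
    return float(np.sum(p * np.log(p / q)))

def dual_encoder_loss(Xs, ys, Xt, W1, b1, W2, b2, alpha=1.0, beta=1.0):
    """Illustrative training objective: reconstruction error
    + KL-divergence between embedded domain distributions
    + softmax (cross-entropy) loss on the labeled source domain."""
    # Embedding layer, shared by both domains.
    Hs = sigmoid(Xs @ W1 + b1)
    Ht = sigmoid(Xt @ W1 + b1)
    # Reconstruction term (tied decoder weights W1.T, no decoder bias,
    # purely for brevity of the sketch).
    recon = (np.mean((sigmoid(Hs @ W1.T) - Xs) ** 2)
             + np.mean((sigmoid(Ht @ W1.T) - Xt) ** 2))
    # Symmetrized KL-divergence between the mean hidden activations
    # of the two domains, each normalized into a distribution.
    ps = Hs.mean(axis=0); ps /= ps.sum()
    pt = Ht.mean(axis=0); pt /= pt.sum()
    kl = kl_divergence(ps, pt) + kl_divergence(pt, ps)
    # Label-encoding layer: softmax regression on source embeddings.
    probs = softmax(Hs @ W2 + b2)
    ce = -np.mean(np.log(probs[np.arange(len(ys)), ys] + 1e-12))
    return recon + alpha * kl + beta * ce
```

In a full training loop, the parameters `W1, b1, W2, b2` would be updated by gradient descent on this combined loss, so that the learned embedding simultaneously reconstructs both domains, aligns their distributions, and stays discriminative for the source labels.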