
Supervised Representation Learning with Double Encoding-Layer Autoencoder for Transfer Learning

Published: 23 October 2017

Abstract

Transfer learning has attracted considerable attention and interest in the past decade. One crucial research issue in transfer learning is how to find a good representation for instances of different domains such that the divergence between domains can be reduced under the new representation. Recently, deep learning has been proposed to learn more robust or higher-level features for transfer learning. In this article, we adapt the autoencoder technique to transfer learning and propose a supervised representation learning method based on a double encoding-layer autoencoder. The proposed framework consists of two encoding layers: one for embedding and the other for label encoding. In the embedding layer, the distribution distance between the embedded instances of the source and target domains is minimized in terms of KL-divergence. In the label encoding layer, label information of the source domain is encoded using a softmax regression model. Moreover, to empirically explore why the proposed framework works well for transfer learning, we propose a new, effective measure based on the autoencoder to compute the distribution distance between different domains. Experimental results show that the proposed measure better reflects the degree of transfer difficulty and correlates more strongly with the performance of supervised learning algorithms (e.g., logistic regression) than previous measures such as KL-divergence and Maximum Mean Discrepancy. Accordingly, our model incorporates two distribution distance measures to minimize the difference between the source and target domains in the embedded representations. Extensive experiments on three real-world image datasets and one text dataset demonstrate the effectiveness of the proposed method compared with several state-of-the-art baseline methods.
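The abstract describes two objective terms: a KL-divergence between the embedded source- and target-domain distributions, and a softmax regression loss over source labels. The NumPy sketch below illustrates these two terms under stated assumptions; the normalization of mean embeddings into discrete distributions and the symmetrized KL are illustrative choices, not the paper's exact formulation, and all function names are hypothetical.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, numerically stabilized."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) for two discrete distributions given as 1-D arrays."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def embedding_distance(src_embed, tgt_embed):
    """Symmetrized KL between the normalized mean embeddings of the two
    domains. Assumes non-negative activations (e.g., sigmoid outputs of
    an embedding layer), so each mean can be scaled into a distribution."""
    p = src_embed.mean(axis=0)
    p = p / p.sum()
    q = tgt_embed.mean(axis=0)
    q = q / q.sum()
    return kl_divergence(p, q) + kl_divergence(q, p)

def label_encoding_loss(logits, labels, eps=1e-12):
    """Cross-entropy of a softmax regression model over source labels."""
    probs = softmax(logits)
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels] + eps)))
```

In the full model, terms like these would be combined with the autoencoder's reconstruction loss, weighted by trade-off hyperparameters, and minimized jointly with gradient-based optimization.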

References

  1. Somnath Banerjee and Martin Scholz. 2008. Leveraging web 2.0 sources for web content classification. In Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Volume 1. IEEE Computer Society, 300--306. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Yoshua Bengio. 2009. Learning deep architectures for AI. Found. Trends Mach. Learn. 2, 1 (2009), 1--127. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. John Blitzer, Ryan McDonald, and Fernando Pereira. 2006. Domain adaptation with structural correspondence learning. In Proceedings of the 2006 Conference on Empirical Methods on Natural Language Processing. 120--128. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Minmin Chen, Zhixiang Eddie Xu, Kilian Q. Weinberger, and Fei Sha. 2012. Marginalized denoising autoencoders for domain adaptation. In Proceedings of the 29th International Conference on Machine Learning. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Koby Crammer, Mark Dredze, and Fernando Pereira. 2012. Confidence-weighted linear classification for text categorization. J. Mach. Learn. Res. 13, 1 (2012), 1891--1926. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. W. Y. Dai, G. R. Xue, Q. Yang, and Y. Yu. 2007a. Co-clustering based classification for out-of-domain documents. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. W. Y. Dai, Q. Yang, G. R. Xue, and Y. Yu. 2007b. Boosting for transfer learning. In Proceedings of the 24th International Conference on Machine Learning. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Trevor Hastie, Friedman Jerome, and Rob Tibshirani. 2010. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33, 1 (2010), 1.Google ScholarGoogle Scholar
  9. Yaroslav Ganin and Victor Lempitsky. 2015. Unsupervised domain adaptation by backpropagation. In Proceedings of the 32nd International Conference on Machine Learning (ICML’15). 1180--1189. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. J. Gao, W. Fan, J. Jiang, and J. W. Han. 2008. Knowledge transfer via multiple model local structure mapping. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Mingming Gong, Kun Zhang, Tongliang Liu, Dacheng Tao, Clark Glymour, and Bernhard Schölkopf. 2016. Domain adaptation with conditional transferable components. In Proceedings of the 33rd International Conference on Machine Learning. 2839--2848. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Jing Jiang and Chengxiang Zhai. 2007. Instance weighting for domain adaptation in NLP. In Proceedings of the 2007 Conference of the Association for Computational Linguistics. 264--271.Google ScholarGoogle Scholar
  13. Ivor Tsang, Joey Tianyi Zhou, Sinno Jialin Pan, and Yan Yan. 2014. Hybrid heterogeneous transfer learning through deep learning. In Proceedings of the 28th AAAI Conference on Artificial Intelligence. 2213--2220. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Solomon Kullback. 1987. Letter to the editor: The Kullback-Leibler distance (1987).Google ScholarGoogle Scholar
  15. Andrew R. Liddle, Pia Mukherjee, and David Parkinson. 2010. Model selection and multi-model inference. Bayes. Meth. Cosmol. 1 (2010), 79.Google ScholarGoogle Scholar
  16. Tongliang Liu, Dacheng Tao, Mingli Song, and Stephen J. Maybank. 2017. Algorithm-dependent generalization bounds for multi-task learning. IEEE Trans. Pattern Anal. Mach. Intell. 39, 2 (2017), 227--241. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Mingsheng Long, Yue Cao, Jianmin Wang, and Michael I. Jordan. 2015. Learning transferable features with deep adaptation networks. In Proceedings of the International Machine Learning Society (ICML’15). 97--105. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Mingsheng Long, Jianmin Wang, Yue Cao, Jianguang Sun, and Philip S. Yu. 2016. Deep learning of transferable representation for scalable domain adaptation. IEEE Trans. Knowl. Data Min. 28, 8 (2016), 2027--2040.Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Yong Luo, Tongliang Liu, Dacheng Tao, and Chao Xu. 2014. Decomposition-based transfer distance metric learning for image classification. IEEE Trans. Image Process. 23, 9 (2014), 3789--3801.Google ScholarGoogle ScholarCross RefCross Ref
  20. Cope Mallah and Orwell. 2013. Plant leaf classification using probabilistic integration of shape, texture and margin features. Signal Processing, Pattern Recognition and Applications (2013).Google ScholarGoogle Scholar
  21. S. J. Pan, J. T. Kwok, and Q. Yang. 2008. Transfer learning via dimensionality reduction. In Proceedings of the 23rd AAAI Conference on Artificial Intelligence. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Sinno Jialin Pan, Ivor W. Tsang, James T. Kwok, Qiang Yang, and others. 2011. Domain adaptation via transfer component analysis. IEEE Trans. Neural Networks 22, 2 (2011), 199--210. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Sinno Jialin Pan and Qiang Yang. 2010. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22, 10 (2010), 1345--1359. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Christopher Poultney, Sumit Chopra, Yann L. Cun, and others. 2006. Efficient learning of sparse representations with an energy-based model. In Advances in Neural Information Processing Systems. 1137--1144. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Si Si, Dacheng Tao, and Bo Geng. 2010. Bregman divergence-based regularization for transfer subspace learning. IEEE Trans. Knowl. Data Eng. 22, 7 (2010), 929--942. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Jan Snyman. 2005. Practical Mathematical Optimization: An Introduction to Basic Optimization Theory and Classical and New Gradient-based Algorithms. Vol. 97. Springer Science 8 Business Media.Google ScholarGoogle Scholar
  27. Eric Tzeng, Judy Hoffman, Trevor Darrell, and Kate Saenko. 2015. Simultaneous deep transfer across domains and tasks. In Proceedings of the IEEE International Conference on Computer Vision. 4068--4076. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Bengio Vincent Larochelle and Manzagol. 2008. Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning. ACM, 1096--1103. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Pascal Vincent, Hugo Larochelle, Isabelle Lajoie, Yoshua Bengio, and Pierre-Antoine Manzagol. 2010. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11 (2010), 3371--3408. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Antoine Xavier and Bengio. 2011. Domain adaptation for large-scale sentiment classification: A deep learning approach. In Proceedings of the 28th International Conference on Machine Learning. 513--520. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. D. K. Xing, W. Y. Dai, G. R. Xue, and Y. Yu. 2007. Bridged refinement for transfer learning. In Proceedings of the 10th Pacific Asia Knowledge Discovery and Data Mining.Google ScholarGoogle Scholar
  32. Fuzhen Zhuang, Xiaohu Cheng, Ping Luo, Sinno Jialin Pan, and Qing He. 2015. Supervised representation learning: Transfer learning with deep autoencoders. In Proceedings of 24th International Joint Conference on Artificial Intelligence. 4119--4125. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Fuzhen Zhuang, Xiaohu Cheng, Sinno Jialin Pan, Wenchao Yu, Qing He, and Zhongzhi Shi. 2014. Transfer learning with multiple sources via consensus regularized autoencoders. In Machine Learning and Knowledge Discovery in Databases. Springer, 417--431. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Fuzhen Zhuang, Ping Luo, Hui Xiong, Yuhong Xiong, Qing He, and Zhongzhi Shi. 2010. Cross-domain learning from multiple sources: A consensus regularization perspective. IEEE Trans. Knowl. Data Eng. 22, 12 (2010), 1664--1678. Google ScholarGoogle ScholarDigital LibraryDigital Library


          • Published in

          ACM Transactions on Intelligent Systems and Technology, Volume 9, Issue 2
          Regular Papers
          March 2018
          191 pages
          ISSN: 2157-6904
          EISSN: 2157-6912
          DOI: 10.1145/3154791
          • Editor: Yu Zheng

          Copyright © 2017 ACM


          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 23 October 2017
          • Revised: 1 June 2017
          • Accepted: 1 June 2017
          • Received: 1 January 2017


          Qualifiers

          • research-article
          • Research
          • Refereed
