Abstract
Deep neural networks have been successfully applied to numerous machine learning tasks because of their impressive feature abstraction capabilities. However, conventional deep networks assume that the training and test data are sampled from the same distribution, and this assumption is often violated in real-world scenarios. To address the domain shift or data bias problems, we introduce layer-wise domain correction (LDC), a new unsupervised domain adaptation algorithm which adapts an existing deep network through additive correction layers spaced throughout the network. Through the additive layers, the representations of source and target domains can be perfectly aligned. The corrections, which are trained via maximum mean discrepancy, adapt to the target domain while increasing the representational capacity of the network. LDC requires no target labels, achieves state-of-the-art performance across several adaptation benchmarks, and requires significantly less training time than existing adaptation methods.
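As a rough illustration of the quantity the corrections are trained on, the following sketch computes a biased estimate of the squared maximum mean discrepancy between two sets of features and applies a simple additive correction to the target features. The Gaussian kernel, the bandwidth, the linear form of the correction, and all function names are assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Guard against tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd2(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two feature samples."""
    k_ss = gaussian_kernel(source, source, bandwidth)
    k_tt = gaussian_kernel(target, target, bandwidth)
    k_st = gaussian_kernel(source, target, bandwidth)
    return k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean()


def additive_correction(h, w, b):
    """Hypothetical additive correction: the input plus a learned linear term.

    With w and b set to zero the correction is the identity, so the
    underlying network's representation is left unchanged.
    """
    return h + h @ w + b


# Toy features: the target sample is the source distribution shifted by 2.
rng = np.random.default_rng(0)
source = rng.normal(size=(200, 4))
target = rng.normal(size=(200, 4)) + 2.0

mmd_before = mmd2(source, target)

# A hand-picked correction that undoes the shift; in practice the
# parameters would be learned by minimizing the MMD, not set by hand.
w = np.zeros((4, 4))
b = np.full(4, -2.0)
mmd_after = mmd2(source, additive_correction(target, w, b))
```

In this toy setting the correction only has to undo a constant shift, so a bias term is enough to bring the two feature distributions close together; no target labels are involved at any point.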
Additional information
Project supported by the National Key R&D Program of China (No. 2016YFB1200203) and the National Natural Science Foundation of China (Nos. 41427806 and 61273233)
Electronic supplementary materials: The online version of this article (https://doi.org/10.1631/FITEE.1700774) contains supplementary materials, which are available to authorized users
About this article
Cite this article
Li, S., Song, Sj. & Wu, C. Layer-wise domain correction for unsupervised domain adaptation. Frontiers Inf Technol Electronic Eng 19, 91–103 (2018). https://doi.org/10.1631/FITEE.1700774
DOI: https://doi.org/10.1631/FITEE.1700774