Abstract
In recent years, convolutional neural networks have received increasing attention from the computer vision and machine learning communities. Because the training and test domains often differ in distribution, tone and brightness, researchers have begun to focus on cross-domain image recognition. In this paper, we propose a Pairwise Generalization Network (PGN) for cross-domain image recognition, in which Instance Normalization and Batch Normalization are combined to strengthen the network's performance in the original domain and its ability to generalize to a new domain. Meanwhile, a Siamese architecture is used in the PGN to learn a discriminative embedding subspace in which positive sample pairs are aligned and negative sample pairs are separated, which works well even with only a few labeled target samples. We also add a residual architecture and an MMD loss to the PGN model to further improve its performance. Extensive experiments on two public benchmarks show that PGN significantly outperforms state-of-the-art methods.
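The abstract names two loss ingredients, a pairwise (Siamese) objective and an MMD term, without giving their formulas. The following is a minimal sketch of what such terms commonly look like, not the authors' implementation: it assumes a standard margin-based contrastive loss and a biased Gaussian-kernel estimate of squared MMD. The function names, the `margin` and `sigma` parameters, and the use of NumPy are all assumptions made for illustration.

```python
import numpy as np


def contrastive_loss(f_src, f_tgt, same_class, margin=1.0):
    """Generic contrastive loss over feature pairs (assumed form, not from the paper).

    f_src, f_tgt: arrays of shape (n_pairs, dim), one embedding per pair member.
    same_class:   array of shape (n_pairs,), 1 for positive pairs, 0 for negative pairs.
    """
    same_class = np.asarray(same_class, dtype=float)
    # Euclidean distance between the two embeddings of each pair.
    d = np.linalg.norm(f_src - f_tgt, axis=1)
    # Positive pairs are pulled together; negative pairs are pushed
    # apart until they are at least `margin` away from each other.
    pos = same_class * d ** 2
    neg = (1.0 - same_class) * np.maximum(0.0, margin - d) ** 2
    return 0.5 * float(np.mean(pos + neg))


def rbf_mmd2(x, y, sigma=1.0):
    """Biased estimate of squared MMD with a Gaussian kernel (assumed kernel choice).

    x, y: arrays of shape (n, dim) and (m, dim) holding features from two domains.
    """
    def kernel(a, b):
        # Pairwise squared distances, clipped at zero against rounding error.
        sq = np.sum(a ** 2, axis=1)[:, None] + np.sum(b ** 2, axis=1)[None, :] - 2.0 * a @ b.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

    return float(kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean())
```

In a training loop, terms like these would typically be added to a classification loss with weighting coefficients. How PGN weights and combines them is not stated in the abstract.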






This work was supported in part by the National Natural Science Foundation of China (Nos. 61872270, 61572357, 61202168), the Opening Foundation of Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, China, and the Tianjin Municipal Natural Science Foundation (No. 18JCYBJC85500).
Cite this article
Liu, Y.B., Han, T.T. & Gao, Z. Pairwise Generalization Network for Cross-Domain Image Recognition. Neural Process Lett 52, 1023–1041 (2020). https://doi.org/10.1007/s11063-019-10041-9