Boosting unsupervised domain adaptation: A Fourier approach
Introduction
Deep learning has made significant progress in various vision tasks, such as object detection [1], [2] and semantic segmentation [3], [4]. High-quality training data are required to achieve these performance gains. However, in practical scenarios, manually labeling sufficient training data frequently requires considerable manpower and resources. Another disadvantage of deep neural networks is that they generalize poorly to new datasets because of the problem of domain shift [5], [6].
To solve the problem of domain shift, unsupervised domain adaptation (UDA) [7], [8], [9] is typically used as an effective method. UDA methods fall into two main types. The first is discrepancy-based [10], [11], [12], which primarily aims to align the domain distributions by minimizing a well-designed statistical metric. The second is adversarial-based [13], [14], which trains a domain discriminator to distinguish between the two domains while a feature extractor learns to confuse that discriminator. However, both discrepancy-based and adversarial-based methods feed the original image directly into the model and ignore any processing of the image itself.
To address the aforementioned problems, in this paper we adopt a Fourier approach to boost the performance of Unsupervised Domain Adaptation, dubbed FUDA. Our motivation comes from a well-known property of the Fourier transformation [15], [16], [17], [18]: the phase component of the Fourier spectrum preserves the high-level semantics of the original signal, while the amplitude component contains low-level statistics. For better understanding, Fig. 1 and Fig. 2 present examples of images reconstructed from only the amplitude information and from only the phase information, together with the original images. From Fig. 2, we find that different images have different amplitude components. Meanwhile, from Fig. 1, we find that the phase is mainly related to the semantic information of the image. Based on these observations, FDA [19] recently introduced a Fourier-based method for domain adaptation. It proposes a simple image translation strategy that replaces the amplitude spectrum of a source image with that of a random target image. By simply training on the amplitude-transferred source images, this method achieves remarkable performance. Inspired by this work, we further explore Fourier-based methods for domain adaptation with an approach that consists of two components: a Fourier transform and Fourier channel attention. (1) Fourier transform: we extract the amplitude of the target domain and fuse it with the amplitude of the source domain. We find that the resulting augmented image captures the color and style information of the target domain, as shown in Fig. 2. Thus, we fuse the amplitudes of the two domains and generate augmented source-domain images that move towards the target domain by applying the inverse Fourier transform. (2) Fourier channel attention: to focus effectively on the core information of the features, we propose to use Fourier transform channel attention instead of the typical attention based on global average pooling (GAP), so as to better capture rich input pattern information.
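The amplitude fusion step described in (1) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fuse_amplitude`, the linear interpolation rule, and the mixing weight `lam` are assumptions, since the text here does not specify the exact fusion rule. The sketch keeps the source phase (semantics) and blends in the target amplitude (style).

```python
import numpy as np

def fuse_amplitude(source, target, lam=0.5):
    """Blend the target amplitude into the source image, keeping the source phase.

    source, target: float arrays of identical shape (H, W, C).
    lam: hypothetical mixing weight in [0, 1]; lam=0 returns the source unchanged.
    """
    # 2D FFT over the spatial axes, computed independently per channel.
    fft_src = np.fft.fft2(source, axes=(0, 1))
    fft_tgt = np.fft.fft2(target, axes=(0, 1))

    # Split the source spectrum into amplitude and phase; only the
    # amplitude of the target is needed.
    amp_src, pha_src = np.abs(fft_src), np.angle(fft_src)
    amp_tgt = np.abs(fft_tgt)

    # Assumed fusion rule: linear interpolation between the two amplitudes.
    amp_mix = (1.0 - lam) * amp_src + lam * amp_tgt

    # Recombine the fused amplitude with the source phase and invert.
    fused = amp_mix * np.exp(1j * pha_src)
    return np.real(np.fft.ifft2(fused, axes=(0, 1)))

# Toy usage with random arrays standing in for a source and a target image.
rng = np.random.default_rng(0)
src = rng.random((8, 8, 3))
tgt = rng.random((8, 8, 3))
aug = fuse_amplitude(src, tgt, lam=0.5)
print(aug.shape)
```

Setting `lam=1.0` in this sketch corresponds to the full amplitude replacement attributed to FDA [19] above, while intermediate values yield a partial style transfer.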
Notably, our proposed FUDA is a versatile approach that can be incorporated into a large number of existing UDA methods. In the experiment section, we incorporate FUDA into a current state-of-the-art UDA method, SCDA [20], on multiple cross-domain benchmarks to verify the effectiveness of the proposed approach. On four widely used benchmarks, namely Office-31, Office-Home, VisDA-2017 and DomainNet, comprehensive experiments validate that FUDA can substantially boost the performance of existing UDA algorithms.
The contributions of this paper are summarized as follows:
- We leverage the Fourier approach to boost the performance of Unsupervised Domain Adaptation (UDA) and to address the domain shift problem in UDA.
- We reveal that fusing the amplitude of the target domain into the source domain can capture the style information of the target domain, and thus develop a new Fourier transform to augment the source domain and improve the performance of UDA.
- We propose a Fourier transform channel attention mechanism that can capture rich input pattern information and is more suitable for UDA.
- We conduct extensive experiments to verify the proposed FUDA, which achieves new SOTA performance on four standard domain adaptation benchmarks.
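To make the contrast between GAP-based and Fourier-based channel attention more concrete, the sketch below relies on one known fact: the GAP of a feature map equals the zero-frequency (DC) coefficient of its 2D discrete Fourier transform divided by the number of spatial positions. Everything else is an assumption for illustration only: the function names, the choice of averaging the `k x k` lowest-frequency amplitudes, and the squeeze-and-excitation style gating with random weights `w1` and `w2`. The exact attention design of FUDA is not given in this text.

```python
import numpy as np

def gap_descriptor(feat):
    """Global average pooling: one scalar per channel. feat has shape (C, H, W)."""
    return feat.mean(axis=(1, 2))

def fourier_descriptor(feat, k=2):
    """Hypothetical descriptor: mean amplitude of the k x k lowest frequencies."""
    _, h, w = feat.shape
    spec = np.fft.fft2(feat, axes=(1, 2))
    low = np.abs(spec[:, :k, :k]) / (h * w)  # normalize like an average
    return low.mean(axis=(1, 2))

def channel_attention(feat, descriptor, w1, w2):
    """Squeeze-and-excitation style gating driven by the given descriptor."""
    z = descriptor(feat)                          # (C,)
    hidden = np.maximum(w1 @ z, 0.0)              # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid, one weight per channel
    return feat * gate[:, None, None]

rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 6, 6))

# GAP coincides with the normalized DC coefficient of the spectrum.
dc = np.fft.fft2(feat, axes=(1, 2))[:, 0, 0].real / (6 * 6)
print(np.allclose(dc, gap_descriptor(feat)))

# Assumed random weights for the two-layer gating (reduction from 4 to 2 channels).
w1 = rng.standard_normal((2, 4))
w2 = rng.standard_normal((4, 2))
out = channel_attention(feat, fourier_descriptor, w1, w2)
print(out.shape)
```

With `k=1` the Fourier descriptor in this sketch reduces to the absolute value of GAP, so larger `k` can be read as adding frequency components that GAP discards.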
Related work
Fourier-based Method. The Fourier transform has wide applications in the field of machine learning [21]. Several works have shown that the amplitude mainly carries the low-level information of an image, such as its color and style, whereas the phase primarily carries the high-level information, such as the object in the image. [19] introduced the Fourier transform perspective into domain adaptation for the first time and trained the model by simply replacing
Methodology
In unsupervised domain adaptation, we have two domains: a labeled source domain, in which every sample comes with a corresponding label, and an unlabeled target domain. The source domain and the target domain share the same label space; however, their data probability distributions are not the same. When the model trained on the source domain is directly used on the target domain, the performance is often degraded owing to the difference in the
Benchmarks and experimental settings
Office-31 [33] consists of images of office objects collected from three sources: Amazon (A), Webcam (W) and DSLR (D). It contains 4,110 images in 31 categories shared by the three domains. To test our FUDA, we construct all six domain adaptation tasks, i.e., A → W, …, A → D
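The six Office-31 tasks are simply all ordered source/target pairs of the three domains. A short sketch of that enumeration (the variable names are illustrative only):

```python
from itertools import permutations

# The three Office-31 domains: Amazon, Webcam, DSLR.
domains = ["A", "W", "D"]

# Every ordered (source, target) pair with distinct domains is one task.
tasks = [f"{src} -> {tgt}" for src, tgt in permutations(domains, 2)]
print(len(tasks), tasks)
```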
Office-Home [34], released in 2017, contains 65 object categories and is mainly used for research in the field of domain adaptation. It includes Artistic images (A), Clip Art (C), Product images (P) and
Conclusion
We have proposed a simple method for domain alignment that can be easily integrated into a learning system that transforms unsupervised domain adaptation into supervised domain adaptation. Choosing a suitable attention mechanism also matters, which is why we propose a Fourier channel attention paradigm.
We found that our method, despite its simplicity, outperforms both the baseline and the current state of the art, which is considerably more complex. This suggests that a fast Fourier transform can
CRediT authorship contribution statement
Mengzhu Wang: Conceptualization, Methodology, Software. Shanshan Wang: Visualization, Investigation. Ye Wang: Data curation. Wei Wang: Data curation, Writing – original draft. Tianyi Liang: Software, Validation. Junyang Chen: Supervision. Zhigang Luo: Supervision.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (NSFC) under Grant No. 62106003 and the University Synergy Innovation Program of Anhui Province (GXXT-2021-005).
References (55)
- et al., Reducing bi-level feature redundancy for unsupervised domain adaptation, Pattern Recognit. (2023)
- et al., BP-triplet net for unsupervised domain adaptation: A Bayesian perspective, Pattern Recognit. (2023)
- et al., Rapid and quantitative detection of the microbial spoilage of beef by Fourier transform infrared spectroscopy and machine learning, Anal. Chim. Acta (2004)
- et al., Unsupervised domain adaptation for person re-identification with iterative soft clustering, Knowl.-Based Syst. (2021)
- E. Xie, J. Ding, W. Wang, X. Zhan, H. Xu, P. Sun, Z. Li, P. Luo, Detco: Unsupervised contrastive learning for object...
- et al., A trainable system for object detection, Int. J. Comput. Vis. (2000)
- et al., A review of semantic segmentation using deep neural networks, Int. J. Multimed. Inf. Retr. (2018)
- J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, in: Proceedings of the IEEE...
- et al., Analysis of representations for domain adaptation, Adv. Neural Inf. Process. Syst. (2007)
- et al., A theory of learning from different domains, Mach. Learn. (2010)
- Interbn: Channel fusion for adversarial unsupervised domain adaptation
- TFC: Transformer fused convolution for adversarial domain adaptation, IEEE Trans. Comput. Soc. Syst.
- Unsupervised domain adaptation by backpropagation
- Semantic data augmentation based distance metric learning for domain generalization
- Conditional adversarial domain adaptation
- Domain-adversarial training of neural networks, J. Mach. Learn. Res.
- The importance of phase in signals, Proc. IEEE
- Phase in speech and pictures
- Structural sparseness and spatial phase alignment in natural scenes, J. Opt. Soc. Amer. A
- A demonstration of the visual importance and flexibility of spatial-frequency amplitude and phase, Perception
- RDA: Robust domain adaptation via Fourier adversarial attacking
- A survey of transfer learning, J. Big Data
- Deep domain confusion: Maximizing for domain invariance
1 These authors contributed to the work equally and should be regarded as co-first authors.