Unifying Local and Global Fourier Features for Image Classification

Abstract:

Over the last decade, Convolutional Neural Networks (CNNs) have become the dominant approach in domains such as computer vision, self-driving cars, medical imaging, and natural language processing. The core operation of a CNN is the convolution layer, which aggregates input features within local windows in a short-range manner and learns relative positions inside each window. For long-range modeling, common CNNs stack many convolutional layers to enlarge the receptive field, which results in high computational cost. Recently, Vision Transformers (ViTs) and their improvements have outperformed CNNs in the rankings of language, vision, and audio research. The main idea of ViTs is that the model can extract short-range and long-range features in a single layer. With this strategy, the network structure of ViTs is simpler than that of CNNs. However, the complexity of ViTs is quadratic in the spatial length of the input feature. In the last year, many methods have been proposed to reduce the cost of ViTs and to bring the complicated designs of CNNs into ViT-based models. Inspired by the properties of ViTs and CNNs, this paper introduces a Local and Global Fourier Network (LGFNet) that jointly learns local and global receptive fields in the frequency domain rather than in the spatial or time domain used by conventional CNNs and ViTs. The input features, local kernels, and global kernels are transformed to the frequency domain through the Fast Fourier Transform. The local features are learned by a convolution between the input feature and the local kernels. Concurrently, matrix multiplication between the input feature and the global kernels is performed to extract low frequencies from the input Fourier feature. Since local and global Fourier features are complementary, LGFNet efficiently fuses this information by a summation operation based on the similarity degrees of the input signals. LGFNet therefore produces a unified representation of the input feature.
To evaluate the effectiveness ...
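
The two-branch idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`local_branch`, `global_branch`, `lgf_block`), the single-channel 2D input, the use of a real FFT, and the plain unweighted sum are all assumptions made for illustration. The abstract does not specify how the similarity-based weighting of the summation is computed, so it is omitted here. The local branch relies on the convolution theorem (a point-wise product of spectra equals a circular convolution in space), and the global branch multiplies the input spectrum by a full-size set of frequency weights.

```python
import numpy as np


def local_branch(x, kernel):
    """Circular convolution of x with a small spatial kernel, computed via the FFT.

    Assumption: the local kernel is defined spatially and zero-padded to the
    feature-map size before being transformed.
    """
    h, w = x.shape
    x_freq = np.fft.rfft2(x)
    # Passing s=(h, w) zero-pads the kernel to the feature-map size.
    k_freq = np.fft.rfft2(kernel, s=(h, w))
    # Convolution theorem: point-wise product in frequency equals
    # circular convolution in the spatial domain.
    return np.fft.irfft2(x_freq * k_freq, s=(h, w))


def global_branch(x, freq_weights):
    """Scale every frequency of x by a weight of shape (H, W // 2 + 1).

    Assumption: the global kernel lives directly in the frequency domain,
    so each weight affects the whole spatial extent of the input.
    """
    h, w = x.shape
    x_freq = np.fft.rfft2(x)
    return np.fft.irfft2(x_freq * freq_weights, s=(h, w))


def lgf_block(x, kernel, freq_weights):
    """Fuse the two branches by plain summation (hypothetical, unweighted)."""
    return local_branch(x, kernel) + global_branch(x, freq_weights)
```

As a usage example under the same assumptions, a global weight array that is zero everywhere except at index `[0, 0]` keeps only the zero-frequency component, so `global_branch` returns the mean of `x` at every position. This is the extreme case of the low-frequency extraction the abstract attributes to the global kernels.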

Published in: 2023 IEEE 32nd International Symposium on Industrial Electronics (ISIE)

Date of Conference: 19-21 June 2023

Date Added to IEEE Xplore: 31 August 2023

DOI: 10.1109/ISIE51358.2023.10227936

Conference Location: Helsinki, Finland
