Abstract:
Recently, learned image compression methods have achieved better rate-distortion performance than traditional, non-learned image compression standards. Some previous methods combine the local modeling capability of CNNs with the long-range attention of Transformers to generate the latent representation. However, they overlook the fact that Transformers tend to learn low-frequency features while CNNs focus on high-frequency features, which results in insufficient fusion of the two structures. In this paper, we propose a novel image compression method based on a Frequency Decomposition Network (FDNet), which processes low-frequency and high-frequency components in different ways. More specifically, FDNet first applies a dynamic frequency filter to adaptively decompose the features into low-frequency and high-frequency components. Because invertible neural networks do not lose information during feature transformation and can be implemented with CNN residual networks, an invertible neural network block (INNB) is used to extract high-frequency local information. FDNet then uses a hybrid attention block (HAB), composed of window-based multi-head self-attention (W-MSA) and channel attention, to extract window-based and global spatial low-frequency information. In addition, previous channel entropy models adopt CNNs to remove high-frequency redundancy in the latent representation, yet low-frequency redundancy also exists between different channels of the latent representation. To address this issue, FDNet further introduces the hybrid attention block into the channel entropy model, where W-MSA and channel attention remove the window-based and global low-frequency redundancy, respectively. Extensive experiments demonstrate that FDNet achieves promising rate-distortion performance on the Kodak, CLIC and Tecnick datasets.
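The two structural ideas in the abstract, splitting features into low- and high-frequency parts and transforming them without information loss, can be illustrated with a toy sketch. This is not the FDNet implementation: the abstract describes a dynamic (adaptive) frequency filter and an INNB, whereas the sketch below assumes a fixed moving-average low-pass filter on a 1-D signal and a simple additive coupling step. All function names (`box_lowpass`, `decompose`, `coupling_forward`, `coupling_inverse`) are hypothetical and chosen only for illustration.

```python
def box_lowpass(x, radius=1):
    """Fixed moving-average low-pass filter with edge replication.

    Assumption: a stand-in for the learned, dynamic filter in the abstract.
    """
    n = len(x)
    out = []
    for i in range(n):
        # Clamp indices so that border samples are replicated.
        window = [x[min(max(j, 0), n - 1)] for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def decompose(x, radius=1):
    """Split x into a smooth (low-frequency) part and a residual (high-frequency) part."""
    low = box_lowpass(x, radius)
    high = [xi - li for xi, li in zip(x, low)]
    return low, high


def coupling_forward(x1, x2, f):
    """Additive coupling: y1 = x1, y2 = x2 + f(x1). Invertible for any f."""
    return list(x1), [b + fb for b, fb in zip(x2, f(x1))]


def coupling_inverse(y1, y2, f):
    """Inverse of coupling_forward: x1 = y1, x2 = y2 - f(y1)."""
    return list(y1), [b - fb for b, fb in zip(y2, f(y1))]


if __name__ == "__main__":
    signal = [1.0, 2.0, 4.0, 8.0, 16.0]
    low, high = decompose(signal)
    # The two parts sum back to the input (up to floating-point rounding).
    print([l + h for l, h in zip(low, high)])

    f = lambda v: [2.0 * t + 1.0 for t in v]  # arbitrary, need not be invertible
    y1, y2 = coupling_forward([1.0, 2.0], [0.5, -0.5], f)
    print(coupling_inverse(y1, y2, f))
```

In this toy setting the decomposition is lossless by construction, since the high-frequency part is defined as the residual, and the coupling step can be undone exactly regardless of how `f` is chosen, which is the property the abstract attributes to invertible neural networks.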
Published in: IEEE Transactions on Circuits and Systems for Video Technology (Volume: 34, Issue: 11, November 2024)