Abstract:
In recent years, Convolutional Neural Networks (CNNs) and Visual Transformers have shown remarkable performance in image deraining tasks. However, these state-of-the-art (SOTA) methods pair their excellent performance with high computational costs, which hinders analytical comparison between methods and limits their practical application. We argue that the high computational cost stems mainly from an explosion in the number of parameters caused by a surge in feature dimensions. To achieve better results with fewer parameters, we reconstruct the multi-head attention mechanism and the feed-forward network and propose a multi-scale hierarchical Transformer network whose width grows like a pyramid, called CPTransNet. The key idea of CPTransNet is to increase the feature dimension gradually during feature extraction, avoiding the parameter waste caused by a sudden surge in feature dimensions. CPTransNet achieves 33.25 dB PSNR on the classical image-deraining dataset, exceeding the previous state of the art by 0.22 dB PSNR with only 19.4% of its computational cost.
Published in: IEEE Signal Processing Letters (Volume: 30)