Abstract
Most existing dehazing methods rely on either learning or statistical priors. Most learning-based methods use convolutional neural networks (CNNs). Because of their inherent characteristics, CNNs have a limited ability to express the interconnections within image information, so CNN-based dehazing networks tend to be structurally complex yet lack robustness. Many prior-based methods fail in some cases because of the limitations of their statistical priors. To address these issues and achieve end-to-end dehazing, a multi-scale Transformer fusion dehazing network (MSTFDN) is proposed, comprising three modules: a multi-scale Transformer fusion module (MSTFM), a feature enhancement module (FEM), and a color restoration module (CRM). MSTFM consists of multi-scale Transformer blocks that capture long-range spatial dependencies in image information. FEM enhances the preceding features and extracts features at different depths. CRM produces the clear image and restores color fidelity. Ablation studies illustrate the effectiveness of each module and are used to select the best multi-scale Transformer combination. Extensive experiments on synthetic and real-world hazy images demonstrate that the proposed method is strongly robust, outperforms state-of-the-art methods in qualitative evaluation, and performs well in quantitative evaluation.
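As a rough illustration of how Transformer blocks at several scales can capture long-range spatial dependencies, the following minimal sketch splits an image into patches of different sizes and applies scaled dot-product self-attention to the resulting tokens. It is not the MSTFDN implementation: the function names, the patch sizes, and the use of identity query/key/value projections are assumptions made for illustration, and the fusion of the scales is left out because the abstract does not specify it.

```python
import math


def split_into_patches(image, patch):
    """Flatten non-overlapping patch x patch blocks of a 2-D grid into tokens."""
    height, width = len(image), len(image[0])
    tokens = []
    for r in range(0, height, patch):
        for c in range(0, width, patch):
            tokens.append(
                [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            )
    return tokens


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections); a trained model would learn these projections.
    """
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output token is a weighted mix of ALL tokens, near or far.
        outputs.append(
            [sum(row[j] * tokens[j][i] for j in range(len(tokens))) for i in range(dim)]
        )
    return outputs, weights


def multi_scale_attention(image, patch_sizes=(1, 2)):
    """Run self-attention independently at each (hypothetical) patch size."""
    return {p: self_attention(split_into_patches(image, p)) for p in patch_sizes}


# Toy 4x4 single-channel "image".
image = [[0.1 * (r * 4 + c) for c in range(4)] for r in range(4)]
for patch, (outputs, weights) in multi_scale_attention(image).items():
    print(patch, len(outputs), len(outputs[0]))
```

Every row of the attention weights is strictly positive and sums to one, so each output token draws on every spatial position regardless of distance, whereas a single convolution only sees its local neighborhood. Larger patches give fewer, coarser tokens; smaller patches give more, finer ones.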
Cite this article
Yang, Y., Zhang, H., Wu, X. et al. MSTFDN: Multi-scale transformer fusion dehazing network. Appl Intell 53, 5951–5962 (2023). https://doi.org/10.1007/s10489-022-03674-2
DOI: https://doi.org/10.1007/s10489-022-03674-2