Abstract:
Image fusion synthesizes a new image from multiple images of the same scene. The synthesized image should be suitable for human visual perception and follow-up high-level...Show MoreMetadata
Abstract:
Image fusion synthesizes a new image from multiple images of the same scene. The synthesized image should be suitable for human visual perception and follow-up high-level image-processing tasks. However, existing methods focus on fusing low-level features, ignoring high-level semantic perception information. We propose a new end-to-end model to obtain a more semantically consistent image in infrared and visible image fusion, termed semantic-supervised dual-discriminator generative adversarial network (SDDGAN). In particular, we design an information quantity discrimination (IQD) block to guide fusion progress. For each source image, the block determines the weight for preserving each semantic object’s feature. By this way, the generator learns to fuse various semantic objects via different weights to preserve their characteristics. Moreover, the dual discriminator is employed to identify the distribution of infrared and visible information in the fused image. Each discriminator acts on a certain modality (infrared/visible) of different semantic objects in the fused image to preserve and enhance their modality features. Thus, our fused image is more informative. Both the thermal radiation in the infrared image and the visible image texture details can be well preserved. Qualitative and quantitative experiments demonstrate the superiority of our SDDGAN over state-of-the-art methods in terms of visual effects, efficiency, and quantitative metrics.
Published in: IEEE Transactions on Multimedia ( Volume: 25)