Abstract:
Both infrared and visible images have advantages for object detection: infrared images (IRs) capture the thermal characteristics of objects, while visible images provide high spatial resolution and clear texture details. Combining infrared and visible images for object detection therefore offers many benefits, but how to fully exploit the inherent characteristics of these two modalities remains a challenging issue. To address this issue, a deblurring dictionary encoding fusion network (DDFN) is proposed for infrared and visible image object detection. First, a dual-stream feature extraction backbone is constructed to learn features suited to the characteristics of each modality. Then, pooling operations are applied to filter out key information and reduce the complexity of the network. Afterward, a fuzzy compensation module (FCM) is proposed to minimize the information loss incurred by pooling. Finally, a dictionary encoding fusion module (DEFM) is proposed to robustly excavate potential interactions between infrared and visible images; it obtains fusion features by aggregating the local information of infrared features and the long-range dependency information of visible features. The proposed DDFN exhibits excellent performance on two benchmark bimodal datasets and shows superior capability in infrared–visible image object detection.
Published in: IEEE Geoscience and Remote Sensing Letters (Volume 20)
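The fusion principle stated in the abstract, aggregating local information from infrared features and long-range dependency information from visible features, can be sketched minimally in NumPy. This is an illustrative toy, not the paper's DEFM: the function names, the 1-D feature layout, and the use of neighborhood averaging for "local" context and a softmax-attention-style weighting for "long-range" context are all assumptions made for clarity.

```python
import numpy as np


def local_aggregate(feat, k=3):
    # Local context: average each position with its k-neighborhood.
    # Stands in for the local-information aggregation applied to
    # infrared features (illustrative, not the paper's operator).
    n, _ = feat.shape
    out = np.zeros_like(feat)
    for i in range(n):
        lo, hi = max(0, i - k // 2), min(n, i + k // 2 + 1)
        out[i] = feat[lo:hi].mean(axis=0)
    return out


def long_range_aggregate(feat):
    # Global context: softmax-attention-style weighted sum over all
    # positions, standing in for the long-range dependency modeling
    # applied to visible features (again, an assumption).
    scores = feat @ feat.T / np.sqrt(feat.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ feat


def fuse(ir_feat, vis_feat):
    # Fusion feature: concatenate local IR context with globally
    # attended visible context along the channel axis.
    return np.concatenate(
        [local_aggregate(ir_feat), long_range_aggregate(vis_feat)], axis=1
    )


rng = np.random.default_rng(0)
ir = rng.normal(size=(8, 4))    # 8 positions, 4 infrared channels
vis = rng.normal(size=(8, 4))   # 8 positions, 4 visible channels
fused = fuse(ir, vis)
print(fused.shape)  # (8, 8): channels of both streams concatenated
```

The point of the sketch is only the division of labor: one stream contributes neighborhood-level detail, the other contributes image-wide dependencies, and the fusion feature carries both.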