
HDNet: Multi-Modality Hierarchy-Aware Decision Network for RGB-D Salient Object Detection


Abstract:

RGB-D salient object detection (SOD) is a pixel-level dense prediction task that highlights the most prominent objects in a scene. Recently, Convolutional Neural Networks (CNNs) have been widely applied to SOD to generate multi-level features that are complementary to each other. However, most methods ignore the distinct characteristics of multi-level features (high-level versus low-level features). To employ multi-level features effectively, we propose a novel multi-modality hierarchy-aware decision network (HDNet) that embeds a Swin Transformer as its encoder. The proposed HDNet contains three primary designs: (1) a Swin Transformer encoder is employed instead of a CNN to learn long-range dependencies; (2) a hierarchy-aware feature decision mechanism (HFDM) is proposed to exploit the local detail cues of low-level features and the global semantic information of high-level features, and it consists of two sub-modules, a low-hierarchy edge module (LEM) and a high-hierarchy region module (HRM); (3) a decision-based fusion module (DFM) is designed to fuse RGB and depth features according to the hierarchy attributes of the multi-level features produced by HFDM. Experiments on five public benchmarks verify that our framework outperforms 18 state-of-the-art algorithms.
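
The abstract outlines the HDNet pipeline (a Swin Transformer encoder, HFDM with LEM/HRM branches, and DFM fusion) without implementation details. The PyTorch sketch below is a minimal, hypothetical illustration of that pipeline: the internals of LEM, HRM, HFDM, and DFM, the gating choices, and all channel sizes and feature-map shapes are assumptions for illustration only, not the authors' implementation.

```python
import torch
import torch.nn as nn


class LEM(nn.Module):
    """Low-hierarchy edge module (hypothetical): gates low-level features by a spatial edge-like map."""
    def __init__(self, c):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(c, c, 3, padding=1), nn.BatchNorm2d(c), nn.ReLU(inplace=True))
        self.edge_attn = nn.Conv2d(c, 1, 1)

    def forward(self, x):
        f = self.conv(x)
        return f * torch.sigmoid(self.edge_attn(f))  # emphasize local detail/edge cues


class HRM(nn.Module):
    """High-hierarchy region module (hypothetical): gates high-level features by global channel context."""
    def __init__(self, c):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(c, c, 3, padding=1), nn.BatchNorm2d(c), nn.ReLU(inplace=True))
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Conv2d(c, c, 1)

    def forward(self, x):
        f = self.conv(x)
        return f * torch.sigmoid(self.fc(self.pool(f)))  # emphasize global semantic/region cues


class HFDM(nn.Module):
    """Hierarchy-aware feature decision mechanism (hypothetical): routes low-level
    features through LEM and high-level features through HRM."""
    def __init__(self, c, is_low_level):
        super().__init__()
        self.branch = LEM(c) if is_low_level else HRM(c)

    def forward(self, x):
        return self.branch(x)


class DFM(nn.Module):
    """Decision-based fusion module (hypothetical): soft decision between RGB and depth features."""
    def __init__(self, c):
        super().__init__()
        self.gate = nn.Conv2d(2 * c, c, 1)
        self.out = nn.Conv2d(c, c, 3, padding=1)

    def forward(self, rgb, depth):
        g = torch.sigmoid(self.gate(torch.cat([rgb, depth], dim=1)))
        return self.out(g * rgb + (1 - g) * depth)


if __name__ == "__main__":
    # Fabricated feature maps standing in for one low-level and one high-level
    # stage of a Swin Transformer encoder (shapes are assumptions).
    rgb_low, dep_low = torch.randn(1, 96, 56, 56), torch.randn(1, 96, 56, 56)
    rgb_high, dep_high = torch.randn(1, 768, 7, 7), torch.randn(1, 768, 7, 7)

    low_hfdm, high_hfdm = HFDM(96, is_low_level=True), HFDM(768, is_low_level=False)
    low_dfm, high_dfm = DFM(96), DFM(768)

    # Modules are shared across modalities here purely for brevity.
    fused_low = low_dfm(low_hfdm(rgb_low), low_hfdm(dep_low))
    fused_high = high_dfm(high_hfdm(rgb_high), high_hfdm(dep_high))
    print(fused_low.shape, fused_high.shape)  # [1, 96, 56, 56], [1, 768, 7, 7]
```

In this reading, HFDM decides which kind of processing each encoder level receives (edge-oriented for shallow features, region-oriented for deep features), and DFM then makes a per-pixel soft decision between the RGB and depth streams at each level; a decoder (not sketched) would aggregate the fused levels into the final saliency map.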
Published in: IEEE Signal Processing Letters ( Volume: 29)
Page(s): 2577 - 2581
Date of Publication: 15 December 2022


