Abstract:
The success of existing Video Super-Resolution (VSR) methods lies primarily in exploiting temporal and spatial information, typically through feature alignment and reconstruction. However, existing alignment methods are generally based on optical-flow warping, whose linear interpolation can be inaccurate in some cases. Additionally, existing reconstruction modules usually employ stacked residual blocks, which are constrained by the locality inductive bias of CNNs and struggle to model global spatial features. To address these issues, we propose a novel Dual Propagation Spatial-Temporal Attention Network (DPSTAN) for VSR. Specifically, we propose a Coarse-to-Fine Cross-Attention (CFCA) feature-alignment module that incorporates a learnable resampling structure to achieve better alignment. Moreover, we propose a Residual Efficient Extended Channel Attention Block (RE2CAB), which extends the traditional channel attention mechanism to explore spatial information more effectively. Experiments on public datasets demonstrate that our method improves on existing methods in both quantitative metrics and visual quality.
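To make the coarse-to-fine alignment idea concrete, below is a minimal PyTorch sketch of the general pattern the abstract describes: a coarse optical-flow warp followed by a cross-attention refinement in which the reference frame queries the warped neighbor, so resampling weights are learned rather than fixed bilinear ones. The module names (`flow_warp`, `CoarseToFineAlign`) and all design details are illustrative assumptions; the paper's actual CFCA module is not specified in the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def flow_warp(feat: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp a feature map with optical flow via bilinear grid sampling.
    feat: (B, C, H, W); flow: (B, 2, H, W) in pixels, channel 0 = dx, 1 = dy."""
    b, _, h, w = feat.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=feat.device, dtype=feat.dtype),
        torch.arange(w, device=feat.device, dtype=feat.dtype),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0) + flow   # (B, 2, H, W)
    # Normalize sampling coordinates to [-1, 1] as grid_sample expects.
    gx = 2.0 * grid[:, 0] / max(w - 1, 1) - 1.0
    gy = 2.0 * grid[:, 1] / max(h - 1, 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)                      # (B, H, W, 2)
    return F.grid_sample(feat, grid, mode="bilinear", align_corners=True)

class CoarseToFineAlign(nn.Module):
    """Generic coarse-to-fine alignment sketch (an assumption, not the
    paper's CFCA): coarse flow warp, then cross-attention refinement."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, ref, nbr, flow):
        coarse = flow_warp(nbr, flow)              # coarse, interpolation-based alignment
        b, c, h, w = ref.shape
        q = ref.flatten(2).transpose(1, 2)         # (B, H*W, C) queries from reference
        kv = coarse.flatten(2).transpose(1, 2)     # keys/values from warped neighbor
        fine, _ = self.attn(q, kv, kv)             # learned resampling via attention weights
        return fine.transpose(1, 2).reshape(b, c, h, w)
```

Here the attention weights act as a content-adaptive resampling kernel, which is the kind of mechanism that can correct cases where fixed bilinear interpolation in flow warping fails.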
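Likewise, the abstract's RE2CAB is described only as an efficient extension of channel attention inside a residual block. The sketch below shows one well-known efficient variant, ECA-style attention (global pooling plus a 1-D convolution across channels), wrapped in a residual block; the class name `ResidualECABlock` and every detail beyond "residual block with efficient channel attention" are hypothetical stand-ins, not the paper's design.

```python
import torch
import torch.nn as nn

class ResidualECABlock(nn.Module):
    """Residual block with ECA-style efficient channel attention.
    A hedged stand-in for RE2CAB, whose exact 'extended' mechanism
    the abstract does not describe."""

    def __init__(self, channels: int, k: int = 3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # 1-D conv over the pooled channel descriptor: O(C*k) attention cost,
        # versus O(C^2 / r) for a squeeze-and-excitation MLP.
        self.conv1d = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        res = self.body(x)                                # (B, C, H, W)
        y = res.mean(dim=(2, 3))                          # squeeze: (B, C)
        y = self.conv1d(y.unsqueeze(1)).squeeze(1)        # local cross-channel interaction
        w = torch.sigmoid(y).unsqueeze(-1).unsqueeze(-1)  # per-channel weights (B, C, 1, 1)
        return x + res * w                                # residual connection

# Usage: drop-in replacement for a plain residual block in a reconstruction trunk.
block = ResidualECABlock(64)
out = block(torch.randn(2, 64, 32, 32))  # output shape matches the input
```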
Published in: 2024 16th International Conference on Wireless Communications and Signal Processing (WCSP)
Date of Conference: 24-26 October 2024
Date Added to IEEE Xplore: 14 January 2025