Abstract:
Most popular visual trackers for natural scenes rely on handcrafted or deep features to track a target in a video. However, they struggle to produce discriminative feature representations and usually suffer from severe model drift on satellite videos, especially when faced with dim and small targets, low contrast, or interference from similar targets. To overcome these difficulties, we propose a contourlet-based Siamese learning tracker (CSLT), which mainly aims at tracking dim and small objects in satellite videos. In contrast to conventional methods, the contourlet transform (CT) provides rich directional multiresolution information, which is crucial to discriminative feature representation for dim and small targets in satellite video frames that lack distinguishable appearance features. We fuse the multiresolution features with deep features through a spatial-attention fusion strategy and then track the targets with a Siamese network. To further improve accuracy and robustness, a model drift alarm and calibration (MDC) module, which includes a translation drift penalty and a rotation drift penalty, is employed during tracking. We conduct extensive comparisons with 16 popular state-of-the-art trackers on three satellite video datasets. The experimental results validate the effectiveness of the proposed tracker.
Published in: IEEE Transactions on Geoscience and Remote Sensing (Volume: 61)