
FI-WSOD: Foreground Information Guided Weakly Supervised Object Detection


Abstract:

Existing solutions for weakly supervised object detection (WSOD) generally follow the multiple instance learning (MIL) paradigm, formulating WSOD as a multi-class classification problem over a set of region proposals. However, without the supervision of ground-truth boxes, the multi-class classification objective drives detectors to devote most of their effort to finding the most common pattern of each class, as the common pattern is always the most discriminative evidence for classification. In addition, although they learn to distinguish multiple foreground classes, the detectors can still fail to differentiate foreground regions from background ones, which causes false alarms in prediction. These two issues account for the limited localization capability of MIL-based WSOD methods. To address them, we propose foreground information guided WSOD (FI-WSOD), a novel framework that introduces an extra foreground-background binary classification (F-BBC) sub-task into the original MIL-based WSOD paradigm. At the training stage, the F-BBC task not only improves the feature representation of the network but also provides extra information from the foreground-background perspective. Leveraging the learnt foreground information, we further propose a Foreground Guided Self-Training (FGST) module that filters out noisy samples and mines representative seeds from the remaining proposals. Moreover, a Multi-Seed Training strategy is used to reduce the impact of noisy labels when training the self-training networks in FGST. We have conducted extensive experiments on the widely used Pascal VOC 2007, Pascal VOC 2012 and MSCOCO datasets, and report a series of state-of-the-art results achieved by the proposed framework.
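As a rough illustration of the ideas named in the abstract, the sketch below combines a two-stream MIL scoring step with a foreground-based proposal filter. It is not the authors' implementation: the two-stream softmax formulation is a common MIL-based WSOD design assumed here for concreteness, and the function names, the `fg_threshold` value, and the top-`k` seed rule are all hypothetical.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mil_image_scores(cls_logits, det_logits):
    """Assumed two-stream MIL head over R proposals and C classes.

    Returns per-proposal scores of shape (R, C) and image-level scores
    of shape (C,), which can be trained with image-level labels only.
    """
    cls = softmax(cls_logits, axis=1)  # normalise over classes, per proposal
    det = softmax(det_logits, axis=0)  # normalise over proposals, per class
    proposal_scores = cls * det
    image_scores = proposal_scores.sum(axis=0)
    return proposal_scores, image_scores

def select_seeds(proposal_scores, fg_prob, image_labels, fg_threshold=0.5, k=2):
    """Hypothetical foreground-guided seed mining.

    Proposals whose foreground probability (from a binary
    foreground-background branch) falls below fg_threshold are dropped;
    for each class present in the image, the k highest-scoring
    remaining proposals are kept as seeds.
    """
    keep = np.flatnonzero(fg_prob >= fg_threshold)
    seeds = {}
    for c in np.flatnonzero(image_labels):
        if keep.size == 0:
            seeds[int(c)] = []
            continue
        order = keep[np.argsort(-proposal_scores[keep, c])]
        seeds[int(c)] = order[:k].tolist()
    return seeds
```

Keeping several seeds per class rather than a single top-scoring proposal loosely mirrors the multi-seed idea of reducing the impact of any one noisy pseudo-label; how the paper actually selects and weights seeds is not specified in the abstract.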
Published in: IEEE Transactions on Multimedia (Volume: 25)
Page(s): 1890 - 1902
Date of Publication: 10 August 2022
