Weakly Supervised Object Detection Based on Active Learning

Neural Processing Letters

Abstract

Weakly supervised object detection, which reduces the need for strong supervision during training, has recently made significant progress. However, it remains challenging because labeling is still time-consuming and labor-intensive in practical applications. To further reduce the labeling cost, we introduce a new method that fuses weakly supervised learning and active learning in a unified framework for object detection. Weakly supervised learning based on the min-entropy latent model weakens the required annotations to image-level labels, while active learning reduces the number of labeled images. The proposed fusion method effectively reduces the dependency of object detection on manual annotation. In this paper, we introduce three active learning strategies: least confidence sampling, margin sampling, and weighted classification sampling. To validate the effectiveness of each strategy and of different sample compositions in weakly supervised object detection, we conduct extensive experiments. The results show that combining image-level labeling with active learning achieves results comparable to previous state-of-the-art methods at a much lower labeling cost.
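
As a rough illustration of two of the strategies named in the abstract, the sketch below scores unlabeled images by least confidence and by margin, using the textbook definitions from the active learning literature. It is not the authors' implementation: the preview does not give their exact formulation, the function names and the probability values are hypothetical, and weighted classification sampling is left out because the abstract does not define it.

```python
from typing import List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Uncertainty score: 1 minus the probability of the most likely class."""
    return 1.0 - max(probs)


def margin(probs: Sequence[float]) -> float:
    """Gap between the two most likely classes; a small gap means high uncertainty."""
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]


def select_least_confident(pool: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Indices of the `budget` images with the highest least-confidence score."""
    order = sorted(range(len(pool)), key=lambda i: least_confidence(pool[i]), reverse=True)
    return order[:budget]


def select_smallest_margin(pool: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Indices of the `budget` images with the smallest top-two margin."""
    order = sorted(range(len(pool)), key=lambda i: margin(pool[i]))
    return order[:budget]


# Hypothetical image-level class probabilities for four unlabeled images
# (illustrative values only, not taken from the paper).
pool = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.50, 0.48, 0.02],
    [0.70, 0.20, 0.10],
]

lc_picks = select_least_confident(pool, budget=2)
margin_picks = select_smallest_margin(pool, budget=2)
print("least confidence picks:", lc_picks)
print("margin picks:", margin_picks)
```

Both criteria pick out the same two ambiguous images here but rank them differently, which is why the choice of sampling strategy can change which images are sent for annotation under a fixed budget.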

Acknowledgements

This work was supported in part by National Natural Science Foundation of China under Grant 61903018, Science and Technology on Space Intelligent Control Laboratory (No. ZDSYS-2020-04), and Shanghai Aerospace Science and Technology Innovation Fund.

Author information

Corresponding author

Correspondence to Baochang Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, X., Xiang, X., Zhang, B. et al. Weakly Supervised Object Detection Based on Active Learning. Neural Process Lett 54, 5169–5183 (2022). https://doi.org/10.1007/s11063-022-10855-0

  • DOI: https://doi.org/10.1007/s11063-022-10855-0
