Multi-scale energy optimization for object proposal generation

Wang, Congchao; Yang, Jufeng; Wang, Kai; Lai, Shang-Hong

doi:10.1007/s11042-016-3616-7

Published: 23 May 2016

Multimedia Tools and Applications, volume 76, pages 10481–10499 (2017)

Abstract

In this paper, we present an object proposal generation method that applies energy optimization to superpixel merging in a multi-scale framework in order to generate candidate object locations in an image. Because images in object detection datasets are highly diverse, we adopt two different energy functions at multiple scales. Our method therefore combines the strength of global search, which is good at locating salient objects because it considers the whole image at each merge iteration, with the strength of local search, which is more likely to recall non-salient instances. Moreover, unlike most superpixel merging algorithms, which rely on diversified segmentation results, our approach takes advantage of robust edge detection and segments each image only once, which greatly reduces the number of proposals. Experiments on the PASCAL VOC 2007 test set show that the proposed method outperforms most previous superpixel-merging-based methods and is competitive with state-of-the-art proposal generators.
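
To make the idea of proposal generation by superpixel merging concrete, the following is a minimal sketch of generic greedy region merging. It is not the energy optimization of this paper, whose energy functions are not given in the abstract. The function names, the `cost` dictionary of boundary costs between adjacent regions, and the toy values are all assumptions made for illustration; in an edge-based setting the costs might be derived from an edge detector's response along the shared boundary.

```python
from typing import Dict, FrozenSet, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def union_box(a: Box, b: Box) -> Box:
    """Smallest box enclosing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def greedy_merge(
    boxes: Dict[int, Box], cost: Dict[FrozenSet[int], float]
) -> List[Box]:
    """Repeatedly merge the adjacent pair with the lowest boundary cost.

    Hypothetical helper: `boxes` maps a region id to its bounding box and
    `cost` maps a pair of adjacent region ids to a boundary cost. Every
    initial region and every merged region contributes one proposal box.
    """
    boxes = dict(boxes)  # work on copies so the inputs are left untouched
    cost = dict(cost)
    proposals = list(boxes.values())
    next_id = max(boxes) + 1
    while cost:
        pair = min(cost, key=cost.get)  # cheapest boundary in the whole image
        a, b = pair
        merged = union_box(boxes.pop(a), boxes.pop(b))
        boxes[next_id] = merged
        proposals.append(merged)
        # Re-attach the neighbours of a and b to the merged region,
        # keeping the lower cost when both were adjacent to the same region.
        rewired: Dict[FrozenSet[int], float] = {}
        for edge, c in cost.items():
            if edge == pair:
                continue
            rest = edge - pair
            if len(rest) == 1:
                edge = frozenset(rest | {next_id})
            rewired[edge] = min(c, rewired.get(edge, float("inf")))
        cost = rewired
        next_id += 1
    return proposals


# Toy example: three regions in a row, with a weak boundary between 0 and 1.
boxes = {0: (0, 0, 10, 10), 1: (10, 0, 20, 10), 2: (20, 0, 30, 10)}
cost = {frozenset({0, 1}): 0.1, frozenset({1, 2}): 0.5}
proposals = greedy_merge(boxes, cost)
```

In this toy run the three initial boxes are followed by the box of regions 0 and 1 merged, and finally by the box covering all three regions. Because the image is segmented only once, the number of proposals is bounded by the number of initial regions plus the number of merges.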

Notes

Intersection-over-Union (IoU) measures the overlap between a candidate box and the ground-truth box as the ratio of the area of their intersection to the area of their union.
In practice we set 𝜖_e = 0.05.
In practice we set 𝜖_s = 0.1.
Here we use the fast version of [41], which performs better than their Quality version with fewer proposals.
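
The Intersection-over-Union measure defined in the first note can be sketched as follows. This is a minimal illustration that assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max) in continuous coordinates; the function name `iou` and the box convention are assumptions, not taken from the paper.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Area of intersection divided by area of union of two boxes."""
    # Width and height of the overlap, clamped to zero when boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Two degenerate (zero-area) boxes have no meaningful overlap.
    return inter / union if union > 0 else 0.0


candidate = (0.0, 0.0, 2.0, 2.0)
ground_truth = (1.0, 1.0, 3.0, 3.0)
overlap = iou(candidate, ground_truth)  # intersection 1, union 7
```

A proposal is usually counted as covering a ground-truth object when this value reaches a chosen threshold; 0.5 is the conventional PASCAL VOC detection criterion.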

References

Achanta R, Shaji A, Smith K, Lucchi A, Fua P, Susstrunk S (2012) SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans Pattern Anal Mach Intell 34(11):2274–2282
Alexe B, Deselaers T, Ferrari V (2010) What is an object? IEEE Conference on Computer Vision and Pattern Recognition (CVPR):73–80
Alexe B, Deselaers T, Ferrari V (2012) Measuring the objectness of image windows. IEEE Trans Pattern Anal Mach Intell 34(11):2189–2202
Arbelaez P, Pont-Tuset J, Barron J, Marques F, Malik J (2014) Multiscale combinatorial grouping. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):328–335
Branson S, Beijbom O, Belongie S (2013) Efficient large-scale structured learning. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):1806–1813
Bruce N, Tsotsos J (2006) Saliency based on information maximization. Advances in Neural Information Processing Systems (NIPS):155–162
Carreira J, Sminchisescu C (2012) CPMC: Automatic object segmentation using constrained parametric min-cuts. IEEE Trans Pattern Anal Mach Intell 34(7):1312–1328
Cheng MM, Mitra NJ, Huang X, Torr PH, Hu SM (2015) Global contrast based salient region detection. IEEE Trans Pattern Anal Mach Intell 37(3):569–582
Cheng MM, Zhang Z, Lin WY, Torr PHS (2014) BING: Binarized Normed gradients for objectness estimation at 300fps. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3286–3293
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):886–893
Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L. ImageNet large scale visual recognition competition 2012 (ILSVRC2012). http://www.image-net.org/challenges/LSVRC/2012/
Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) ImageNet: A large-scale hierarchical image database. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):248–255
Dollár P, Zitnick CL (2013) Structured forests for fast edge detection. IEEE International Conference on Computer Vision (ICCV):1841–1848
Endres I, Hoiem D (2010) Category independent object proposals. pp 575–588
Endres I, Hoiem D (2014) Category-independent object proposals with diverse ranking. IEEE Trans Pattern Anal Mach Intell 36(2):222–234
Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The PASCAL visual object classes (VOC) challenge. Int J Comput Vis 88(2):303–338
Felzenszwalb PF, Girshick RB, McAllester D, Ramanan D (2010) Object detection with discriminatively trained part-based models. IEEE Trans Pattern Anal Mach Intell 32(9):1627–1645
Felzenszwalb PF, Huttenlocher DP (2004) Efficient graph-based image segmentation. Int J Comput Vis 59(2):167–181
Fidler S, Mottaghi R, Yuille A, Urtasun R (2013) Bottom-up segmentation for top-down detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3294–3301
Girshick R, Donahue J, Darrell T, Malik J (2016) Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans Pattern Anal Mach Intell 38(1):142–158
Gonzalez-Garcia A, Vezhnevets A, Ferrari V (2015) An active search strategy for efficient object class detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3022–3031
Han J, He S, Qian X, Wang D, Guo L, Liu T (2013) An object-oriented visual saliency detection framework based on sparse coding representations. IEEE Trans Circ Syst Video Technol 23(12):2009–2021
Han J, Zhang D, Hu X, Guo L, Ren J, Wu F (2015) Background prior-based salient object detection via deep reconstruction residual. IEEE Trans Circ Syst Video Technol 25(8):1309–1321
Han J, Zhang D, Wen S, Guo L, Liu T, Li X (2016) Two-stage learning to predict human eye fixations via SDAEs. IEEE Trans Cybern 46(2):487–498
Hare S, Golodetz S, Saffari A, Vineet V, Cheng MM, Hicks S, Torr P (2016) Struck: Structured output tracking with kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence
Hariharan B, Arbeláez P, Girshick R, Malik J (2014) Simultaneous detection and segmentation. pp 297–312
Hariharan B, Malik J, Ramanan D (2012) Discriminative decorrelation for clustering and classification. European Conference on Computer Vision (ECCV):459–472
Hosang J, Benenson R, Dollár P, Schiele B (2016) What makes for effective detection proposals? IEEE Trans Pattern Anal Mach Intell 38(4):814–830
Hosang J, Benenson R, Schiele B (2014) How good are detection proposals, really? British Machine Vision Conference (BMVC)
Humayun A, Li F, Rehg JM (2014) RIGOR: Reusing inference in graph cuts for generating object regions. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):336–343
Itti L, Koch C, Niebur E (1998) A model of saliency-based visual attention for rapid scene analysis. IEEE Trans Pattern Anal Mach Intell 20(11):1254–1259
Karianakis N, Fuchs TJ, Soatto S (2015) Boosting convolutional features for robust object proposals. arXiv preprint arXiv:1503.06350
Krähenbühl P, Koltun V (2014) Geodesic object proposals. pp 725–739
Li N, Ye J, Ji Y, Ling H, Yu J (2014) Saliency detection on light field. pp 2806–2813
Li X, Lu H, Zhang L, Ruan X, Yang MH (2013) Saliency detection via dense and sparse reconstruction. pp 2976–2983
Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft COCO: Common objects in context. pp 740–755
Malisiewicz T, Gupta A, Efros AA (2011) Ensemble of exemplar-SVMs for object detection and beyond. IEEE International Conference on Computer Vision (ICCV):89–96
Manen S, Guillaumin M, Van Gool L (2013) Prime object proposals with randomized Prim's algorithm. IEEE International Conference on Computer Vision (ICCV):2536–2543
Rantalankila P, Kannala J, Rahtu E (2014) Generating object segmentation proposals using global and local search. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):2417–2424
Van de Sande KE, Uijlings JR, Gevers T, Smeulders AW (2011) Segmentation as selective search for object recognition. IEEE International Conference on Computer Vision (ICCV):1879–1886
Uijlings JR, Van de Sande KE, Gevers T, Smeulders AW (2013) Selective search for object recognition. Int J Comput Vis 104(2):154–171
Valenti R, Sebe N, Gevers T (2009) Image saliency by isocentric curvedness and color. IEEE International Conference on Computer Vision (ICCV):2185–2192
Wang L, Lu H, Ruan X, Yang MH (2015) Deep networks for saliency detection via local estimation and global search. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3183–3192
Wei Y, Wen F, Zhu W, Sun J (2012) Geodesic saliency using background priors. pp 29–42
Yang C, Zhang L, Lu H, Ruan X, Yang MH (2013) Saliency detection via graph-based manifold ranking. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):3166–3173
Zhang Z, Warrell J, Torr PH (2011) Proposal generation for object detection using cascaded ranking SVMs. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):1497–1504
Zhu W, Liang S, Wei Y, Sun J (2014) Saliency optimization from robust background detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR):2814–2821
Zitnick CL, Dollár P (2014) Edge boxes: Locating object proposals from edges. pp 391–405

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61301238 and 61201424), the China Scholarship Council (No. 201506205024), and the Natural Science Foundation of Tianjin, China (No. 14ZCDZGX00831).

Author information

Authors and Affiliations

College of Computer and Control Engineering, Nankai University, Tianjin, China
Congchao Wang, Jufeng Yang & Kai Wang
Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
Shang-Hong Lai

Corresponding author

Correspondence to Jufeng Yang.

About this article

Cite this article

Wang, C., Yang, J., Wang, K. et al. Multi-scale energy optimization for object proposal generation. Multimed Tools Appl 76, 10481–10499 (2017). https://doi.org/10.1007/s11042-016-3616-7

Received: 10 August 2015
Revised: 20 March 2016
Accepted: 12 May 2016
Published: 23 May 2016
Issue Date: April 2017
DOI: https://doi.org/10.1007/s11042-016-3616-7
