Abstract
Most prominent object recognition and image labeling techniques are currently based on region proposal algorithms. A significant challenge for these algorithms is achieving high Recall at high overlaps. This paper proposes a new region proposal algorithm that uses perceptual grouping to generate well-fitting regions and thereby improve Recall at high overlaps. The proposed method comprises segmentation, region merging based on texture descriptors, and similarity measurement. Furthermore, the algorithm introduces a hybrid approach to compute an efficient threshold. To assess the proposed algorithm, well-known metrics such as overlap and Recall are measured. Experimental results are reported on the MSRC, VOC2007, VOC2012, and COCO 2017 datasets. The results are compared with segmentation algorithms and with several classical and deep learning-based region proposal methods. The evaluation indicates a good improvement in Recall at high overlaps, such as 0.8 and 0.9, with a reasonable number of regions.
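To make the evaluation metric concrete, the following minimal sketch shows one common way to measure Recall at a given overlap. It assumes that overlap means intersection-over-union (IoU) between axis-aligned bounding boxes, and that a ground-truth object counts as recalled when at least one proposal reaches the threshold. The abstract does not spell out the exact protocol, so the function names `iou` and `recall_at_overlap`, the box format, and the toy data are illustrative assumptions rather than the authors' implementation.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_overlap(
    gt_boxes: Sequence[Box], proposals: Sequence[Box], threshold: float
) -> float:
    """Fraction of ground-truth boxes covered by at least one proposal
    whose IoU with that box is >= threshold."""
    if not gt_boxes:
        return 0.0
    hits = sum(
        1 for g in gt_boxes if any(iou(g, p) >= threshold for p in proposals)
    )
    return hits / len(gt_boxes)


# Toy data (hypothetical): two objects, three proposals.
ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
proposals = [(0, 0, 10, 10), (21, 21, 30, 30), (50, 50, 60, 60)]

for t in (0.8, 0.9):
    print(f"Recall at overlap {t}: {recall_at_overlap(ground_truth, proposals, t)}")
```

In this toy example the second object is matched with an IoU of 0.81, so it counts at the 0.8 threshold but not at 0.9. This illustrates why Recall at high overlaps is the harder target: proposals must fit object boundaries tightly, not merely touch them.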
Data Availability Statement
All datasets that support the findings of this study are available from their creators, who are cited in the References of this article.
References
Abbasi S, Tajeripour F (2017) Detection of brain tumor in 3d mri images using local binary patterns and histogram orientation gradient. Neurocomputing 219:526–535
Achanta R, Shaji A, Smith K, Lucchi A, Fua P, Süsstrunk S (2012) Slic superpixels compared to state-of-the-art superpixel methods. IEEE Trans Pattern Anal Mach Intell 34(11):2274–2282
Alexe B, Deselaers T, Ferrari V (2010) What is an object? In: 2010 IEEE computer society conference on computer vision and pattern recognition, IEEE, pp 73–80
Arbelaez P, Maire M, Fowlkes C, Malik J (2010) Contour detection and hierarchical image segmentation. IEEE Trans Pattern Anal Mach Intell 33(5):898–916
Bashar F, Khan A, Ahmed F, Kabir MH (2014) Robust facial expression recognition based on median ternary pattern (mtp). In: 2013 International conference on electrical information and communication technology (EICT), IEEE, pp 1–5
Bonechi S, Bianchini M, Scarselli F, Andreini P (2020) Weak supervision for generating pixel-level annotations in scene text segmentation. Pattern Recogn Lett 138:1–7
Carreira J, Sminchisescu C (2012) Cpmc: automatic object segmentation using constrained parametric min-cuts. IEEE Trans Pattern Anal Mach Intell 34(7):1312–1328
Chen X, Ma H, Wang X, Zhao Z (2015) Improving object proposals with multi-thresholding straddling expansion. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2587–2595
Chen LC, Hermans A, Papandreou G, Schroff F, Wang P, Adam H (2018) Masklab: Instance segmentation by refining object detection with semantic and direction features. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4013–4022
Chen W, Qiao Y, Li Y (2020) Inception-ssd: an improved single shot detector for vehicle detection. J Ambient Intell Hum Comput:1–7
Cheng MM, Zhang Z, Lin WY, Torr P (2014) Bing: Binarized normed gradients for objectness estimation at 300fps. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3286–3293
Comaniciu D, Meer P (2002) Mean shift: a robust approach toward feature space analysis. IEEE Trans Pattern Anal Mach Intell 24(5):603–619
Dai J, He K, Sun J (2015) Convolutional feature masking for joint object and stuff segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3992–4000
de Geus D, Meletis P, Dubbelman G (2019) Single network panoptic segmentation for street scene understanding. In: 2019 IEEE intelligent vehicles symposium (IV), IEEE, pp 709–715
Endres I, Hoiem D (2013) Category-independent object proposals with diverse ranking. IEEE Trans Pattern Anal Mach Intell 36(2):222–234
Erhan D, Szegedy C, Toshev A, Anguelov D (2014) Scalable object detection using deep neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2147–2154
Everingham M, Eslami SA, Van Gool L, Williams CK, Winn J, Zisserman A (2015) The pascal visual object classes challenge: a retrospective. Int J Comput Vis 111(1):98–136
Felzenszwalb PF, Huttenlocher DP (2004) Efficient graph-based image segmentation. Int J Comput Vis 59(2):167–181
Felzenszwalb PF, Girshick RB, McAllester D, Ramanan D (2009) Object detection with discriminatively trained part-based models. IEEE Trans Pattern Anal Mach Intell 32(9):1627–1645
Fu K, Chang Z, Zhang Y, Xu G, Zhang K, Sun X (2020) Rotation-aware and multi-scale convolutional neural network for object detection in remote sensing images. ISPRS J Photogramm Remote Sens 161:294–308
Ghodrati A, Diba A, Pedersoli M, Tuytelaars T, Van Gool L (2015) Deepproposal: hunting objects by cascading deep convolutional layers. In: Proceedings of the IEEE international conference on computer vision, pp 2578–2586
Gidaris S, Komodakis N (2016) Attend refine repeat: active box proposal generation via in-out localization. In: BMVC
Girshick R (2015) Fast r-cnn. In: Proceedings of the IEEE international conference on computer vision, pp 1440–1448
Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 580–587
Girshick R, Donahue J, Darrell T, Malik J (2015) Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans Pattern Anal Mach Intell 38(1):142–158
Hariharan B, Arbeláez P, Girshick R, Malik J (2014) Simultaneous detection and segmentation. In: European conference on computer vision. Springer, pp 297–312
Haripriya P, Porkodi R (2020) Parallel deep convolutional neural network for content based medical image retrieval. J Ambient Intell Hum Comput:1–15
He K, Gkioxari G, Dollár P, Girshick R (2017) Mask r-cnn. In: Proceedings of the IEEE international conference on computer vision, pp 2961–2969
He W, Zhang XY, Yin F, Luo Z, Ogier JM, Liu CL (2020) Realtime multi-scale scene text detection with scale-based region proposal network. Pattern Recogn 98:107026
Hosang J, Benenson R, Dollár P, Schiele B (2015) What makes for effective detection proposals? IEEE Trans Pattern Anal Mach Intell 38(4):814–830
Hosang J, Benenson R, Schiele B (2017) Learning non-maximum suppression. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4507–4515
Hu Z, Liu Z, Li G, Ye L, Zhou L, Wang Y (2020) Weakly supervised instance segmentation using multi-stage erasing refinement and saliency-guided proposals ordering. J Vis Commun Image Represent 73:102957
Huang D, Shan C, Ardabilian M, Wang Y, Chen L (2011) Local binary patterns and its application to facial image analysis: a survey. IEEE Trans Syst Man Cybern Part C (Applications and Reviews) 41(6):765–781
Humayun A, Li F, Rehg JM (2014) Rigor: Reusing inference in graph cuts for generating object regions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 336–343
Jie Z, Lu WF, Sakhavi S, Wei Y, Tay EHF, Yan S (2016) Object proposal generation with fully convolutional networks. IEEE Trans Circuits Syst Video Technol 28(1):62–75
Khan Z, Yang J (2020) Bottom-up unsupervised image segmentation using fc-dense u-net based deep representation clustering and multidimensional feature fusion based region merging. Image Vis Comput:103871
Kim J, Grauman K (2012) Shape sharing for object segmentation. In: European conference on computer vision. Springer, pp 444–458
Kong T, Yao A, Chen Y, Sun F (2016) Hypernet: Towards accurate region proposal generation and joint object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 845–853
Krähenbühl P, Koltun V (2014) Geodesic object proposals. In: European conference on computer vision. Springer, pp 725–739
Ku J, Mozifian M, Lee J, Harakeh A, Waslander SL (2018) Joint 3d proposal generation and object detection from view aggregation. In: 2018 IEEE/RSJ international conference on intelligent robots and systems (IROS). IEEE, pp 1–8
Lan L, Ye C, Wang C, Zhou S (2019) Deep convolutional neural networks for wce abnormality detection: Cnn architecture, region proposal and transfer learning. IEEE Access 7:30017–30032
Li S, Zhang H, Zhang J, Ren Y, Kuo CCJ (2017) Box refinement: Object proposal enhancement and pruning. In: 2017 IEEE winter conference on applications of computer vision (WACV). IEEE, pp 979–988
Li H, Liu Y, Ouyang W, Wang X (2019) Zoom out-and-in network with map attention decision for region proposal and object detection. Int J Comput Vis 127(3):225–238
Liang X, Lin L, Wei Y, Shen X, Yang J, Yan S (2017) Proposal-free network for instance-level object segmentation. IEEE Trans Pattern Anal Mach Intell 40(12):2978–2991
Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: Common objects in context. In: European conference on computer vision, Springer, pp 740–755
Lin TY, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2117–2125
Manen S, Guillaumin M, Van Gool L (2013) Prime object proposals with randomized prim’s algorithm. In: Proceedings of the IEEE international conference on computer vision, pp 2536–2543
Maninis KK, Pont-Tuset J, Arbeláez P, Van Gool L (2017) Convolutional oriented boundaries: From image segmentation to high-level tasks. IEEE Trans Pattern Anal Mach Intell 40(4):819–833
Ojala T, Pietikäinen M, Mäenpää T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987
Pinheiro PO, Collobert R, Dollár P (2015) Learning to segment object candidates. In: Advances in neural information processing systems, pp 1990–1998
Pont-Tuset J, Arbelaez P, Barron JT, Marques F, Malik J (2017) Multiscale combinatorial grouping for image segmentation and object proposal generation. IEEE Trans Pattern Anal Mach Intell 39(1):128–140
Rahtu E, Kannala J, Blaschko M (2011) Learning a category independent object detection cascade. In: 2011 international conference on computer vision, IEEE, pp 1052–1059
Rantalankila P, Kannala J, Rahtu E (2014) Generating object segmentation proposals using global and local search. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2417–2424
Ren S, He K, Girshick R, Sun J (2015) Faster r-cnn: Towards real-time object detection with region proposal networks. In: Advances in neural information processing systems, pp 91–99
Rivera AR, Castillo JR, Chae OO (2012) Local directional number pattern for face analysis: face and expression recognition. IEEE Trans Image Process 22(5):1740–1752
Rivera AR, Castillo JR, Chae O (2015) Local directional texture pattern image descriptor. Pattern Recogn Lett 51:94–100
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905
Shotton J, Winn J, Rother C, Criminisi A (2006) Textonboost: Joint appearance, shape and context modeling for multi-class object recognition and segmentation. In: European conference on computer vision. Springer, pp 1–15
Tabatabaei SM, Chalechale A (2019) Local binary patterns for noise-tolerant semg classification. SIViP 13(3):491–498
Taghizadeh M, Chalechale A (2018) Region expansion algorithm: a well-quality region proposal generation. In: 2018 8th International conference on computer and knowledge engineering (ICCKE), IEEE, pp 250–255
Taghizadeh M, Chalechale A (2021) A class-independent flexible algorithm to generate region proposals. Multimedia Tools Appl
Uijlings JR, Van De Sande KE, Gevers T, Smeulders AW (2013) Selective search for object recognition. Int J Comput Vis 104(2):154–171
Valstar M, Pantic M (2006) Fully automatic facial action unit detection and temporal analysis. In: 2006 conference on computer vision and pattern recognition workshop (CVPRW’06). IEEE, p 149
Vedaldi A, Soatto S (2008) Quick shift and kernel methods for mode seeking. In: European conference on computer vision. Springer, pp 705–718
Vu T, Jang H, Pham TX, Yoo C (2019) Cascade rpn: Delving into high-quality region proposal network with adaptive convolution. In: Advances in neural information processing systems, pp 1430–1440
Xu H, Yao L, Zhang W, Liang X, Li Z (2019) Auto-fpn: Automatic network architecture adaptation for object detection beyond classification. In: Proceedings of the IEEE international conference on computer vision, pp 6649–6658
Zhang Z, Liu Y, Chen X, Zhu Y, Cheng MM, Saligrama V, Torr PH (2017) Sequential optimization for efficient high-quality object proposal generation. IEEE Trans Pattern Anal Mach Intell 40(5):1209–1223
Zhang W, Wang K, Wang Y, Yan L, Wang FY (2021) A loss-balanced multi-task model for simultaneous detection and segmentation. Neurocomputing 428:65–78
Zhu H, Meng F, Cai J, Lu S (2016) Beyond pixels: a comprehensive survey from bottom-up to semantic image segmentation and cosegmentation. J Vis Commun Image Represent 34:12–27
Zitnick CL, Dollár P (2014) Edge boxes: Locating object proposals from edges. In: European conference on computer vision. Springer, pp 391–405
Cite this article
Taghizadeh, M., Chalechale, A. & Jannesari, A. A region proposal algorithm using texture similarity and perceptual grouping. J Ambient Intell Human Comput 14, 271–288 (2023). https://doi.org/10.1007/s12652-021-03296-5
DOI: https://doi.org/10.1007/s12652-021-03296-5