Abstract
Most objects with regular regions can be detected as Maximally Stable Extremal Regions (MSER) [20]. In this paper, we formulate object detection as a bi-label (object and non-object regions) segmentation problem and propose a graph-based object detection method using edge-enhanced MSER. Specifically, we focus on detecting text in natural images, which is a special kind of object. First, edge-enhanced MSERs are detected as basic letter components; non-text MSERs are then efficiently eliminated by minimizing a cost function that combines both region-based and context-relevant information; finally, mean-shift clustering is used to group text components into regions. The proposed method is naturally context-relevant and scale-insensitive, and can readily be applied to detecting other objects. Experimental results on the ICDAR 2011 competition dataset show that the proposed approach outperforms state-of-the-art methods in both recall and precision.
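To make the bi-label formulation concrete, the following is a minimal sketch of a cost that combines a per-region term with a context term over neighbouring regions. It is an illustration under stated assumptions, not the cost function used in the paper: the Potts-style pairwise penalty, the function names, the `smoothness` weight, and the toy numbers are all hypothetical, and exhaustive enumeration stands in for an efficient solver.

```python
from itertools import product


def labeling_cost(labels, unary, edges, smoothness):
    """Cost of a bi-label assignment (0 = non-text, 1 = text).

    unary[i][l] is the assumed region-based cost of giving region i label l;
    edges lists pairs of neighbouring regions that supply the context term.
    """
    data = sum(unary[i][label] for i, label in enumerate(labels))
    # Potts-style penalty (an assumption): neighbours that disagree pay `smoothness`.
    context = sum(smoothness for i, j in edges if labels[i] != labels[j])
    return data + context


def minimize_by_enumeration(unary, edges, smoothness):
    """Try all 2^n labelings and return the cheapest; only viable for toy graphs."""
    n = len(unary)
    return min(
        product((0, 1), repeat=n),
        key=lambda labels: labeling_cost(labels, unary, edges, smoothness),
    )


# Hypothetical toy data: unary[i] = (cost of non-text, cost of text) for region i.
unary = [(0.9, 0.1), (0.8, 0.2), (0.45, 0.55), (0.1, 0.9)]
edges = [(0, 1), (1, 2)]  # regions 0, 1, 2 form a chain of neighbours; 3 is isolated

with_context = minimize_by_enumeration(unary, edges, smoothness=0.3)
without_context = minimize_by_enumeration(unary, edges, smoothness=0.0)
```

In this toy case region 2 looks slightly more like non-text on its own, but the context term pulls it to the text label because its neighbours are confidently text. For realistic numbers of regions, a binary energy of this form is normally minimized with a min-cut/max-flow solver rather than by enumeration.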
References
Boykov, Y., Jolly, M.: Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. In: 8th IEEE International Conference on Computer Vision, vol. 1, pp. 105–112. IEEE (2001)
Boykov, Y., Kolmogorov, V.: An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Transactions on Pattern Analysis and Machine Intelligence 26(9), 1124–1137 (2004)
Breiman, L.: Random forests. Machine Learning 45(1), 5–32 (2001)
Chen, H., Tsai, S., Schroth, G., Chen, D., Grzeszczuk, R., Girod, B.: Robust text detection in natural images with edge-enhanced maximally stable extremal regions. In: International Conference on Image Processing, pp. 2609–2612 (2011)
Chen, X., Yuille, A.: Detecting and reading text in natural scenes. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II–366. IEEE (2004)
Chum, J., Urban, M., Pajdla, T.: Robust wide baseline stereo from maximally stable extremal regions. In: British Machine Vision Conference, vol. 72 (2002)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 886–893 (2005)
Epshtein, B., Ofek, E., Wexler, Y.: Detecting text in natural scenes with stroke width transform. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2963–2970 (2010)
Jung, K., In Kim, K., Jain, A.K.: Text information extraction in images and video: a survey. Pattern Recognition 37(5), 977–997 (2004)
Lee, C., Jung, K., Kim, H.: Automatic text detection and removal in video sequences. Pattern Recognition Letters 24(15), 2607–2623 (2003)
Liang, J., Doermann, D., Li, H.: Camera-based analysis of text and documents: a survey. International Journal on Document Analysis and Recognition 7(2), 84–104 (2005)
Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., Gool, L.: A comparison of affine region detectors. International Journal of Computer Vision 65(1), 43–72 (2005)
Pan, Y., Hou, X., Liu, C.: A hybrid approach to detect and localize texts in natural scene images. IEEE Transactions on Image Processing (99), 1–1 (2011)
Shahab, A., Shafait, F., Dengel, A.: ICDAR 2011 robust reading competition challenge 2: Reading text in scene images. In: 2011 International Conference on Document Analysis and Recognition (ICDAR), pp. 1491–1496. IEEE (2011)
Shivakumara, P., Phan, T., Tan, C.: A Laplacian approach to multi-oriented text detection in video. IEEE Transactions on Pattern Analysis and Machine Intelligence 33(2), 412–419 (2011)
Ye, Q., Huang, Q., Gao, W., Zhao, D.: Fast and robust text detection in images and video frames. Image and Vision Computing 23(6), 565–576 (2005)
Zhang, J., Kasturi, R.: Extraction of text objects in video documents: Recent progress. In: 8th IAPR International Workshop on Document Analysis Systems, DAS 2008, pp. 5–17. IEEE (2008)
Fukunaga, K., Hostetler, L.: The estimation of the gradient of density function, with applications in pattern recognition. IEEE Transactions on Information Theory 21(1), 32–40 (1975)
Zhang, Z., Liang, X., Ganesh, A., Ma, Y.: TILT: transform invariant low-rank textures. In: Computer Vision–ACCV 2010, pp. 314–328 (2011)
Kimmel, R., Zhang, C., Bronstein, A.M., Bronstein, M.M.: Are MSER features really interesting? IEEE Transactions on Pattern Analysis and Machine Intelligence (2011)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shi, C., Wang, C., Xiao, B., Zhang, Y. (2012). Graph-Based Detection of Objects with Regular Regions. In: Su, CY., Rakheja, S., Liu, H. (eds) Intelligent Robotics and Applications. ICIRA 2012. Lecture Notes in Computer Science, vol 7508. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33503-7_41
DOI: https://doi.org/10.1007/978-3-642-33503-7_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33502-0
Online ISBN: 978-3-642-33503-7
eBook Packages: Computer Science (R0)