BOB: a bi-level overlapped binning procedure for scene word binarization

Dutta, Indra Narayan; Chakraborty, Neelotpal; Mollah, Ayatullah Faruk; Basu, Subhadip; Sarkar, Ram

doi:10.1007/s11042-020-09785-7

Published: 29 October 2020

Multimedia Tools and Applications, Volume 80, pages 7609–7635 (2021)

Indra Narayan Dutta¹,
Neelotpal Chakraborty¹,
Ayatullah Faruk Mollah²,
Subhadip Basu¹ &
…
Ram Sarkar¹


Abstract

Scene text analysis involves detecting and processing text in natural scene images for a variety of purposes. The problem is challenging because of noise, blur, heterogeneous intensity variation and similar degradations. The ultimate goal is to make a detected scene word recognizable by any standard Optical Character Recognition (OCR) system, which requires effective scene word binarization. Several methods address scene text detection, but comparatively few address scene word binarization, and those that do are not robust to image-quality-related complexities, which lowers their precision. Here, a novel approach to scene word binarization called Bi-level Overlapped Binning (BOB) is proposed, in which the intensities of the color channels R, G and B are grouped, or binned, to generate several candidate solutions in the form of binary images. The stable binary images are identified, and the image solutions from them are classified as text or non-text using a standard classifier trained with some popular features. Finally, the resulting text solutions are combined probabilistically to obtain the binarized output. The proposed method is evaluated on the standard datasets SVT, ICDAR-2003, ICDAR-2011 (Scene), ICDAR-2011 (BDI), KAIST and Total-Text, achieving precisions of 0.76, 0.87, 0.89, 0.85, 0.84 and 0.87 respectively, which are mostly better than those of state-of-the-art methods.

Acknowledgements

This work is partially supported by the CMATER research laboratory of the Computer Science and Engineering Department, Jadavpur University, India, and by the PURSE-II and UPE-II projects. SB is partially funded by DBT grant (BT/PR16356/BID/7/596/2016). RS, SB and AFM are partially funded by DST grant (EMR/2016/007213).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Jadavpur University, Kolkata, 700032, India
Indra Narayan Dutta, Neelotpal Chakraborty, Subhadip Basu & Ram Sarkar
Department of Computer Science and Engineering, Aliah University, Kolkata, 700160, India
Ayatullah Faruk Mollah

Corresponding author

Correspondence to Neelotpal Chakraborty.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Lemma 1: Let D be the difference between the highest and the lowest PixelValue in a foreground region R, and let S be any bin size such that S ≥ D. Then there exists a bin number $ k\in \left[1,2\ast m-1\right] $, where $ m=\left\lfloor \frac{PV_{max}}{S}\right\rfloor +1 $, such that $ {\left({\Delta}_k^s\right)}_{ij}=1 $ for all (i, j) ∈ R.

Given Definitions:

1. Definition of a bin image $ {B}_k^s $, where $ {\left({B}_k^s\right)}_{ij} $ is the value at coordinate (i, j) of the image:

$$ {\left({B}_k^s\right)}_{ij}\stackrel{\mathrm{def}}{=}\begin{cases}1, & k\in \left[1,m\right]\ \text{and}\ \left\lfloor \frac{PixelValue\left(i,j\right)}{S}\right\rfloor +1=k\\ 1, & k\in \left[m+1,2\times m-1\right]\ \text{and}\ \left\lfloor \frac{PixelValue\left(i,j\right)-\frac{S}{2}}{S}\right\rfloor +1=k-m\\ 0, & \text{otherwise}\end{cases} $$

(12)

2. Definition of a Delta image $ {\Delta}_i^s $:

$$ \begin{array}{ll}\textbf{Level 1 }\boldsymbol{\Delta}\textbf{ bins:} & {\Delta}_i^s={B}_i^s\cup {B}_{i+m}^s,\ \text{for}\ 1\le i<m\\ \textbf{Level 2 }\boldsymbol{\Delta}\textbf{ bins:} & {\Delta}_i^s={B}_i^s\cup {B}_{i-m+1}^s,\ \text{for}\ m<i<2\times m\end{array} $$

(13)

3. Definition of a foreground region R:

A foreground region R is a set of two or more points in an image I such that for every point p in R, there exists a point q in R such that q is in N₈(p) where N₈ is the set of 8-connected neighbors of a point.

Proof: Let the minimum PixelValue for the region R be P_min for pixel coordinate (x_a, y_a) and maximum be P_max for pixel coordinate (x_b, y_b) such that P_max − P_min = D.

Let,

$$ k\stackrel{\scriptscriptstyle\mathrm{def}}{=}\left\lfloor \frac{P_{min}}{S}\right\rfloor +1 $$

(14)

From the definition of $ {\left({B}_k^s\right)}_{ij} $ in Eq. (12), $ {\left({B}_k^s\right)}_{x_a y_a}=1 $. All pixels in R with PixelValue equal to P_min therefore appear as positives (value = 1) in the binary image $ {B}_k^s $.

Now,

$$ {P}_{max}-{P}_{min}=D\Rightarrow {P}_{max}={P}_{min}+D $$

Thus, since D ≤ S,

$$ \left\lfloor \frac{P_{max}}{S}\right\rfloor +1=\left\lfloor \frac{P_{min}+D}{S}\right\rfloor +1=\left\lfloor \frac{P_{min}}{S}+\frac{D}{S}\right\rfloor +1\le \left\lfloor \frac{P_{min}}{S}+1\right\rfloor +1=\left\lfloor \frac{P_{min}}{S}\right\rfloor +1+1=k+1 $$

$$ \therefore \left\lfloor \frac{P_{max}}{S}\right\rfloor +1\le k+1 $$

(15)

Also,

$$ \left\lfloor \frac{P_{max}}{S}\right\rfloor +1\ge \left\lfloor \frac{P_{min}}{S}\right\rfloor +1=k $$

(16)

Thus, from Eq. (15) and Eq. (16), and since both k and $ \left\lfloor \frac{P_{max}}{S}\right\rfloor +1 $ are integers, we get,

$$ \begin{array}{c}k\le \left\lfloor \frac{P_{max}}{S}\right\rfloor +1\le k+1\\ \Rightarrow \left\lfloor \frac{P_{max}}{S}\right\rfloor +1=k\quad \mathrm{OR}\quad \left\lfloor \frac{P_{max}}{S}\right\rfloor +1=k+1\end{array} $$

(17)

The pixels in foreground region R with value P_max therefore fall either in the same bin as the pixels with value P_min (bin number k) or in the immediately following bin (bin number k + 1). Since the difference D is at most the bin size S, the region R is spread over a single bin or two adjacent bins:

$$ \mathbf{Case}\ \mathbf{1}:\left\lfloor \frac{P_{max}}{S}\right\rfloor +1=k. $$

In that case, both $ \left\lfloor \frac{P_{min}}{S}\right\rfloor +1=k $ from Eq. (14) and $ \left\lfloor \frac{P_{max}}{S}\right\rfloor +1=k $ hold.

For any (i, j) ∈ R,

$$ {P}_{min}\le PixelValue\left(i,j\right)\le {P}_{max} $$

$$ \Rightarrow \kern0.75em \left\lfloor \frac{P_{min}}{S}\right\rfloor +1\le \left\lfloor \frac{PixelValue\left(i,j\right)}{S}\right\rfloor +1\le \kern0.75em \left\lfloor \frac{P_{max}}{S}\right\rfloor +1 $$

$$ \Rightarrow \kern2.5em k\le \left\lfloor \frac{PixelValue\left(i,j\right)}{S}\right\rfloor +1\le k $$

$$ \therefore \left\lfloor \frac{PixelValue\left(i,j\right)}{S}\right\rfloor +1=k $$

Since k is an integer and $ \left\lfloor \frac{PixelValue\left(i,j\right)}{S}\right\rfloor +1 $ is an integer, this is the only possible solution.

$$ \Rightarrow {\left({\mathrm{B}}_k^s\right)}_{\mathrm{ij}}=1\ for\ all\ \left(i,j\right)\in R $$

Now, from Eq. (13),

$$ {\Delta}_k^s={\mathrm{B}}_k^s\cup {\mathrm{B}}_{k+m}^s $$

Since

$$ {\left({\mathrm{B}}_k^s\right)}_{\mathrm{ij}}=1\ \mathrm{for}\ \mathrm{all}\ \left(i,j\right)\in R $$

$$ \therefore {\left({\Delta}_k^s\right)}_{\mathrm{ij}}=1\ \mathrm{for}\ \mathrm{all}\ \left(i,j\right)\in R $$

$$ \mathbf{Case}\ \mathbf{2}:\left\lfloor \frac{P_{max}}{S}\right\rfloor +1=k+1 $$

(18)

We have to consider two different sub cases depending on the value of P_max

$$ \mathbf{Case}\ \mathbf{2.1}:{P}_{max}-\left(k\ast S\right)<\frac{S}{2} $$

(19)

Let us consider bin number k + m, which is the Level 2 bin that overlaps with the right half of bin number k. Here $ m=\left\lfloor \frac{PV_{max}}{S}\right\rfloor +1 $, as in the lemma statement.

Now,

$$ k+m>m $$

From Eq. (19),

$$ {P}_{max}-\left(k\ast S\right)<\frac{S}{2}\kern0.75em \Rightarrow \kern0.75em \frac{\kern0.5em {P}_{max}-\frac{S}{2}}{S}<k $$

(20)

and from Eq. (18): $ \left\lfloor \frac{P_{max}}{S}\right\rfloor +1=k+1 $

$$ \begin{array}{c}\Rightarrow {P}_{max}\ge k\ast S\\ \Rightarrow {P}_{max}-\frac{S}{2}\ge k\ast S-\frac{S}{2}\\ \Rightarrow \frac{P_{max}-\frac{S}{2}}{S}\ge k-\frac{1}{2}>k-1\end{array} $$

(21)

From Eq. (20) and Eq. (21)

$$ \begin{array}{c}k-1<\frac{P_{max}-\frac{S}{2}}{S}<k\\ \Rightarrow \left\lfloor \frac{P_{max}-\frac{S}{2}}{S}\right\rfloor =k-1\\ \Rightarrow \left\lfloor \frac{P_{max}-\frac{S}{2}}{S}\right\rfloor +1=k\end{array} $$

(22)

From the definition of $ {\left({B}_k^s\right)}_{ij} $ in Eq. (12), applied to bin number k + m, we get:

$$ {\left({B}_{k+m}^s\right)}_{x_b y_b}=1 $$

(23)

Bin k covers the values from (k − 1) ∗ S up to, but not including, k ∗ S, and bin k + m covers the values from $ \left(k-1\right)\ast S+\frac{S}{2} $ up to, but not including, $ k\ast S+\frac{S}{2} $:

$$ \begin{array}{c}\left(k-1\right)\ast S\le \mathrm{Range\ of\ Bin}\ k<k\ast S\\ \left(k-1\right)\ast S+\frac{S}{2}\le \mathrm{Range\ of\ Bin}\ \left(k+m\right)<k\ast S+\frac{S}{2}\\ \Rightarrow \left(k-1\right)\ast S\le \mathrm{Range\ of\ Bin}\ k\cup \mathrm{Range\ of\ Bin}\ \left(k+m\right)<k\ast S+\frac{S}{2}\end{array} $$

(24)

From definition of $ {\Delta}_k^s $:

$$ {\Delta}_k^s={\mathrm{B}}_k^s\cup {\mathrm{B}}_{k+\left\lfloor \frac{PV_{max}}{S}\right\rfloor \kern0.5em +1}^s={\mathrm{B}}_k^s\cup {\mathrm{B}}_{k+m}^s $$

Thus, from Eq. (24)

$$ \left(\boldsymbol{k}-\mathbf{1}\right)\ast \boldsymbol{S}\le \boldsymbol{Range}\ \boldsymbol{of}\ {\boldsymbol{\Delta}}_{\boldsymbol{k}}^{\boldsymbol{s}}<\boldsymbol{k}\ast \boldsymbol{S}+\frac{\boldsymbol{S}}{\mathbf{2}} $$

(25)

Rearranging Eq. (19) we get

$$ {P}_{max}<k\ast S+\frac{S}{2} $$

(26)

And because P_min lies in bin k by Eq. (14):

$$ \left(k-1\right)\ast S\le {P}_{min} $$

(27)

Now, any PixelValue in region R lies between P_min and P_max.

From Eq. (26) and Eq. (27) we get:

$$ \left(\boldsymbol{k}-\mathbf{1}\right)\ast \boldsymbol{S}\le {\boldsymbol{P}}_{\boldsymbol{min}}\le \boldsymbol{PixelValue}\left(\boldsymbol{i},\boldsymbol{j}\right)\le {\boldsymbol{P}}_{\boldsymbol{max}}<\boldsymbol{k}\ast \boldsymbol{S}+\frac{\boldsymbol{S}}{\mathbf{2}} $$

(28)

From Eqs (25) and (28), we thus prove that $ {\left({\boldsymbol{\Delta}}_{\boldsymbol{k}}^{\boldsymbol{s}}\right)}_{\boldsymbol{ij}}=\mathbf{1} $ for all (i, j) ∈ R
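As an illustration of Case 2.1, consider the following numbers, which are chosen here for the example and do not come from the article: S = 10 and PV_max = 255 (so m = 26), with P_min = 8 and P_max = 13, hence D = 5 ≤ S.

$$ \begin{array}{l}k=\left\lfloor \frac{8}{10}\right\rfloor +1=1,\qquad \left\lfloor \frac{13}{10}\right\rfloor +1=2=k+1\quad (\text{Case 2})\\ {P}_{max}-\left(k\ast S\right)=13-10=3<5=\frac{S}{2}\quad (\text{Case 2.1})\\ \left\lfloor \frac{13-5}{10}\right\rfloor +1=1=k\ \Rightarrow\ \text{the pixel with value } {P}_{max}\text{ lies in } {B}_{k+m}^s={B}_{27}^{10}\\ {\Delta}_1^{10}={B}_1^{10}\cup {B}_{27}^{10}\ \text{covers}\ 0\le PixelValue<15\end{array} $$

Every value from 8 to 13 lies in that range, so all pixels of such a region are set to 1 in $ {\Delta}_1^{10} $, as Eq. (25) and Eq. (28) predict.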

$$ \mathbf{Case}\ \mathbf{2.2}:{P}_{max}-\left(k\ast S\right)\ge \frac{S}{2} $$

(29)

Since $ {P}_{min}={P}_{max}-D $ and D ≤ S, we have $ {P}_{min}\ge {P}_{max}-S\ge \left(k\ast S\right)+\frac{S}{2}-S=\left(k-1\right)\ast S+\frac{S}{2} $; hence,

$$ {P}_{min}\ge \left(k-1\right)\ast S+\frac{S}{2} $$

(30)

And from Eq. (14):

$$ k=\left\lfloor \frac{P_{min}}{S}\right\rfloor +1\Rightarrow {P}_{min}<k\ast S $$

(31)

Hence, from Eq. (30) and Eq. (31) we get:

$$ \Rightarrow \left(k-1\right)\ast S+\frac{S}{2}\le \kern0.5em {P}_{min}<k\ast S $$

(32)

Subtracting $ \frac{S}{2} $ we get,

$$ \left(k-1\right)\ast S+\frac{S}{2}-\frac{S}{2}\le {P}_{min}-\frac{S}{2}<k\ast S-\frac{S}{2} $$

(33)

$$ \Rightarrow \kern0.5em \left(k-1\right)\kern0.5em \le \frac{P_{min}-\frac{S}{2}}{S}<k-\frac{1}{2}<k $$

(34)

$$ k-1\le \frac{P_{min}-\frac{S}{2}}{S}<k $$

(35)

$$ {\displaystyle \begin{array}{c}\Rightarrow \left\lfloor \frac{P_{min}-\frac{S}{2}}{S}\right\rfloor =k-1\\ {}\Rightarrow \left\lfloor \frac{P_{min}-\frac{S}{2}}{S}\right\rfloor +1=k\kern1em \Rightarrow \left\lfloor \frac{P_{min}-\frac{S}{2}}{S}\right\rfloor +1=\left(k+m\right)-m\end{array}} $$

(36)

From the definition of $ {\left({B}_k^s\right)}_{ij} $ in Eq. (12), applied to bin number k + m: $ {\left({B}_{k+m}^s\right)}_{x_a y_a}=1 $

Bin k + 1 covers the values from k ∗ S up to, but not including, (k + 1) ∗ S, and bin k + m covers the values from $ \left(k-1\right)\ast S+\frac{S}{2} $ up to, but not including, $ k\ast S+\frac{S}{2} $:

$$ k\ast S\le \mathrm{Range\ of\ Bin}\ \left(k+1\right)<\left(k+1\right)\ast S $$

$$ \left(k-1\right)\ast S+\frac{S}{2}\le \mathrm{Range\ of\ Bin}\ \left(k+m\right)<k\ast S+\frac{S}{2} $$

$$ \Rightarrow \left(k-1\right)\ast S+\frac{S}{2}\le \mathrm{Range\ of\ Bin}\ \left(k+1\right)\cup \mathrm{Range\ of\ Bin}\ \left(k+m\right)<\left(k+1\right)\ast S $$

And from the definition of $ {\Delta}_{k+m}^s $:

$$ {\Delta}_{k+m}^s={B}_{k+m}^s\cup {B}_{k+m-m+1}^s={B}_{k+m}^s\cup {B}_{k+1}^s $$

Thus,

$$ \left(k-1\right)\ast S+\frac{S}{2}\le \mathrm{Range\ of}\ {\Delta}_{k+m}^s<\left(k+1\right)\ast S $$

(37)

From Eq. (18) we have

$$ k\ast S\le {P}_{max}<\left(k+1\right)\ast S $$

(38)

And for P_min, from Eq. (30) we get,

$$ \left(k-1\right)\ast S+\frac{S}{2}\le {P}_{min} $$

Now, any PixelValue in region R will lie between P_max and P_min

$$ \left(\boldsymbol{k}-\mathbf{1}\right)\ast \boldsymbol{S}+\frac{\boldsymbol{S}}{\mathbf{2}}\le {\boldsymbol{P}}_{\boldsymbol{min}}\le \boldsymbol{PixelValue}\left(\boldsymbol{i},\boldsymbol{j}\right)\le {\boldsymbol{P}}_{\boldsymbol{max}}<\left(\boldsymbol{k}+\mathbf{1}\right)\ast \boldsymbol{S} $$

(39)

From Eqs. (37) and (39), we thus prove that $ {\left({\boldsymbol{\Delta}}_{\boldsymbol{k}+\boldsymbol{m}}^{\boldsymbol{s}}\right)}_{\boldsymbol{ij}}=\mathbf{1} $ for all (i, j) ∈ R

Thus, the lemma holds in all cases.
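The lemma can also be checked numerically for small cases. The following Python sketch is a brute-force check under stated assumptions, not part of the article: pixel values are taken to be integers in [0, PV_max] with PV_max = 255, a region is assumed, as the worst case, to contain every value between P_min and P_max, and the names `bin_indices`, `delta_indices` and `lemma_holds` are hypothetical.

```python
def bin_indices(v, S, m):
    """Bin numbers k in [1, 2m-1] for which Eq. (12) gives 1 at pixel value v."""
    ks = set()
    k1 = v // S + 1                      # Level 1 bin
    if 1 <= k1 <= m:
        ks.add(k1)
    j = (2 * v - S) // (2 * S) + 1       # floor((v - S/2) / S) + 1, integer arithmetic
    if 1 <= j <= m - 1:
        ks.add(j + m)                    # Level 2 bin number is j + m
    return ks


def delta_indices(v, S, m):
    """Indices i of the Delta images of Eq. (13) that are 1 at pixel value v."""
    bins = bin_indices(v, S, m)
    ds = set()
    for i in range(1, m):                # Level 1: Delta_i = B_i U B_{i+m}
        if i in bins or i + m in bins:
            ds.add(i)
    for i in range(m + 1, 2 * m):        # Level 2: Delta_i = B_i U B_{i-m+1}
        if i in bins or i - m + 1 in bins:
            ds.add(i)
    return ds


def lemma_holds(S, pv_max=255):
    """True if every value range [p_min, p_max] with p_max - p_min <= S
    is contained in at least one Delta image."""
    m = pv_max // S + 1
    cover = [delta_indices(v, S, m) for v in range(pv_max + 1)]
    for p_min in range(pv_max + 1):
        common = set(cover[p_min])
        for p_max in range(p_min, min(p_min + S, pv_max) + 1):
            common &= cover[p_max]       # Delta images covering all of p_min..p_max
            if not common:
                return False
    return True


checked = {S: lemma_holds(S) for S in (2, 7, 10, 16, 64, 255)}
```

The check is only meaningful for S ≤ PV_max. For a larger S, m = 1 and Eq. (13) defines no Delta image at all, so `lemma_holds` returns `False`.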

About this article

Cite this article

Dutta, I.N., Chakraborty, N., Mollah, A.F. et al. BOB: a bi-level overlapped binning procedure for scene word binarization. Multimed Tools Appl 80, 7609–7635 (2021). https://doi.org/10.1007/s11042-020-09785-7

Received: 30 September 2019
Revised: 23 August 2020
Accepted: 28 August 2020
Published: 29 October 2020
Issue Date: February 2021
DOI: https://doi.org/10.1007/s11042-020-09785-7
