
BOB: a bi-level overlapped binning procedure for scene word binarization

Multimedia Tools and Applications

Abstract

Scene text analysis involves detecting and processing text in natural scene images. The problem is challenging because of noise, blur, heterogeneous intensity variation and similar degradations. The ultimate goal is to make a detected scene word recognizable by any standard Optical Character Recognition (OCR) system, which requires effective scene word binarization. Several methods address scene text detection, but comparatively few address scene word binarization, and those that do are not robust to image-quality complexities, which results in low precision. Here, a novel approach to scene word binarization, called Bi-level Overlapped Binning (BOB), is proposed, in which the intensities of the color channels R, G and B are grouped, or binned, to generate several candidate solutions in the form of binary images. The stable binary images are identified, and the image solutions from them are classified as text or non-text using a standard classifier trained with some popular features. Finally, the resulting text solutions are combined probabilistically to obtain the binarized output. The proposed method is evaluated on the standard datasets SVT, ICDAR-2003, ICDAR-2011 (Scene), ICDAR-2011 (BDI), KAIST and Total-Text, achieving precisions of 0.76, 0.87, 0.89, 0.85, 0.84 and 0.87 respectively, which are mostly better than those of state-of-the-art methods.




Acknowledgements

This work is partially supported by the CMATER research laboratory of the Computer Science and Engineering Department, Jadavpur University, India, and by the PURSE-II and UPE-II projects. SB is partially funded by DBT grant (BT/PR16356/BID/7/596/2016). RS, SB and AFM are partially funded by DST grant (EMR/2016/007213).

Author information

Corresponding author

Correspondence to Neelotpal Chakraborty.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix


Lemma 1: Let D be the difference between the highest and the lowest PixelValue in a foreground region R, and let S be any bin size such that S ≥ D. Then there exists a bin number \( k\in \left[1,2\ast m-1\right] \), where \( m=\left\lfloor \frac{PV_{max}}{S}\right\rfloor +1 \), such that \( {\left({\Delta}_k^s\right)}_{ij}=1 \) for all (i, j) ∈ R.

Given definitions:

  1. Definition of a bin image \( {B}_k^s \), where \( {\left({B}_k^s\right)}_{ij} \) is the value at coordinate (i, j) of the image:

$$ {\left({B}_k^s\right)}_{ij}\stackrel{\scriptscriptstyle\mathrm{def}}{=}\left\{\begin{array}{ll}1, & k\in \left[1,m\right]\ \mathrm{and}\ \left\lfloor \frac{PixelValue\left(i,j\right)}{S}\right\rfloor +1=k\\ {}1, & k\in \left[m+1,2\times m-1\right]\ \mathrm{and}\ \left\lfloor \frac{PixelValue\left(i,j\right)-\frac{S}{2}}{S}\right\rfloor +1=k-m\\ {}0, & \mathrm{otherwise}\end{array}\right. $$
(12)

  2. Definition of a Delta image \( {\Delta}_i^s \):

$$ \begin{array}{ll}\mathbf{Level}\ \mathbf{1}\ \boldsymbol{\Delta}\ \mathbf{bins}: & {\Delta}_i^s={B}_i^s\cup {B}_{i+m}^s,\kern0.5em \mathrm{for}\ 1\le i<m\\ {}\mathbf{Level}\ \mathbf{2}\ \boldsymbol{\Delta}\ \mathbf{bins}: & {\Delta}_i^s={B}_i^s\cup {B}_{i-m+1}^s,\kern0.5em \mathrm{for}\ m<i<2\times m\end{array} $$
(13)

  3. Definition of a foreground region R:

A foreground region R is a set of two or more points in an image I such that for every point p in R, there exists a point q in R such that q is in \( N_8(p) \), where \( N_8(p) \) is the set of 8-connected neighbors of p.
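The bin and Delta images of Eqs. (12) and (13) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `bin_image` and `delta_image` are hypothetical, the image is a plain 2-D list of integer values for a single channel, and \( PV_{max} \) is assumed to be 255 (the largest 8-bit value), which this appendix does not state explicitly.

```python
def bin_image(image, k, s, pv_max=255):
    """Binary image B_k^s of Eq. (12) for a 2-D list of integer pixel values.

    pv_max = 255 is an assumption (8-bit channel); m is the number of Level 1 bins.
    """
    m = pv_max // s + 1

    def member(pv):
        if 1 <= k <= m:
            # Level 1 bin: floor(pv / S) + 1 == k
            return pv // s + 1 == k
        if m + 1 <= k <= 2 * m - 1:
            # Level 2 bin: floor((pv - S/2) / S) + 1 == k - m,
            # evaluated exactly in integers as floor((2*pv - S) / (2*S)).
            return (2 * pv - s) // (2 * s) + 1 == k - m
        return False

    return [[1 if member(pv) else 0 for pv in row] for row in image]


def delta_image(image, i, s, pv_max=255):
    """Delta image of Eq. (13): pixel-wise union of two overlapping bin images."""
    m = pv_max // s + 1
    if 1 <= i < m:
        partner = i + m        # Level 1 Delta bin
    elif m < i < 2 * m:
        partner = i - m + 1    # Level 2 Delta bin
    else:
        raise ValueError("Delta image is defined only for 1 <= i < m or m < i < 2m")
    a = bin_image(image, i, s, pv_max)
    b = bin_image(image, partner, s, pv_max)
    return [[x | y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


if __name__ == "__main__":
    img = [[10, 40, 50], [70, 200, 33]]
    print(bin_image(img, 2, 32))     # Level 1 bin 2 covers values 32..63
    print(bin_image(img, 10, 32))    # Level 2 bin 10 covers values 48..79
    print(delta_image(img, 2, 32))   # union covers values 32..79
```

With S = 32 the sketch gives m = 8, so \( {\Delta}_2^s={B}_2^s\cup {B}_{10}^s \) marks every pixel whose value lies between 32 and 79, which is one and a half bin widths.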

Proof: Let the minimum PixelValue in the region R be \( P_{min} \), attained at pixel coordinate \( (x_a, y_a) \), and let the maximum be \( P_{max} \), attained at pixel coordinate \( (x_b, y_b) \), so that \( P_{max}-P_{min}=D \).

Let

$$ k\stackrel{\scriptscriptstyle\mathrm{def}}{=}\left\lfloor \frac{P_{min}}{S}\right\rfloor +1 $$
(14)

From the definition of \( {\left({B}_k^s\right)}_{ij} \) in Eq. (12), \( {\left({B}_k^s\right)}_{x_a y_a}=1 \). All pixels in R with PixelValue equal to \( P_{min} \) therefore appear as positives (value = 1) in the binary image \( {B}_k^s \).

Now,

$$ {P}_{max}-{P}_{min}=D\Rightarrow {P}_{max}={P}_{min}+D $$

Thus, since \( D\le S \),

$$ \left\lfloor \frac{P_{max}}{S}\right\rfloor +1=\left\lfloor \frac{P_{min}+D}{S}\right\rfloor +1=\left\lfloor \frac{P_{min}}{S}+\frac{D}{S}\right\rfloor +1\le \left\lfloor \frac{P_{min}}{S}+1\right\rfloor +1=\left\lfloor \frac{P_{min}}{S}\right\rfloor +1+1=k+1 $$
$$ \therefore \kern2.25em \left\lfloor \frac{P_{max}}{S}\right\rfloor +1\le k+1 $$
(15)

Also, since \( P_{max}\ge P_{min} \),

$$ \left\lfloor \frac{P_{max}}{S}\right\rfloor +1\ge \left\lfloor \frac{P_{min}}{S}\right\rfloor +1=k $$
(16)

Thus, from Eq. (15) and Eq. (16), and since both k and \( \left\lfloor \frac{P_{max}}{S}\right\rfloor +1 \) are integers, we get

$$ \begin{array}{c}k\le \left\lfloor \frac{P_{max}}{S}\right\rfloor +1\le k+1\\ {}\Rightarrow \left\lfloor \frac{P_{max}}{S}\right\rfloor +1=k\kern0.5em \mathrm{OR}\kern0.5em \left\lfloor \frac{P_{max}}{S}\right\rfloor +1=k+1\end{array} $$
(17)

The pixels in foreground region R with value \( P_{max} \) will fall in the same bin as the pixels with value \( P_{min} \) (bin number k) or in the immediate next bin (bin number k + 1). Since the difference D is at most the size S of each bin, the region R is spread over a single bin or two adjacent bins, which gives the following cases:

$$ \mathbf{Case}\ \mathbf{1}:\left\lfloor \frac{P_{max}}{S}\right\rfloor +1=k $$

In this case, \( \left\lfloor \frac{P_{min}}{S}\right\rfloor +1=k \) from Eq. (14) and \( \left\lfloor \frac{P_{max}}{S}\right\rfloor +1=k \).

For any (i, j) ∈ R, because the floor function is non-decreasing,

$$ {P}_{min}\le PixelValue\left(i,j\right)\le {P}_{max} $$
$$ \Rightarrow \kern0.75em \left\lfloor \frac{P_{min}}{S}\right\rfloor +1\le \left\lfloor \frac{PixelValue\left(i,j\right)}{S}\right\rfloor +1\le \left\lfloor \frac{P_{max}}{S}\right\rfloor +1 $$
$$ \Rightarrow \kern2.5em k\le \left\lfloor \frac{PixelValue\left(i,j\right)}{S}\right\rfloor +1\le k $$
$$ \therefore \left\lfloor \frac{PixelValue\left(i,j\right)}{S}\right\rfloor +1=k $$

Both bounds equal k, so this is the only possible value.

$$ \Rightarrow {\left({B}_k^s\right)}_{ij}=1\ \mathrm{for}\ \mathrm{all}\ \left(i,j\right)\in R $$

Now,

$$ {\Delta}_k^s={B}_k^s\cup {B}_{k+m}^s $$

Since

$$ {\left({B}_k^s\right)}_{ij}=1\ \mathrm{for}\ \mathrm{all}\ \left(i,j\right)\in R $$
$$ \therefore {\left({\Delta}_k^s\right)}_{ij}=1\ \mathrm{for}\ \mathrm{all}\ \left(i,j\right)\in R $$

$$ \mathbf{Case}\ \mathbf{2}:\left\lfloor \frac{P_{max}}{S}\right\rfloor +1=k+1 $$
(18)

We have to consider two sub-cases depending on the value of \( P_{max} \).

$$ \mathbf{Case}\ \mathbf{2.1}:{P}_{max}-\left(k\ast S\right)<\frac{S}{2} $$
(19)

Let us consider bin number k + m, which is the Level 2 bin that overlaps with the right half of bin number k. Recall from the lemma statement that \( m=\left\lfloor \frac{PV_{max}}{S}\right\rfloor +1 \).

Now,

$$ k+m>m $$

so bin number k + m is a Level 2 bin.

From Eq. (19),

$$ {P}_{max}-\left(k\ast S\right)<\frac{S}{2}\kern0.75em \Rightarrow \kern0.75em \frac{{P}_{max}-\frac{S}{2}}{S}<k $$
(20)

and from Eq. (18), \( \left\lfloor \frac{P_{max}}{S}\right\rfloor +1=k+1 \), so

$$ \begin{array}{c}{P}_{max}\ge k\ast S\\ {}\Rightarrow {P}_{max}-\frac{S}{2}\ge k\ast S-\frac{S}{2}\\ {}\Rightarrow \frac{P_{max}-\frac{S}{2}}{S}\ge k-\frac{1}{2}>k-1\end{array} $$
(21)

From Eq. (20) and Eq. (21),

$$ \begin{array}{c}k-1<\frac{P_{max}-\frac{S}{2}}{S}<k\\ {}\Rightarrow \left\lfloor \frac{P_{max}-\frac{S}{2}}{S}\right\rfloor =k-1\\ {}\Rightarrow \left\lfloor \frac{P_{max}-\frac{S}{2}}{S}\right\rfloor +1=k\end{array} $$
(22)

From the definition of \( {\left({B}_k^s\right)}_{ij} \) in Eq. (12), we get:

$$ {\left({B}_{k+m}^s\right)}_{x_b y_b}=1 $$
(23)

Bin k covers the values from (k − 1) ∗ S up to, but not including, k ∗ S, and bin k + m covers the values from \( \left(k-1\right)\ast S+\frac{S}{2} \) up to, but not including, \( k\ast S+\frac{S}{2} \):

$$ \begin{array}{c}\left(k-1\right)\ast S\le Range\ of\ Bin\ k<k\ast S\\ {}\left(k-1\right)\ast S+\frac{S}{2}\le Range\ of\ Bin\ \left(k+m\right)<k\ast S+\frac{S}{2}\\ {}\Rightarrow \left(\boldsymbol{k}-\mathbf{1}\right)\ast \boldsymbol{S}\le \boldsymbol{Range}\ \boldsymbol{of}\ \boldsymbol{Bin}\ \boldsymbol{k}\cup \boldsymbol{Range}\ \boldsymbol{of}\ \boldsymbol{Bin}\ \left(\boldsymbol{k}+\boldsymbol{m}\right)<\boldsymbol{k}\ast \boldsymbol{S}+\frac{\boldsymbol{S}}{\mathbf{2}}\end{array} $$
(24)

From definition of \( {\Delta}_k^s \):

$$ {\Delta}_k^s={\mathrm{B}}_k^s\cup {\mathrm{B}}_{k+\left\lfloor \frac{PV_{max}}{S}\right\rfloor \kern0.5em +1}^s={\mathrm{B}}_k^s\cup {\mathrm{B}}_{k+m}^s $$

Thus, from Eq. (24)

$$ \left(\boldsymbol{k}-\mathbf{1}\right)\ast \boldsymbol{S}\le \boldsymbol{Range}\ \boldsymbol{of}\ {\boldsymbol{\Delta}}_{\boldsymbol{k}}^{\boldsymbol{s}}<\boldsymbol{k}\ast \boldsymbol{S}+\frac{\boldsymbol{S}}{\mathbf{2}} $$
(25)

Rearranging Eq. (19) we get

$$ {P}_{max}<k\ast S+\frac{S}{2} $$
(26)

And because \( P_{min} \) lies in bin k by Eq. (14):

$$ \left(k-1\right)\ast S\le {P}_{min} $$
(27)

Now, any PixelValue in region R lies between \( P_{min} \) and \( P_{max} \).

From Eq. (26) and Eq. (27) we get:

$$ \left(\boldsymbol{k}-\mathbf{1}\right)\ast \boldsymbol{S}\le {\boldsymbol{P}}_{\boldsymbol{min}}\le \boldsymbol{PixelValue}\left(\boldsymbol{i},\boldsymbol{j}\right)\le {\boldsymbol{P}}_{\boldsymbol{max}}<\boldsymbol{k}\ast \boldsymbol{S}+\frac{\boldsymbol{S}}{\mathbf{2}} $$
(28)

From Eqs (25) and (28), we thus prove that \( {\left({\boldsymbol{\Delta}}_{\boldsymbol{k}}^{\boldsymbol{s}}\right)}_{\boldsymbol{ij}}=\mathbf{1} \) for all (i, j) ∈ R

$$ \mathbf{Case}\ \mathbf{2.2}:{P}_{max}-\left(k\ast S\right)\ge \frac{S}{2} $$
(29)

Since \( {P}_{min}={P}_{max}-D \) and D ≤ S,

$$ {P}_{min}\ge {P}_{max}-S\ge \left(k\ast S\right)+\frac{S}{2}-S=\left(k-1\right)\ast S+\frac{S}{2} $$

Hence,

$$ {P}_{min}\ge \left(k-1\right)\ast S+\frac{S}{2} $$
(30)

And from Eq. (14):

$$ k=\left\lfloor \frac{P_{min}}{S}\right\rfloor +1\Rightarrow {P}_{min}<k\ast S $$
(31)

Hence, from Eq. (30) and Eq. (31) we get:

$$ \left(k-1\right)\ast S+\frac{S}{2}\le {P}_{min}<k\ast S $$
(32)

Subtracting \( \frac{S}{2} \), we get

$$ \left(k-1\right)\ast S+\frac{S}{2}-\frac{S}{2}\le {P}_{min}-\frac{S}{2}<k\ast S-\frac{S}{2} $$
(33)

Dividing by S,

$$ \Rightarrow \kern0.5em \left(k-1\right)\le \frac{P_{min}-\frac{S}{2}}{S}<k-\frac{1}{2}<k $$
(34)
$$ k-1\le \frac{P_{min}-\frac{S}{2}}{S}<k $$
(35)
$$ \begin{array}{c}\Rightarrow \left\lfloor \frac{P_{min}-\frac{S}{2}}{S}\right\rfloor =k-1\\ {}\Rightarrow \left\lfloor \frac{P_{min}-\frac{S}{2}}{S}\right\rfloor +1=k\kern1em \Rightarrow \left\lfloor \frac{P_{min}-\frac{S}{2}}{S}\right\rfloor +1=\left(k+m\right)-m\end{array} $$
(36)

From the definition of \( {\left({B}_k^s\right)}_{ij} \) in Eq. (12): \( {\left({B}_{k+m}^s\right)}_{x_a y_a}=1 \), since \( P_{min} \) is attained at \( (x_a, y_a) \).

Bin k + 1 covers the values from k ∗ S up to, but not including, (k + 1) ∗ S, and bin k + m covers the values from \( \left(k-1\right)\ast S+\frac{S}{2} \) up to, but not including, \( k\ast S+\frac{S}{2} \):

$$ k\ast S\le Range\ of\ Bin\ \left(k+1\right)<\left(k+1\right)\ast S $$
$$ \left(k-1\right)\ast S+\frac{S}{2}\le Range\ of\ Bin\ \left(k+m\right)<k\ast S+\frac{S}{2} $$
$$ \Rightarrow \left(\boldsymbol{k}-\mathbf{1}\right)\ast \boldsymbol{S}+\frac{\boldsymbol{S}}{\mathbf{2}}\le \boldsymbol{Range}\ \boldsymbol{of}\ \boldsymbol{Bin}\ \left(\boldsymbol{k}+\mathbf{1}\right)\cup \boldsymbol{Range}\ \boldsymbol{of}\ \boldsymbol{Bin}\ \left(\boldsymbol{k}+\boldsymbol{m}\right)<\left(\boldsymbol{k}+\mathbf{1}\right)\ast \boldsymbol{S} $$

And from the definition of \( {\Delta}_{k+m}^s \):

$$ {\Delta}_{k+m}^s={B}_{k+m}^s\cup {B}_{k+m-m+1}^s={B}_{k+m}^s\cup {B}_{k+1}^s $$

Thus,

$$ \left(\boldsymbol{k}-\mathbf{1}\right)\ast \boldsymbol{S}+\frac{\boldsymbol{S}}{\mathbf{2}}\le \boldsymbol{Range}\ \boldsymbol{of}\ {\boldsymbol{\Delta}}_{\boldsymbol{k}+\boldsymbol{m}}^{\boldsymbol{s}}<\left(\boldsymbol{k}+\mathbf{1}\right)\ast \boldsymbol{S} $$
(37)

From Eq. (18) we have

$$ k\ast S\le {P}_{max}<\left(k+1\right)\ast S $$
(38)

And for \( P_{min} \), from Eq. (30) we get

$$ \left(k-1\right)\ast S+\frac{S}{2}\le {P}_{min} $$

Now, any PixelValue in region R lies between \( P_{min} \) and \( P_{max} \):

$$ \left(\boldsymbol{k}-\mathbf{1}\right)\ast \boldsymbol{S}+\frac{\boldsymbol{S}}{\mathbf{2}}\le {\boldsymbol{P}}_{\boldsymbol{min}}\le \boldsymbol{PixelValue}\left(\boldsymbol{i},\boldsymbol{j}\right)\le {\boldsymbol{P}}_{\boldsymbol{max}}<\left(\boldsymbol{k}+\mathbf{1}\right)\ast \boldsymbol{S} $$
(39)

From Eqs. (37) and (39), we thus prove that \( {\left({\boldsymbol{\Delta}}_{\boldsymbol{k}+\boldsymbol{m}}^{\boldsymbol{s}}\right)}_{\boldsymbol{ij}}=\mathbf{1} \) for all (i, j) ∈ R.

Thus, the lemma holds in all cases.
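The lemma can also be checked numerically. The sketch below is a brute-force sanity check, not a substitute for the proof: for a given bin size S it tries every integer pair \( P_{min}\le P_{max} \) with \( P_{max}-P_{min}\le S \) and looks for a Delta bin that contains the whole range of values. The helper names (`in_bin`, `in_delta`, `covering_delta`, `lemma_holds`) are hypothetical, pixel values are assumed to be integers, and \( PV_{max}=255 \) is an assumption that this appendix does not state.

```python
def in_bin(pv, k, s, m):
    """True if pixel value pv is a positive in bin image B_k^s (Eq. 12)."""
    if 1 <= k <= m:
        return pv // s + 1 == k
    if m + 1 <= k <= 2 * m - 1:
        # floor((pv - S/2) / S) computed exactly with integers
        return (2 * pv - s) // (2 * s) + 1 == k - m
    return False


def in_delta(pv, i, s, m):
    """True if pixel value pv is a positive in Delta image i (Eq. 13)."""
    if 1 <= i < m:
        return in_bin(pv, i, s, m) or in_bin(pv, i + m, s, m)
    if m < i < 2 * m:
        return in_bin(pv, i, s, m) or in_bin(pv, i - m + 1, s, m)
    return False  # no Delta image is defined for other indices


def covering_delta(p_min, p_max, s, pv_max=255):
    """Smallest index i whose Delta image contains every value in
    [p_min, p_max], or None if no such Delta image exists."""
    m = pv_max // s + 1
    for i in range(1, 2 * m):
        if all(in_delta(pv, i, s, m) for pv in range(p_min, p_max + 1)):
            return i
    return None


def lemma_holds(s, pv_max=255):
    """Check the lemma for every integer range of width D <= S."""
    return all(
        covering_delta(p_min, p_max, s, pv_max) is not None
        for p_min in range(pv_max + 1)
        for p_max in range(p_min, min(p_min + s, pv_max) + 1)
    )


if __name__ == "__main__":
    print(covering_delta(40, 60, 32))   # Case 1: both extremes in bin k = 2
    print(covering_delta(50, 70, 32))   # Case 2.1: P_max - k*S = 6 < S/2
    print(covering_delta(50, 82, 32))   # Case 2.2: P_max - k*S = 18 >= S/2
    print(lemma_holds(32))
```

With S = 32 (so m = 8 and k = 2 in all three examples), the first two calls select \( {\Delta}_2^s \) and the third selects \( {\Delta}_{10}^s={\Delta}_{k+m}^s \), matching Cases 1, 2.1 and 2.2 of the proof. A range wider than S, such as 40 to 100, is not covered by any single Delta bin, which is why the condition S ≥ D is needed.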


Cite this article

Dutta, I.N., Chakraborty, N., Mollah, A.F. et al. BOB: a bi-level overlapped binning procedure for scene word binarization. Multimed Tools Appl 80, 7609–7635 (2021). https://doi.org/10.1007/s11042-020-09785-7


  • DOI: https://doi.org/10.1007/s11042-020-09785-7
