Abstract
Document layout analysis (DLA) is an essential prerequisite for building a comprehensive document image processing and analysis system. The main purpose of DLA is to segment an input document image into its constituent, coherent regions and to identify the class of each region. In this paper, we propose a DLA system, named BINYAS, that combines connected component (CC) analysis with pixel analysis. The system first identifies the regions and then classifies each of them into categories such as paragraph, separator, graphic, image, table, chart, and inverted text. The proposed system is evaluated on four publicly available standard datasets, namely the ICDAR 2009, 2015, 2017 and 2019 page segmentation competition datasets, and its performance is compared with that of many contemporary methods, including some well-known software products and deep learning based methods. Experimental results show that our method performs significantly better than state-of-the-art methods in terms of the evaluation metrics used by the research community in this domain.
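As a rough illustration of the connected component step mentioned in the abstract, the sketch below labels 8-connected foreground regions in a binarized page and returns their bounding boxes. It is not the BINYAS implementation: the function names `connected_components` and `classify_box`, the `separator_aspect` threshold, and the single aspect-ratio rule are all assumptions made for this example, and a real system would use many more features and classes.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of 8-connected regions.

    `image` is a list of rows of 0/1 values, where 1 marks a foreground pixel.
    Boxes are returned in row-major order of each region's first pixel.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from the first unseen foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def classify_box(box, separator_aspect=10.0):
    """Toy rule (an assumption, not from the paper): long thin boxes are separators."""
    top, left, bottom, right = box
    height = bottom - top + 1
    width = right - left + 1
    aspect = max(width, height) / min(width, height)
    return "separator" if aspect >= separator_aspect else "other"


# Hypothetical 4x12 binarized page: a small blob and a horizontal rule.
page = [
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]
for box in connected_components(page):
    print(box, classify_box(box))
```

The pure-Python flood fill keeps the example dependency-free; on real page images a library routine for connected component labeling would be far faster.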
Cite this article
Bhowmik, S., Kundu, S. & Sarkar, R. BINYAS: a complex document layout analysis system. Multimed Tools Appl 80, 8471–8504 (2021). https://doi.org/10.1007/s11042-020-09832-3
DOI: https://doi.org/10.1007/s11042-020-09832-3