MultiDIAS: A Hierarchical Multi-layered Document Image Annotation System

Poddar, Arnab; Mukherjee, Rohan; Mukhopadhyay, Jayanta; Biswas, Prabir Kumar

doi:10.1007/978-981-13-9361-7_1

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1020)

Included in the following conference series:

Workshop on Document Analysis and Recognition

Abstract

The content of document images often exhibits a hierarchical, multi-layered tree structure. Further, algorithms for document image applications such as line detection, paragraph detection, word recognition, and layout analysis require pixel-level annotation. This paper introduces a Multi-layered Document Image Annotation System (MultiDIAS), which provides a single platform for both hierarchical and pixel-level annotation of documents. MultiDIAS labels the document image in four hierarchical layers (layout type, entity type, line type, word type) assigned by the user. The output consists of four ground-truth images and an XML file containing the metadata. MultiDIAS is tested on a complex handwritten manuscript written by the renowned film director Satyajit Ray for the movie ‘Goopi Gyne Bagha Byne’. The annotated data generated using MultiDIAS can further be used in a wide range of document image understanding and analysis applications.
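
As a rough illustration of the kind of output the abstract describes (one ground-truth image per layer plus an XML metadata file), the following is a minimal sketch and not the authors' implementation. The layer keys, the XML element names, the `regions` structure, and the use of an even-odd ray-casting test to fill user-drawn polygons are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical keys for the four layers named in the abstract.
LAYERS = ("layout", "entity", "line", "word")


def point_in_polygon(x, y, polygon):
    """Even-odd ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(regions, width, height):
    """Return one label image (list of rows) per layer; 0 means unlabeled."""
    masks = {layer: [[0] * width for _ in range(height)] for layer in LAYERS}
    for region in regions:
        for y in range(height):
            for x in range(width):
                # Test the pixel centre against the annotated polygon.
                if point_in_polygon(x + 0.5, y + 0.5, region["polygon"]):
                    for layer in LAYERS:
                        masks[layer][y][x] = region["labels"][layer]
    return masks


def to_xml(regions, image_name):
    """Serialize region labels and outlines to an (assumed) XML layout."""
    root = ET.Element("document", image=image_name)
    for idx, region in enumerate(regions):
        node = ET.SubElement(root, "region", id=str(idx))
        for layer in LAYERS:
            ET.SubElement(node, layer).text = str(region["labels"][layer])
        points = " ".join(f"{x},{y}" for x, y in region["polygon"])
        ET.SubElement(node, "polygon").text = points
    return ET.tostring(root, encoding="unicode")


# One rectangular region carrying a label in each of the four layers.
regions = [{
    "polygon": [(1, 1), (5, 1), (5, 3), (1, 3)],
    "labels": {"layout": 1, "entity": 2, "line": 1, "word": 3},
}]
masks = rasterize(regions, width=8, height=4)
xml_text = to_xml(regions, "page_001.png")
```

Here every pixel inside a region receives that region's label in all four layer images, so the images stay aligned pixel for pixel while the XML keeps the region outlines and labels in a form that is easy to parse. A real tool would likely store the tree relations between regions as well, which this sketch omits.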

Acknowledgement

This research was partially supported and funded by IMPRINT, Government of India, through the research project titled “Information Access from Document Images of Indian Languages”.

Author information

Authors and Affiliations

Indian Institute of Technology Kharagpur, Kharagpur, India
Arnab Poddar, Rohan Mukherjee, Jayanta Mukhopadhyay & Prabir Kumar Biswas

Corresponding author

Correspondence to Arnab Poddar.

Editor information

Editors and Affiliations

Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, India
Suresh Sundaram
Computer Science and Engineering, Indian Institute of Technology Jodhpur, Karwar, Rajasthan, India
Gaurav Harit

About this paper

Cite this paper

Poddar, A., Mukherjee, R., Mukhopadhyay, J., Biswas, P.K. (2019). MultiDIAS: A Hierarchical Multi-layered Document Image Annotation System. In: Sundaram, S., Harit, G. (eds) Document Analysis and Recognition. DAR 2018. Communications in Computer and Information Science, vol 1020. Springer, Singapore. https://doi.org/10.1007/978-981-13-9361-7_1

DOI: https://doi.org/10.1007/978-981-13-9361-7_1
Published: 05 July 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9360-0
Online ISBN: 978-981-13-9361-7
eBook Packages: Computer Science, Computer Science (R0)
