Abstract
Content of the document images are often shows hierarchical multi-layered tree structure. Further, the algorithms for document image applications like line detection, paragraph detection, word recognition, layout analysis etc. require pixel level annotation. In this paper, a Multi-layered Document Image Annotation System (MultiDIAS) has been introduced. The proposed system simultaneously provide a platform for hierarchical and pixel level annotation of document. MultiDIAS label the document image in four hierarchical layers (layout type, entity type, line type, word type) assigned by the user. The output generated are four ground-truth images and an XML file representing the metadata information. The MultiDIAS is tested on a complex handwritten manuscript written by renowned film director Satyajit Ray for the movie ‘Goopi Gyne Bagha Byne’. This annotated data generated using MultiDIAS can further be used in a wide range of applications of document image understanding and analysis.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Bhowmik, S., Sarkar, R., Nasipuri, M., Doermann, D.: Text and non-text separation in offline document images: a survey. Int. J. Doc. Anal. Recognit. (IJDAR) 21(1–2), 1–20 (2018)
Chaudhuri, B., Pal, U.: Skew angle detection of digitized indian script documents. IEEE Trans. Pattern Anal. Mach. Intell. 19(2), 182–186 (1997)
Chen, K., Seuret, M., Wei, H., Liwicki, M., Hennebert, J., Ingold, R.: Ground truth model, tool, and dataset for layout analysis of historical documents. In: Document Recognition and Retrieval XXII, vol. 9402, p. 940204. International Society for Optics and Photonics (2015)
Dey, S., Mukherjee, J., Sural, S., Nandedkar, A.V.: Anveshak - a groundtruth generation tool for foreground regions of document images. In: Mukherjee, S., Mukherjee, S., Mukherjee, D.P., Sivaswamy, J., Awate, S., Setlur, S., Namboodiri, A.M., Chaudhury, S. (eds.) ICVGIP 2016. LNCS, vol. 10481, pp. 255–264. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68124-5_22
Doermann, D., Zotkina, E., Li, H.: GEDI-a groundtruthing environment for document images. In: Ninth IAPR International Workshop on Document Analysis Systems (DAS 2010). Citeseer (2010)
Douglas, D.H., Peucker, T.K.: Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartographica: Int. J. Geog. Inf. Geovisualization 10(2), 112–122 (1973)
Gonzalez, R.C., Woods, R.E., et al.: Digital image processing (2002)
Hormann, K., Agathos, A.: The point in polygon problem for arbitrary polygons. Comput. Geometry 20(3), 131–144 (2001)
Lee, C.H., Kanungo, T.: The architecture of trueviz: a groundtruth/metadata editing and visualizing toolkit. Pattern Recognit. 36(3), 811–825 (2003)
Moll, M.A., Baird, H.S., An, C.: Truthing for pixel-accurate segmentation. In: The Eighth IAPR International Workshop on Document Analysis Systems, pp. 379–385. IEEE (2008)
Pal, U., Chaudhuri, B.: Indian script character recognition: a survey. Pattern Recognit. 37(9), 1887–1899 (2004)
Saleh, Z., Zhang, K., Calvo-Zaragoza, J., Vigliensoni, G., Fujinaga, I.: Pixel. js: web-based pixel classification correction platform for ground truth creation. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 2, pp. 39–40. IEEE (2017)
Saund, E., Lin, J., Sarkar, P.: Pixlabeler: user interface for pixel-level labeling of elements in document images. In: 10th International Conference on Document Analysis and Recognition, ICDAR 2009, pp. 646–650. IEEE (2009)
Shafait, F., Keysers, D., Breuel, T.M.: Pixel-accurate representation and evaluation of page segmentation in document images, pp. 872–875. IEEE (2006)
Strecker, T., Van Beusekom, J., Albayrak, S., Breuel, T.M.: Automated ground truth data generation for newspaper document images. In: 10th International Conference on Document Analysis and Recognition, ICDAR 2009, pp. 1275–1279. IEEE (2009)
Suzuki, S., et al.: Topological structural analysis of digitized binary images by border following. Comput. Vis. Graph. Image Process. 30(1), 32–46 (1985)
Thoma, G.: Ground truth data for document image analysis. In: Symposium on Document Image Understanding and Technology (SDIUT), pp. 199–205 (2003)
Wenyin, L., Dori, D.: A protocol for performance evaluation of line detection algorithms. Mach. Vis. Appl. 9(5–6), 240–250 (1997)
Yacoub, S., Saxena, V., Sami, S.N.: Perfectdoc: a ground truthing environment for complex documents. In: Proceedings of the Eighth International Conference on Document Analysis and Recognition, pp. 452–456. IEEE (2005)
Yang, L., Huang, W., Tan, C.L.: Semi-automatic ground truth generation for chart image recognition. In: Bunke, H., Spitz, A.L. (eds.) DAS 2006. LNCS, vol. 3872, pp. 324–335. Springer, Heidelberg (2006). https://doi.org/10.1007/11669487_29
Yanikoglu, B.A., Vincent, L.: Pink panther: a complete environment for ground-truthing and benchmarking document page segmentation. Pattern Recognit. 31(9), 1191–1204 (1998)
Acknowledgement
This research was partially supported and funded by IMPRINT, Government of India, through the research project titled “Information Access from Document Images of Indian Languages”.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Poddar, A., Mukherjee, R., Mukhopadhyay, J., Biswas, P.K. (2019). MultiDIAS: A Hierarchical Multi-layered Document Image Annotation System. In: Sundaram, S., Harit, G. (eds) Document Analysis and Recognition. DAR 2018. Communications in Computer and Information Science, vol 1020. Springer, Singapore. https://doi.org/10.1007/978-981-13-9361-7_1
Download citation
DOI: https://doi.org/10.1007/978-981-13-9361-7_1
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9360-0
Online ISBN: 978-981-13-9361-7
eBook Packages: Computer ScienceComputer Science (R0)