MultiDIAS: A Hierarchical Multi-layered Document Image Annotation System

  • Conference paper
Document Analysis and Recognition (DAR 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1020)

Abstract

The content of document images often exhibits a hierarchical, multi-layered tree structure. Moreover, algorithms for document image applications such as line detection, paragraph detection, word recognition, and layout analysis require pixel-level annotation. This paper introduces a Multi-layered Document Image Annotation System (MultiDIAS). The proposed system provides a single platform for both hierarchical and pixel-level annotation of documents. MultiDIAS labels the document image in four hierarchical layers (layout type, entity type, line type, word type), with labels assigned by the user. The output consists of four ground-truth images and an XML file containing the metadata. MultiDIAS is tested on a complex handwritten manuscript written by the renowned film director Satyajit Ray for the movie ‘Goopi Gyne Bagha Byne’. The annotated data generated using MultiDIAS can be used in a wide range of document image understanding and analysis applications.
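
To make the described output concrete, the following is a minimal sketch of how a four-layer annotation tree might be turned into per-layer label images and an XML metadata file. It is not the authors' implementation, which this preview does not show. The `Region` class, the `rasterize` and `to_xml` functions, the XML element and attribute names, and the assumption of one ground-truth image per layer are all hypothetical; only the four layer names come from the abstract.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Layer names follow the abstract; everything else is a hypothetical illustration.
LAYERS = ("layout", "entity", "line", "word")


@dataclass
class Region:
    layer: str      # one of LAYERS
    label: int      # user-assigned class id (0 is reserved for background)
    polygon: list   # [(x, y), ...] vertices in pixel coordinates
    children: list = field(default_factory=list)


def point_in_polygon(x, y, polygon):
    """Even-odd ray-casting test for a point against a simple polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(root, width, height):
    """Return one label image (a list of rows) per layer (assumed layout)."""
    images = {name: [[0] * width for _ in range(height)] for name in LAYERS}
    stack = [root]
    while stack:
        region = stack.pop()
        image = images[region.layer]
        for y in range(height):
            for x in range(width):
                # Sample each pixel at its centre.
                if point_in_polygon(x + 0.5, y + 0.5, region.polygon):
                    image[y][x] = region.label
        stack.extend(region.children)
    return images


def to_xml(region):
    """Serialise the annotation tree as nested XML elements (hypothetical schema)."""
    elem = ET.Element(region.layer, label=str(region.label))
    elem.set("points", " ".join(f"{x},{y}" for x, y in region.polygon))
    for child in region.children:
        elem.append(to_xml(child))
    return elem


# Toy 8x4 page: one word inside one line inside one entity inside one layout block.
word = Region("word", 4, [(1, 1), (3, 1), (3, 2), (1, 2)])
line = Region("line", 3, [(1, 1), (5, 1), (5, 2), (1, 2)], [word])
entity = Region("entity", 2, [(0, 0), (6, 0), (6, 3), (0, 3)], [line])
page = Region("layout", 1, [(0, 0), (8, 0), (8, 4), (0, 4)], [entity])

images = rasterize(page, 8, 4)
xml_text = ET.tostring(to_xml(page), encoding="unicode")
```

Nesting the regions mirrors the tree structure the abstract describes, while keeping a separate label image per layer lets a pixel carry a layout, entity, line, and word label at the same time. A real tool would rasterize with an image library rather than a per-pixel loop.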

Acknowledgement

This research was partially supported and funded by IMPRINT, Government of India, through the research project titled “Information Access from Document Images of Indian Languages”.

Author information

Corresponding author

Correspondence to Arnab Poddar.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Poddar, A., Mukherjee, R., Mukhopadhyay, J., Biswas, P.K. (2019). MultiDIAS: A Hierarchical Multi-layered Document Image Annotation System. In: Sundaram, S., Harit, G. (eds) Document Analysis and Recognition. DAR 2018. Communications in Computer and Information Science, vol 1020. Springer, Singapore. https://doi.org/10.1007/978-981-13-9361-7_1

  • DOI: https://doi.org/10.1007/978-981-13-9361-7_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-9360-0

  • Online ISBN: 978-981-13-9361-7

  • eBook Packages: Computer Science (R0)
