DOI: 10.1145/3599957.3606219
research-article

Weakly-supervised Semantic Segmentation on Historical Document Images

Published: 29 August 2023

ABSTRACT

Weakly-Supervised Semantic Segmentation (WSSS) has been widely studied as a way to reduce the expensive annotation costs of deep semantic segmentation models. While numerous studies have proposed novel approaches for generating pseudo-labels and demonstrated their effectiveness, their domain is limited to natural scene images. There is a growing need for a comprehensive exploration of WSSS in the domain of document images, given the increasing number of digitized historical document images and the growing importance of document image segmentation for successful information retrieval. Importantly, document images possess inherent characteristics that distinguish them from natural scene images, rendering conventional image-level labels unsuitable. Consequently, recent WSSS frameworks designed for natural scene images are of limited use in this domain. In this work, we propose a simple yet effective pseudo-label generation technique using content-adaptive geometric feature analysis. This approach enables the training of a segmentation model in a weakly-supervised manner without relying on image-level labels. Our method utilizes a Gravity-map, which can highlight potential regions of interest without requiring a priori knowledge, serving as an initial coarse pixel-level label. The Gravity-map is subsequently refined through simple binarization and noise removal to form a pseudo-label. Finally, a segmentation model is trained using the generated pseudo-labels. Experimental results on a publicly available historical document collection demonstrate that the proposed pseudo-label generation technique offers a viable option for training a semantic segmentation model in the document image domain.
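
The refinement step described in the abstract (coarse map, then binarization, then noise removal) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the Gravity-map is computed, which binarization rule is used, or how noise is removed. The sketch assumes the coarse map is already available as rows of scores in [0, 1], uses a fixed threshold, and treats small 4-connected foreground components as noise. The function names and the `threshold` and `min_area` parameters are hypothetical.

```python
from collections import deque


def binarize(coarse_map, threshold=0.5):
    """Turn a coarse score map (rows of floats) into a 0/1 mask.

    Assumption: a fixed global threshold; the paper's rule is not given here.
    """
    return [[1 if v >= threshold else 0 for v in row] for row in coarse_map]


def remove_small_components(mask, min_area=3):
    """Zero out 4-connected foreground components with fewer than min_area pixels.

    Assumption: area filtering stands in for the unspecified noise removal.
    """
    h = len(mask)
    w = len(mask[0]) if h else 0
    out = [row[:] for row in mask]
    seen = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if out[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            component = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and out[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) < min_area:
                for y, x in component:
                    out[y][x] = 0
    return out


def make_pseudo_label(coarse_map, threshold=0.5, min_area=3):
    """Coarse map -> binarized mask -> denoised pseudo-label."""
    return remove_small_components(binarize(coarse_map, threshold), min_area)


# Toy coarse map: one 2x2 high-score region plus an isolated noisy pixel.
coarse = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.9, 0.0, 0.6],
    [0.1, 0.0, 0.0, 0.0],
]
pseudo_label = make_pseudo_label(coarse)
```

In this toy input the 2x2 block survives, while the isolated pixel at row 1, column 3 is discarded as noise. In practice one would more likely use array-based tools for the same two steps; the pure-Python version only makes the logic explicit.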


          Published in

          RACS '23: Proceedings of the 2023 International Conference on Research in Adaptive and Convergent Systems
          August 2023
          251 pages
          ISBN: 9798400702280
          DOI: 10.1145/3599957

          Copyright © 2023 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Qualifiers

          • research-article
          • Research
          • Refereed limited

          Acceptance Rates

          Overall Acceptance Rate: 393 of 1,581 submissions, 25%