Weakly-supervised Semantic Segmentation on Historical Document Images

Authors:
Chulwoo Pack, South Dakota State University, Brookings, South Dakota (ORCID: 0000-0002-9876-4388)
Dongyoun Kim, Iowa State University, Ames, Iowa (ORCID: 0000-0002-7476-747X)

RACS '23: Proceedings of the 2023 International Conference on Research in Adaptive and Convergent Systems, August 2023, Article No.: 41, Pages 1–6
https://doi.org/10.1145/3599957.3606219

Published: 29 August 2023

ABSTRACT

Weakly-Supervised Semantic Segmentation (WSSS) has been widely studied as a way to reduce the expensive annotation costs of deep semantic segmentation models. While numerous studies have proposed novel approaches for generating pseudo-labels and demonstrated their effectiveness, their scope is limited to natural scene images. There is a growing need for a comprehensive exploration of WSSS in the domain of document images, given the increasing number of digitized historical document images and the growing importance of document image segmentation for successful information retrieval. Importantly, document images possess inherent characteristics that distinguish them from natural scene images and that render conventional image-level labels unsuitable. Consequently, recent WSSS frameworks designed for natural scene images are of limited use in this domain. In this work, we propose a simple yet effective pseudo-label generation technique using content-adaptive geometric feature analysis. This approach enables the training of a segmentation model in a weakly-supervised manner without relying on image-level labels. Our method utilizes a Gravity-map, which can highlight potential regions of interest without requiring a priori knowledge and serves as an initial coarse pixel-level label. The Gravity-map is subsequently refined through simple binarization and noise removal to form a pseudo-label. Finally, a segmentation model is trained using the generated pseudo-labels. Experimental results on a publicly available historical document collection demonstrate that the proposed pseudo-label generation technique offers a viable option for training a semantic segmentation model in the document image domain.
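
The refinement step described in the abstract (binarization followed by noise removal) can be illustrated with a minimal sketch. The abstract does not specify how the Gravity-map is computed, which binarization rule is used, or how noise is removed, so everything below is an assumption made for illustration: the coarse map is taken as a given grid of values in [0, 1], binarization is a fixed threshold, and noise removal drops small 4-connected components. The names `binarize`, `remove_small_components`, `make_pseudo_label`, `threshold`, and `min_area` are hypothetical and do not come from the paper.

```python
from collections import deque


def binarize(gravity_map, threshold=0.5):
    """Turn a coarse map with values in [0, 1] into a 0/1 mask.

    Assumption: a fixed global threshold; the paper's rule may differ.
    """
    return [[1 if v >= threshold else 0 for v in row] for row in gravity_map]


def remove_small_components(mask, min_area=3):
    """Drop 4-connected foreground components smaller than min_area pixels."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    cleaned = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first search collects one connected component.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Keep the component only if it is large enough to count as a region.
            if len(component) >= min_area:
                for cy, cx in component:
                    cleaned[cy][cx] = 1
    return cleaned


def make_pseudo_label(gravity_map, threshold=0.5, min_area=3):
    """Coarse map -> binary mask -> mask with small specks removed."""
    return remove_small_components(binarize(gravity_map, threshold), min_area)


# Toy 4x4 map: one 2x2 high-response block and one isolated speck.
toy_map = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.9, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.0],
]
pseudo_label = make_pseudo_label(toy_map)
for row in pseudo_label:
    print(row)
```

In this toy input the 2x2 block survives and the isolated pixel is discarded. A practical pipeline would more likely operate on image arrays with an image-processing library, and the resulting masks would then serve as training targets for the segmentation model.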

Index Terms

1. Applied computing
  1. Document management and text processing
    1. Document capture
      1. Document analysis
2. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Image segmentation
  2. Machine learning
    1. Learning paradigms

Published in

RACS '23: Proceedings of the 2023 International Conference on Research in Adaptive and Convergent Systems
August 2023
251 pages
ISBN: 9798400702280
DOI: 10.1145/3599957

Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Weakly-supervised Semantic Segmentation
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 393 of 1,581 submissions, 25%