A method for document image enhancement to improve template-based classification

Authors:

Tingting Zhang

HPCCT & BDAI '20: Proceedings of the 2020 4th High Performance Computing and Cluster Technologies Conference & 2020 3rd International Conference on Big Data and Artificial Intelligence

Pages 87 - 91

https://doi.org/10.1145/3409501.3409531

Published: 25 August 2020

Abstract

Document classification is a significant procedure in paper document recognition. This article proposes a document image enhancement method that improves classification performance in convolutional neural networks (CNNs). An enhanced document image is generated by extracting the table frame, text region, and shape of the raw document. A template-based classification experiment on 414 customs documents and more than one thousand generated images showed that the enhanced images helped a CNN model achieve higher accuracy than the original images. The enhancement also diminished the interference of noise and unrelated features in document classification, improving the robustness of the networks. The proposed method further demonstrated that, in deep neural networks, the channels of an image can carry information other than color. Because whole-image classification tasks are similar to one another, this conclusion might provide ideas for training neural networks in other fields, such as street view recognition and medical image recognition.
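The channel idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the table frame, text region, or shape are extracted, so everything here is an assumption made for illustration. Long straight runs of dark pixels stand in for the table frame, the remaining dark pixels for the text region, and the full ink mask for the document shape. The names `long_runs`, `enhance`, `dark_threshold`, and `min_line_len` are hypothetical.

```python
import numpy as np


def long_runs(mask: np.ndarray, min_len: int, axis: int) -> np.ndarray:
    """Mark pixels lying in a run of >= min_len consecutive True values along `axis`."""
    m = np.moveaxis(mask, axis, -1).astype(np.int32)
    n = m.shape[-1]
    out = np.zeros(m.shape, dtype=bool)
    if n >= min_len:
        # Prefix sums let us test every window of length min_len at once.
        c = np.cumsum(np.pad(m, [(0, 0), (1, 0)]), axis=-1)
        full = (c[..., min_len:] - c[..., :-min_len]) == min_len
        # A fully dark window starting at s covers pixels s .. s + min_len - 1.
        for k in range(min_len):
            out[..., k:k + full.shape[-1]] |= full
    return np.moveaxis(out, -1, axis)


def enhance(gray: np.ndarray, dark_threshold: int = 128, min_line_len: int = 20) -> np.ndarray:
    """Pack three structural masks into the channels of one image.

    Assumed stand-ins (not taken from the paper):
      channel 0: table frame = long horizontal or vertical dark runs
      channel 1: text region = dark pixels that are not part of the frame
      channel 2: shape       = all dark pixels
    """
    ink = gray < dark_threshold
    frame = long_runs(ink, min_line_len, axis=1) | long_runs(ink, min_line_len, axis=0)
    text = ink & ~frame
    shape = ink
    return np.stack([frame, text, shape], axis=-1).astype(np.uint8) * 255


# Synthetic page: one ruled line and one small text-like blob.
page = np.full((40, 60), 255, dtype=np.uint8)
page[10, 5:55] = 0
page[20:23, 10:14] = 0
enhanced = enhance(page)
```

The result has the same height and width as the input and three channels, so it can be fed to a CNN that expects an RGB image, with each channel carrying structure rather than color.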

Cited By

Singh B, Kasana G (2024). A review of digital watermarking techniques: Current trends, challenges and opportunities. Web Intelligence 22:4 (523-553). Online publication date: 18-Jan-2024. https://doi.org/10.3233/WEB-230280
Shemiakina J, Limonova E, Skoryukina N, Arlazarov V, Nikolaev D (2021). A Method of Image Quality Assessment for Text Recognition on Camera-Captured and Projectively Distorted Documents. Mathematics 9:17 (2155). Online publication date: 3-Sep-2021. https://doi.org/10.3390/math9172155

Index Terms

A method for document image enhancement to improve template-based classification
1. Computing methodologies
  1. Computer graphics
    1. Image manipulation
      1. Image processing

Published In

HPCCT & BDAI '20: Proceedings of the 2020 4th High Performance Computing and Cluster Technologies Conference & 2020 3rd International Conference on Big Data and Artificial Intelligence

July 2020

276 pages

ISBN:9781450375603

DOI:10.1145/3409501

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

In-Cooperation

Xi'an Jiaotong-Liverpool University

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

HPCCT & BDAI 2020

HPCCT & BDAI 2020: 2020 4th High Performance Computing and Cluster Technologies Conference & 2020 3rd International Conference on Big Data and Artificial Intelligence

July 3 - 6, 2020

Qingdao, China
