DOI: 10.1145/3409501.3409531
research-article

A method for document image enhancement to improve template-based classification

Published: 25 August 2020

Abstract

Document classification is a significant step in paper document recognition. This article proposes a document image enhancement method that improves classification performance in convolutional neural networks (CNNs). An enhanced document image is generated by extracting the table frame, text regions, and shape of the raw document. A template-based classification experiment on 414 customs documents and more than one thousand generated images showed that the enhanced images helped the CNN model achieve higher accuracy than the original images. The enhancement also reduced the interference of noise and unrelated features in document classification, improving the robustness of the networks. The proposed method further demonstrated that the channels of an image can carry information other than color in deep neural networks. Because whole-image classification tasks are similar to one another, this conclusion might provide ideas for training neural networks in other fields, such as street view recognition and medical image recognition.
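As a rough illustration of the idea in the abstract, the sketch below builds a three-channel image whose channels hold structural masks instead of colors. It is not the authors' implementation: the run-length rule used to find table lines, the threshold values, the reading of "shape" as the full binarized page, and the names `long_runs` and `enhance` are all assumptions made for this example.

```python
import numpy as np


def long_runs(binary: np.ndarray, min_len: int) -> np.ndarray:
    """Mark pixels lying in a horizontal run of at least min_len foreground pixels."""
    h, w = binary.shape
    out = np.zeros_like(binary, dtype=bool)
    for y in range(h):
        run_start = None
        # Iterate one step past the row end so a run touching the border is closed.
        for x in range(w + 1):
            on = x < w and binary[y, x]
            if on and run_start is None:
                run_start = x
            elif not on and run_start is not None:
                if x - run_start >= min_len:
                    out[y, run_start:x] = True
                run_start = None
    return out


def enhance(gray: np.ndarray, threshold: int = 128, min_line: int = 20) -> np.ndarray:
    """Return an H x W x 3 uint8 image: (table frame, text, page shape).

    Assumptions (hypothetical, not from the paper):
      - ink is any pixel darker than `threshold`;
      - table lines are straight ink runs of at least `min_line` pixels;
      - text is the ink that is not part of a table line;
      - "shape" is the whole binarized page.
    """
    ink = gray < threshold
    horizontal = long_runs(ink, min_line)
    vertical = long_runs(ink.T, min_line).T  # reuse the row scan on columns
    frame = horizontal | vertical
    text = ink & ~frame
    shape = ink
    # Stack the masks where an RGB image would store its color channels.
    return np.stack([frame, text, shape], axis=-1).astype(np.uint8) * 255
```

Calling `enhance` on a grayscale page array yields an array that can be fed to a CNN expecting three-channel input. In practice the run-length scan would likely be replaced by morphological operations from an image processing library; plain NumPy is used here only to keep the sketch self-contained.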


Cited By

  • (2024) A review of digital watermarking techniques: Current trends, challenges and opportunities. Web Intelligence 22:4 (523-553). DOI: 10.3233/WEB-230280. Online publication date: 18-Jan-2024
  • (2021) A Method of Image Quality Assessment for Text Recognition on Camera-Captured and Projectively Distorted Documents. Mathematics 9:17 (2155). DOI: 10.3390/math9172155. Online publication date: 3-Sep-2021

    Published In

    HPCCT & BDAI '20: Proceedings of the 2020 4th High Performance Computing and Cluster Technologies Conference & 2020 3rd International Conference on Big Data and Artificial Intelligence
    July 2020
    276 pages
    ISBN:9781450375603
    DOI:10.1145/3409501

    In-Cooperation

    • Xi'an Jiaotong-Liverpool University

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Image enhancement
    2. convolutional neural network
    3. document recognition
    4. image classification

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    HPCCT & BDAI 2020
