Abstract
Processing historical documents is a difficult task in computer vision because degradation lowers the performance of Machine Learning models. Recently, Deep Learning (DL) models have achieved state-of-the-art results in processing historical documents. However, these results still fall short of those obtained in other computer vision tasks, because such models require large datasets to perform well. For historical documents, only small datasets are available, which makes it hard for DL models to capture the degradation. In this paper, we propose a framework that addresses these issues with a two-stage approach. Stage-I is devoted to data augmentation: a Generative Adversarial Network (GAN), trained on degraded documents, generates new synthetic training document images. In stage-II, the document images generated in stage-I are improved using an inverse problem model with a deep neural network structure. Our approach enhances the quality of the generated document images and removes degradation. Our results show that the proposed framework is well suited for binarization tasks. Our model was trained on the 2014 and 2016 DIBCO datasets and tested on the 2018 DIBCO dataset. The obtained results are promising and competitive with the state-of-the-art.
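The abstract does not state the objectives used in either stage, so the following is only a sketch of how the two stages are commonly formalized, not the authors' exact losses. It assumes the standard GAN minimax objective for stage-I and a deep-image-prior-style reconstruction objective for stage-II. The symbols $G$, $D$, $f_\theta$, $z$, $\tilde{x}$ and $\hat{x}$ are introduced here for illustration and do not appear in the original text.

```latex
% Stage-I (assumed): standard GAN objective, with generator G and
% discriminator D trained on degraded document images x.
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% Stage-II (assumed): restoration posed as an inverse problem.
% A network f_\theta with fixed random input z is fitted to a
% generated degraded image \tilde{x} = G(z'); the restored image
% \hat{x} is the network output at the fitted parameters.
\theta^{*} = \arg\min_{\theta} \,
  \bigl\lVert f_{\theta}(z) - \tilde{x} \bigr\rVert_{2}^{2},
\qquad
\hat{x} = f_{\theta^{*}}(z)
```

Under this reading, the degradation removal in stage-II comes from the network structure itself acting as a prior. In such formulations the fit is typically stopped before the network reproduces the degradation, and a binarized output can then be obtained by thresholding $\hat{x}$.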
Acknowledgement
The authors thank the NSERC Discovery grant held by Prof. Cheriet for its financial support.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Tamrin, M.O., El-Amine Ech-Cherif, M., Cheriet, M. (2021). A Two-Stage Unsupervised Deep Learning Framework for Degradation Removal in Ancient Documents. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12667. Springer, Cham. https://doi.org/10.1007/978-3-030-68787-8_21
DOI: https://doi.org/10.1007/978-3-030-68787-8_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68786-1
Online ISBN: 978-3-030-68787-8
eBook Packages: Computer Science, Computer Science (R0)