Character segmentation and restoration of Qin-Han bamboo slips using local auto-focus thresholding method

Multimedia Tools and Applications

Abstract

This paper presents a novel auto-thresholding method for character segmentation and restoration of historical Chinese documents. The objective was to segment and restore the characters of Qin-Han bamboo slips effectively in the presence of complex background noise. To that end, given a whole-page image containing several bamboo slips, the proposed method first extracted and straightened every single slip by connected component analysis. Then, for every straightened slip, a horizontal histogram projection method was used to segment all character regions. After that, a novel auto-thresholding method, motivated by the auto-focus process of a camera, was used to find the optimal threshold of every character region. In this method, the algorithm traversed all thresholds in a certain range and generated an Effective Character Contour Length (ECCL) value for each threshold; a multi-Gaussian model was then used to fit the ECCL curve, and the position of the global peak of the ECCL curve was taken as the optimal threshold for the character region. Experimental results showed that the proposed method was effective for historical character segmentation and restoration under complex background noise. Compared with five existing state-of-the-art algorithms (Otsu, the integral image adaptive thresholding method, Sauvola, GAN denoising, and the SAE algorithm), the proposed method not only restored whole characters more completely but also suppressed noise better.
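The two per-slip steps described above, projection-based segmentation and the threshold sweep, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several points are assumptions: the function names are hypothetical, the exact ECCL definition is not given in the abstract so a plain count of ink pixels bordering background is used as a stand-in, and the multi-Gaussian fit is replaced by a simple argmax over the sampled thresholds.

```python
def row_projection_segments(binary, min_ink=1):
    """Split a binarized slip into character regions using row-wise ink counts.

    binary: list of rows, each a list of 0/1 values (1 = ink).
    Returns (start_row, end_row_exclusive) pairs for runs of rows containing ink.
    """
    segments, start = [], None
    for y, row in enumerate(binary):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                      # a character region begins
        elif not has_ink and start is not None:
            segments.append((start, y))    # a blank row closes the region
            start = None
    if start is not None:
        segments.append((start, len(binary)))
    return segments


def contour_length(gray, threshold):
    """Stand-in for ECCL (assumption): count ink pixels (value < threshold)
    that have at least one in-bounds 4-neighbour classified as background."""
    h, w = len(gray), len(gray[0])
    count = 0
    for y in range(h):
        for x in range(w):
            if gray[y][x] >= threshold:
                continue                   # background pixel
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and gray[ny][nx] >= threshold:
                    count += 1
                    break
    return count


def best_threshold(gray, candidates):
    """Sweep candidate thresholds and return the one with the longest contour.

    A plain argmax replaces the multi-Gaussian curve fit described in the text.
    """
    scores = [contour_length(gray, t) for t in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return candidates[best], scores
```

The sweep mirrors the auto-focus analogy: a threshold that is too low finds no stroke, one that is too high merges stroke and background, and the contour measure peaks somewhere in between. Without the noise filtering implied by the word "effective" in ECCL, this raw measure would also respond to background speckle, so it should be read only as an illustration of the search structure.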

Data availability

The dataset is available at https://gitee.com/cramkl_cjlu/auto-focus-threshold-character-segment.

Code availability

The source code is available at https://gitee.com/cramkl_cjlu/auto-focus-threshold-character-segment.

References

  1. Babu NSA (2019) Character recognition in historical handwritten documents – A survey. Proceedings of the 2019 International Conference on Communication and Signal Processing (ICCSP), 4-6 April 2019

  2. Calvo-Zaragoza J, Gallego A-J (2019) A selectional auto-encoder approach for document image binarization. Pattern Recogn 86:37–47

  3. Huang Z-K, Ma Y-L, Lu L et al (2016) Chinese historic image threshold using adaptive K-means cluster and Bradley’s. 9773:171–179

  4. Kehtarnavaz N, Oh HJ (2003) Development and real-time implementation of a rule-based auto-focus algorithm. Real-Time Imaging 9(3):197–203

  5. Liu CL, Koga M, Fujisawa H (2002) Lexicon-driven segmentation and recognition of handwritten character strings for Japanese address reading. IEEE Trans Pattern Anal Mach Intell 24(11):1425–1437

  6. Liu S, Liu M, Yang Z (2016) An image auto-focusing algorithm for industrial image measurement. EURASIP J Adv Signal Process 2016(1):70. https://doi.org/10.1186/s13634-016-0368-5

  7. Messina R, Louradour J (2015) Segmentation-free handwritten Chinese text recognition with LSTM-RNN. Proceedings of the 2015 13th International Conference on Document Analysis and Recognition (ICDAR), 23-26 2015

  8. Nguyen K, Nguyen C, Nakagawa M (2017) A segmentation method of single- and multiple-touching characters in offline handwritten Japanese text recognition. IEICE Trans Inf Syst E100.D:2962–2972

  9. Panichkriangkrai C, Li L, Hachimura K (2013) Character segmentation and retrieval for learning support system of Japanese historical books. In Proceedings of the 2nd international workshop on historical document imaging and processing. Association for Computing Machinery: Washington, District of Columbia, USA. pp 118–122. https://doi.org/10.1145/2501115.2501129

  10. Santos R et al (2009) Text line segmentation based on morphology and histogram projection. 2009 International conference on document analysis and recognition. pp 651–655. https://doi.org/10.1109/ICDAR.2009.183

  11. Sauvola J, Pietikainen M (2000) Adaptive document image binarization. Pattern Recogn 33(2):225–236

  12. Shirai K et al (2013) Character shape restoration of binarized historical documents by smoothing via geodesic morphology. In 2013 12th International conference on document analysis and recognition. https://doi.org/10.1109/ICDAR.2013.260

  13. Wang QF, Yin F, Liu CL (2012) Handwritten Chinese text recognition by integrating multiple contexts. IEEE Trans Pattern Anal Mach Intell 34(8):1469–1481

  14. Watanabe K, Takahashi S, Kamaya Y et al (2019) Japanese character segmentation for historical handwritten official documents using fully convolutional networks. Proceedings of the 2019 International Conference on Document Analysis and Recognition (ICDAR), 20-25 2019

  15. Wu YC, Yin F, Chen Z et al (2018) Handwritten Chinese text recognition using separable multi-dimensional recurrent neural network. Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)

  16. Xu X, Wang Y, Tang J et al (2011) Robust automatic focus algorithm for low contrast images using a new contrast measure. Sensors (Basel) 11(9):8281–8294

  17. Yang H, Jin L, Huang W et al (2018) Dense and tight detection of Chinese characters in historical documents: datasets and a recognition guided detector. IEEE Access 6:30174–30183

  18. Zecheng X, Zenghui S, Lianwen J et al (2016) Fully convolutional recurrent network for handwritten Chinese text recognition. Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), 4-8

  19. Zhang J, Guo M (2019) A novel generative adversarial net for calligraphic tablet images denoising. Multimed Tools Appl 79(1–2):119–140

Funding

Natural Science Foundation of Zhejiang Province, China (No. LY18E050009 and No. Q19E060008).

Author information

Corresponding author

Correspondence to Songxiao Cao.

Ethics declarations

Conflict of interest

The authors hereby declare that they have no competing interests.

About this article

Cite this article

Cao, S., Shu, Z., Xu, Z. et al. Character segmentation and restoration of Qin-Han bamboo slips using local auto-focus thresholding method. Multimed Tools Appl 81, 8199–8213 (2022). https://doi.org/10.1007/s11042-022-11988-z

  • DOI: https://doi.org/10.1007/s11042-022-11988-z
