Abstract
We propose a method for detecting text in handwritten ancient documents. Detection on this type of data is difficult because the page layouts are complex, text sizes vary, pictures and text are interleaved, hand-drawn patterns are frequent, and background noise is high. Unlike general scene text detection benchmarks (ICDAR, TotalText, etc.), images of ancient books contain more densely distributed text. To address these characteristics, we propose DFCOS, a detection model based on cascade feature fusion that aims to improve the fusion of localization information from the lower layers. Specifically, we create bottom-up paths to carry more localization signals from low-level features, incorporate skip connections to better extract information in the backbone, and further improve the model by parallel cascading. We verified the effectiveness of DFCOS on HWAD (Handwritten Ancient-Books Dataset), a dataset covering four languages (Yi, Chinese, Tibetan and Tangut) provided by the Institute of Yi of Guizhou University of Engineering Science and the National Digital Library of China. Its precision, recall and F-measure exceed those of most existing text detection models.
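The abstract reports precision, recall and F-measure without stating the matching protocol. As a minimal sketch of how such scores are commonly computed for box-level text detection, the following assumes axis-aligned boxes, greedy one-to-one matching, and an IoU threshold of 0.5. These choices, and the names `iou` and `detection_scores`, are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_scores(
    preds: List[Box], gts: List[Box], iou_thresh: float = 0.5
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) under greedy one-to-one matching.

    Each prediction is matched to the unmatched ground-truth box with the
    highest IoU; it counts as a true positive if that IoU >= iou_thresh.
    """
    matched = set()  # indices of ground-truth boxes already claimed
    true_pos = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, g in enumerate(gts):
            if j in matched:
                continue
            overlap = iou(p, g)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= iou_thresh:
            matched.add(best_j)
            true_pos += 1

    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(gts) if gts else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Toy example: two ground-truth boxes, three predictions (one spurious).
    gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(0, 0, 10, 10), (50, 50, 60, 60), (21, 21, 30, 30)]
    print(detection_scores(preds, gts))
```

In the toy example, two of the three predictions match a ground-truth box, so precision is 2/3, recall is 1, and the F-measure (their harmonic mean) is 0.8. Dense layouts such as those described above make the one-to-one constraint matter, since a single oversized prediction cannot be credited for several neighbouring text regions.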
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Feng, R., Zhao, F., Chen, S., Zhang, S., Wang, D. (2021). A Handwritten Text Detection Model Based on Cascade Feature Fusion Network Improved by FCOS. In: Barney Smith, E.H., Pal, U. (eds) Document Analysis and Recognition – ICDAR 2021 Workshops. ICDAR 2021. Lecture Notes in Computer Science, vol 12917. Springer, Cham. https://doi.org/10.1007/978-3-030-86159-9_4
DOI: https://doi.org/10.1007/978-3-030-86159-9_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86158-2
Online ISBN: 978-3-030-86159-9
eBook Packages: Computer Science, Computer Science (R0)