Fully convolutional network with dilated convolutions for handwritten text line segmentation

Renton, Guillaume; Soullard, Yann; Chatelain, Clément; Adam, Sébastien; Kermorvant, Christopher; Paquet, Thierry

doi:10.1007/s10032-018-0304-3


Special Issue Paper
Published: 30 May 2018

Volume 21, pages 177–186, (2018)

International Journal on Document Analysis and Recognition (IJDAR)

Guillaume Renton¹,
Yann Soullard¹,
Clément Chatelain¹,
Sébastien Adam¹,
Christopher Kermorvant¹,² &
Thierry Paquet¹


Abstract

We present a learning-based method for handwritten text line segmentation in document images. Our approach relies on a variant of deep fully convolutional networks (FCNs) with dilated convolutions. Because dilated convolutions never reduce the input resolution, the network produces a pixel-level labeling. The FCN is trained to predict an X-height labeling as the text line representation, which has many advantages for text recognition. We show that our approach outperforms the most popular variants of FCN, based on deconvolution or unpooling layers, on a public dataset. We also report results for various settings, and we conclude with a comparison of our model against recent approaches from the cBAD (https://scriptnet.iit.demokritos.gr/competitions/5/) international competition, where it reaches a 91.3% F-measure.
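
The resolution-preserving property of dilated convolutions mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' network: the function names `dilated_conv2d` and `receptive_field`, the 3x3 kernel, and the dilation rates 1, 2, 4 are assumptions chosen for illustration only. The sketch applies a zero-padded, stride-1 dilated filter and shows that the output keeps the input size while the receptive field grows with the dilation rate.

```python
import numpy as np


def dilated_conv2d(image, kernel, dilation=1):
    """Zero-padded, stride-1 dilated cross-correlation (odd kernel sizes).

    The output has the same height and width as the input, whatever the
    dilation rate. Illustrative helper, not the paper's implementation.
    """
    kh, kw = kernel.shape
    h, w = image.shape
    # Extent covered by the kernel once (dilation - 1) gaps separate its taps.
    eff_h = (kh - 1) * dilation + 1
    eff_w = (kw - 1) * dilation + 1
    pad_h, pad_w = eff_h // 2, eff_w // 2
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)))
    out = np.zeros((h, w), dtype=float)
    # Accumulate one shifted copy of the padded image per kernel tap.
    for i in range(kh):
        for j in range(kw):
            di, dj = i * dilation, j * dilation
            out += kernel[i, j] * padded[di:di + h, dj:dj + w]
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field (in pixels, per axis) of stacked stride-1 dilated layers."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


if __name__ == "__main__":
    page = np.random.rand(64, 48)      # stand-in for a document image
    kernel = np.ones((3, 3)) / 9.0     # hypothetical 3x3 averaging filter
    x = page
    for rate in (1, 2, 4):             # assumed dilation schedule
        x = dilated_conv2d(x, kernel, dilation=rate)
        print(rate, x.shape)           # shape stays (64, 48) at every layer
    print(receptive_field(3, [1, 2, 4]))  # 15
```

Since no layer uses pooling or striding, each output pixel maps directly to one input pixel, so the prediction is pixel-level without any deconvolution or unpooling stage.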


Notes

Please note that preliminary work was presented at the ICDAR-WML workshop [25].
https://scriptnet.iit.demokritos.gr/competitions/5/.
https://scriptnet.iit.demokritos.gr/competitions/8/.

References

Badrinarayanan, V., Kendall, A., Cipolla, R.: Segnet: A deep convolutional encoder–decoder architecture for image segmentation (2015). arXiv:1511.00561
Chen, LC., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Semantic image segmentation with deep convolutional nets and fully connected crfs (2014). arXiv:1412.7062
Chen, LC., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs (2016). arXiv:1606.00915
Chen, LC., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation (2017). arXiv:1706.05587
Eskenazi, S., Gomez-Krämer, P., Ogier, J.M.: A comprehensive survey of mostly textual document segmentation algorithms since 2008. Pattern Recognit. 64, 1–14 (2017)
Girshick, R.: Fast r-cnn. In: ICCV, pp. 1440–1448 (2015)
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: CVPR, pp. 580–587 (2014)
Grüning, T., Labahn, R., Diem, M., Kleber, F., Fiel, S.: Read-bad: a new dataset and evaluation scheme for baseline detection in archival documents (2017). arXiv:1705.03311
Holschneider, M., Kronland-Martinet, R., Morlet, J., Tchamitchian, P.: A real-time algorithm for signal analysis with the help of the wavelet transform. In: Wavelets, pp. 286–297. Springer (1989)
Huang, W., Qiao, Y., Tang, X.: Robust scene text detection with convolution neural network induced mser trees. In: ECCV, pp. 497–511 (2014)
Krähenbühl, P., Koltun, V.: Efficient inference in fully connected CRFs with gaussian edge potentials. In: NIPS, pp. 109–117 (2011)
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C., Berg, A.: Ssd: Single shot multibox detector. In: ECCV, pp. 21–37. Springer (2016)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: CVPR, pp. 3431–3440 (2015)
Moysset, B., Adam, P., Wolf, C., Louradour, J.: Space displacement localization neural networks to locate origin points of handwritten text lines in historical documents. In: Workshop on Historical Document Imaging and Processing, August (2015)
Moysset, B., Kermorvant, C., Wolf, C.: Full-page text recognition: learning where to start and when to stop. In: ICDAR (2017)
Moysset, B., Kermorvant, C., Wolf, C., Louradour, J.: Paragraph text segmentation into lines with recurrent neural networks. In: ICDAR, pp. 456–460 (2015)
Moysset, B., Louradour, J., Kermorvant, C., Wolf, C.: Learning text-line localization with shared and local regression neural networks. In: ICFHR (2016)
Murdock, M., Reid, S., Hamilton, B., Reese, J.: Icdar 2015 competition on text line detection in historical documents. In: ICDAR, pp. 1171–1175 (2015)
Noh, H., Hong, S., Han, B.: Learning deconvolution network for semantic segmentation. In: ICCV, pp. 1520–1528 (2015)
Paquet, T., Heutte, L., Koch, G., Chatelain, C.: A categorization system for handwritten documents. IJDAR 15(4), 315–330 (2012)
Parvez, M.T., Mahmoud, S.A.: Offline arabic handwritten text recognition: a survey. ACM Comput. Surv. (CSUR) 45(2), 23 (2013)
Peng, C., Zhang, X., Yu, G., Luo, G., Sun, J.: Large kernel matters—improve semantic segmentation by global convolutional network (2017). arXiv:1703.02719
Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. CoRR, abs/1612.08242 (2016)
Renton, G., Chatelain, C., Adam, S., Kermorvant, C., Paquet, T.: Handwritten text line segmentation using fully convolutional network. In: 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), 2017, vol. 5, pp. 5–9. IEEE (2017)
Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. CoRR, abs/1505.04597 (2015)
Ryu, J., Koo, H.I., Cho, N.I.: Language-independent text-line extraction algorithm for handwritten documents. Signal Process. Lett. 21(9), 1115–1119 (2014)
Shi, Z., Setlur, S., Govindaraju, V.: A steerable directional local profile technique for extraction of handwritten arabic text lines. In: ICDAR, pp. 176–180 (2009)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556 (2014)
Stamatopoulos, N., Gatos, B., Louloudis, G., Pal, U., Alaei, A.: Icdar 2013 handwriting segmentation contest. In: ICDAR, pp. 1402–1406 (2013)
Stuner, B., Chatelain, C., Paquet, T.: LV-ROVER: lexicon verified recognizer output voting error reduction. CoRR, abs/1707.07432 (2017)
Vo, Q.N., Lee, G.: Dense prediction for text line segmentation in handwritten document images. In: ICIP, pp. 3264–3268 (2016)
Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions (2015). arXiv:1511.07122
Zhang, Z., Zhang, C., Shen, W., Yao, C., Liu, W., Bai, X.: Multi-oriented text detection with fully convolutional networks (2016). arXiv:1604.04018
Zheng, S., Jayasumana, S., Romera-Paredes, B., Vineet, V., Su, Z., Du, D., Huang, C., Torr, P.: Conditional random fields as recurrent neural networks. In: ICCV, pp. 1529–1537 (2015)
Zhu, S., Zanibbi, R.: A text detection system for natural scenes with convolutional feature learning and cascaded classification. In: CVPR, pp. 625–632 (2016)


Author information

Authors and Affiliations

Normandie Univ, UNIROUEN, UNIHAVRE, INSA Rouen, LITIS, 76000, Rouen, France
Guillaume Renton, Yann Soullard, Clément Chatelain, Sébastien Adam, Christopher Kermorvant & Thierry Paquet
TEKLIA SAS, Paris, France
Christopher Kermorvant


Corresponding author

Correspondence to Guillaume Renton.

Additional information

This work has been supported by the French National grant ANR 16-LCV2-0004-01 Labcom INKS. This work is funded by the French region Normandy and the European Union. Europe acts in Normandy with the European Regional Development Fund (ERDF).


About this article

Cite this article

Renton, G., Soullard, Y., Chatelain, C. et al. Fully convolutional network with dilated convolutions for handwritten text line segmentation. IJDAR 21, 177–186 (2018). https://doi.org/10.1007/s10032-018-0304-3


Received: 29 September 2017
Revised: 16 May 2018
Accepted: 17 May 2018
Published: 30 May 2018
Issue Date: September 2018
DOI: https://doi.org/10.1007/s10032-018-0304-3
