Normalized vs Diplomatic Annotation: A Case Study of Automatic Information Extraction from Handwritten Uruguayan Birth Certificates

Bottaioli, Natalia; Tarride, Solène; Anger, Jérémy; Mowlavi, Seginus; Gardella, Marina; Tadros, Antoine; Facciolo, Gabriele; von Gioi, Rafael Grompone; Kermorvant, Christopher; Morel, Jean-Michel; Preciozzi, Javier

doi:10.1007/978-3-031-70645-5_4

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14935)

Included in the following conference series:

International Conference on Document Analysis and Recognition

Abstract

This study evaluates the recently proposed Document Attention Network (DAN) for extracting key-value information from Uruguayan birth certificates handwritten in Spanish. We investigate two annotation strategies for automatically transcribing handwritten documents, fine-tuning DAN with minimal training data and annotation effort. Experiments were conducted on two datasets that contain the same images (201 scans of birth certificates written by more than 15 different writers) but use different annotation methods. Our findings indicate that normalized annotation is more effective for fields that can be standardized, such as dates and places of birth, whereas diplomatic annotation performs much better for fields containing names and surnames, which cannot be standardized.
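The distinction between the two annotation strategies can be illustrated with a minimal sketch. It assumes that a diplomatic target keeps the text exactly as written, while a normalized target maps a standardizable field such as a date to a canonical form. The field names, the ISO 8601 output format, and the helpers `normalize_date` and `build_targets` are hypothetical and chosen for illustration; the abstract does not specify the normalization conventions actually used in the study.

```python
import re
from typing import Optional

# Hypothetical month lookup; "setiembre" is a spelling variant of "septiembre".
SPANISH_MONTHS = {
    "enero": 1, "febrero": 2, "marzo": 3, "abril": 4, "mayo": 5, "junio": 6,
    "julio": 7, "agosto": 8, "septiembre": 9, "setiembre": 9, "octubre": 10,
    "noviembre": 11, "diciembre": 12,
}

# Matches dates written as "<day> de <month name> de <year>".
DATE_PATTERN = re.compile(r"(\d{1,2})\s+de\s+([a-z]+)\s+de\s+(\d{4})")


def normalize_date(as_written: str) -> Optional[str]:
    """Return 'YYYY-MM-DD' for text like '12 de marzo de 1950', else None.

    Only the textual pattern is checked; the day is not validated
    against the length of the month.
    """
    match = DATE_PATTERN.fullmatch(as_written.strip().lower())
    if match is None:
        return None
    day, month_name, year = match.groups()
    month = SPANISH_MONTHS.get(month_name)
    if month is None:
        return None
    return f"{year}-{month:02d}-{int(day):02d}"


def build_targets(field: str, as_written: str) -> dict:
    """Build both annotation targets for one key-value field (illustrative)."""
    normalized = normalize_date(as_written) if field == "birth_date" else None
    return {
        "field": field,
        # Diplomatic target: the text exactly as it appears on the page.
        "diplomatic": as_written,
        # Normalized target: canonical form if one exists, else unchanged.
        "normalized": normalized if normalized is not None else as_written,
    }


if __name__ == "__main__":
    print(build_targets("birth_date", "12 de Setiembre de 1950"))
    print(build_targets("name", "María"))
```

In this sketch a date field receives two different targets, whereas a name field has no canonical form and both targets coincide, which mirrors the abstract's observation that names and surnames cannot be standardized.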

Acknowledgments

The research that originated the results presented in this publication was partly supported by the Agencia Nacional de Investigación e Innovación (ANII) and the France 2030 CollabNext project.

Author information

Authors and Affiliations

Université Paris-Saclay, ENS Paris-Saclay, CNRS, Centre Borelli, Paris, France
Natalia Bottaioli, Jérémy Anger, Seginus Mowlavi, Antoine Tadros, Gabriele Facciolo & Rafael Grompone von Gioi
Facultad de Ingeniería, Universidad de la República, Montevideo, Uruguay
Natalia Bottaioli & Javier Preciozzi
Digital Sense, Montevideo, Uruguay
Natalia Bottaioli & Javier Preciozzi
TEKLIA, Paris, France
Solène Tarride & Christopher Kermorvant
IMPA, Rio de Janeiro, Brazil
Marina Gardella
City University of Hong Kong, Hong Kong, China
Jean-Michel Morel

Corresponding author

Correspondence to Natalia Bottaioli.

Editor information

Editors and Affiliations

Nantes Université, Nantes, France
Harold Mouchère
Wuhan University of Technology, Wuhan, China
Anna Zhu

About this paper

Cite this paper

Bottaioli, N. et al. (2024). Normalized vs Diplomatic Annotation: A Case Study of Automatic Information Extraction from Handwritten Uruguayan Birth Certificates. In: Mouchère, H., Zhu, A. (eds) Document Analysis and Recognition – ICDAR 2024 Workshops. ICDAR 2024. Lecture Notes in Computer Science, vol 14935. Springer, Cham. https://doi.org/10.1007/978-3-031-70645-5_4

DOI: https://doi.org/10.1007/978-3-031-70645-5_4
Published: 11 September 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70644-8
Online ISBN: 978-3-031-70645-5
eBook Packages: Computer Science, Computer Science (R0)
