Skip to main content

Normalized vs Diplomatic Annotation: A Case Study of Automatic Information Extraction from Handwritten Uruguayan Birth Certificates

  • Conference paper
  • First Online:
Document Analysis and Recognition – ICDAR 2024 Workshops (ICDAR 2024)

Abstract

This study evaluates the recently proposed Document Attention Network (DAN) for extracting key-value information from Uruguayan birth certificates, handwritten in Spanish. We investigate two annotation strategies for automatically transcribing handwritten documents, fine-tuning DAN with minimal training data and annotation effort. Experiments were conducted on two datasets containing the same images (201 scans of birth certificates written by more than 15 different writers) but with different annotation methods. Our findings indicate that normalized annotation is more effective for fields that can be standardized, such as dates and places of birth, whereas diplomatic annotation performs much better for fields containing names and surnames, which can not be standardized.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    https://unstats.un.org/unsd/demographic-social/Standards-and-Methods/files/Handbooks/crvs/crvs-mgt-E.pdf.

  2. 2.

    https://www.undp.org/africa/blog/having-legal-identity-fundamental-human-rights.

  3. 3.

    https://sdgs.un.org/es/goals.

  4. 4.

    https://getinthepicture.org.

  5. 5.

    https://www.gub.uy/ministerio-educacion-cultura/registro-civil.

References

  1. Dan implementation repository by TEKLIA. https://gitlab.teklia.com/atr/dan, release: 0.2.0rc6

  2. Abadie, N., Carlinet, E., Chazalon, J., Duménieu, B.: A benchmark of named entity recognition approaches in historical documents application to 19th century French directories. In: Uchida, S., Barney, E., Eglin, V. (eds) Document Analysis Systems. DAS 2022. Lecture Notes in Computer Science, vol 13237, pp. 445–460. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-06555-2_30

  3. Akbik, A., Bergmann, T., Blythe, D., Rasul, K., Schweter, S., Vollgraf, R.: FLAIR: an easy-to-use framework for state-of-the-art NLP. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations), pp. 54–59 (2019)

    Google Scholar 

  4. Arora, A., et al.: Using ASR methods for OCR. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 663–668. IEEE (2019)

    Google Scholar 

  5. Bluche, T., Louradour, J., Messina, R.: Scan, attend and read: end-to-end handwritten paragraph recognition with MDLSTM attention. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1050–1055. IEEE (2017)

    Google Scholar 

  6. Boillet, M., Tarride, S., Schneider, Y., Abadie, B., Kesztenbaum, L., Kermorvant, C.: The Socface project: large-scale collection, processing, and analysis of a century of French censuses (2024)

    Google Scholar 

  7. Cheplygina, V., Varoquaux, G.: Artificial intelligence in science: lessons from shortcomings in machine learning for medical imaging. In: Artificial Intelligence in Science: Challenges, Opportunities and the Future of Research. Organization for Economic Co-operation and Development (OECD) (2023)

    Google Scholar 

  8. Clérice, T., et al.: CATMuS medieval: a multilingual large-scale cross-century dataset in Latin script for handwritten text recognition and beyond (2024)

    Google Scholar 

  9. Constum, T. et al.: Recognition and information extraction in historical handwritten tables: toward understanding early 20th century Paris census. In: Uchida, S., Barney, E., Eglin, V. (eds) Document Analysis Systems. DAS 2022. LNCS, vol 13237, pp. 143–157 Springer, Cham (2022). https://doi.org/10.1007/978-3-031-06555-2_10

  10. Coquenet, D., Chatelain, C., Paquet, T.: End-to-end handwritten paragraph text recognition using a vertical attention network. IEEE Trans. Pattern Anal. Mach. Intell. 45(1), 508–524 (2022)

    Article  Google Scholar 

  11. Coquenet, D., Chatelain, C., Paquet, T.: DAN: a segmentation-free document attention network for handwritten document recognition. IEEE Trans. Pattern Anal. Mach. Intell. 45(7), 8227–8243 (2023)

    Google Scholar 

  12. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

  13. Graves, A., Schmidhuber, J.: Offline handwriting recognition with multidimensional recurrent neural networks. In: Proceedings of the 21st International Conference on Neural Information Processing Systems, NIPS 2008, pp. 545–552. Curran Associates Inc., Red Hook, NY, USA (2008)

    Google Scholar 

  14. Grosicki, E., Carré, M., Brodin, J.M., Geoffrois, E.: Results of the RIMES evaluation campaign for handwritten mail processing. In: 2009 10th International Conference on Document Analysis and Recognition, pp. 941–945. IEEE (2009)

    Google Scholar 

  15. Huang, Y., Lv, T., Cui, L., Lu, Y., Wei, F.: LayoutLMv3: pre-training for document AI with unified text and image masking. In: Proceedings of the 30th ACM International Conference on Multimedia, MM 2022, pp. 4083–4091. ACM, New York, NY, USA (2022). https://doi.org/10.1145/3503161.3548112

  16. Kim, G., et al.: OCR-free document understanding transformer. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) Comput. Vision - ECCV 2022, pp. 498–517. Springer Nature Switzerland, Cham (2022)

    Chapter  Google Scholar 

  17. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Proc. Syst . 25 (2012)

    Google Scholar 

  18. Li, M., et al.: TrOCR: transformer-based optical character recognition with pre-trained models. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 13094–13102 (2023)

    Google Scholar 

  19. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Adv. Neural Inf. Process. Syst. 26 (2013)

    Google Scholar 

  20. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Adv. Neural Inf. Process. Syst. 26 (2013)

    Google Scholar 

  21. Monroc, C.B., Miret, B., Bonhomme, M.-L., Kermorvant, C.: A comprehensive study of open-source libraries for named entity recognition on handwritten historical documents. In: Uchida, S., Barney, E., Eglin, V. (eds.) Document Analysis Systems: 15th IAPR International Workshop, DAS 2022, La Rochelle, France, May 22–25, 2022, Proceedings, pp. 429–444. Springer International Publishing, Cham (2022). https://doi.org/10.1007/978-3-031-06555-2_29

    Chapter  Google Scholar 

  22. Nion, T., et al.: Handwritten information extraction from historical census documents. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 822–826. IEEE (2013)

    Google Scholar 

  23. Oliveira, S.A., Seguin, B., Kaplan, F.: dhSegment: a generic deep-learning approach for document segmentation. In: 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 7–12. IEEE (2018)

    Google Scholar 

  24. Peng, Q., et al.: ERNIE-layout: layout knowledge enhanced pre-training for visually-rich document understanding. In: Goldberg, Y., Kozareva, Z., Zhang, Y. (eds.) Findings of the Association for Computational Linguistics: EMNLP 2022, pp. 3744–3756. Association for Computational Linguistics, Abu Dhabi, United Arab Emirates (Dec 2022). https://doi.org/10.18653/v1/2022.findings-emnlp.274, https://aclanthology.org/2022.findings-emnlp.274

  25. Petitpierre, R., Kramer, M., Rappo, L.: An end-to-end pipeline for historical censuses processing. Int. J. Doc. Anal. Recogn. (IJDAR) 26(4), 419–432 (2023)

    Article  Google Scholar 

  26. Puigcerver, J.: Are multidimensional recurrent layers really necessary for handwritten text recognition? In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 67–72. IEEE (2017)

    Google Scholar 

  27. Puigcerver, J.: Are multidimensional recurrent layers really necessary for handwritten text recognition? In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 67–72. IEEE (2017)

    Google Scholar 

  28. Romero, V., et al.: The ESPOSALLES database: an ancient marriage license corpus for off-line handwriting recognition. Pattern Recogn. 46(6), 1658–1669 (2013). https://doi.org/10.1016/j.patcog.2012.11.024, https://www.sciencedirect.com/science/article/pii/S0031320312005080

  29. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28

    Chapter  Google Scholar 

  30. Singh, S.S., Karayev, S.: Full page handwriting recognition via image to sequence extraction. In: Lladós, J., Lopresti, D., Uchida, S. (eds.) ICDAR 2021. LNCS, vol. 12823, pp. 55–69. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86334-0_4

    Chapter  Google Scholar 

  31. Tarride, S., Boillet, M., Kermorvant, C.: Key-Value Information Extraction from Full Handwritten Pages. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. LNCS, vol 14188, pp. 185–204 Springer, Cham (2023). https://doi.org/10.1007/978-3-031-41679-8_11

  32. Tarride, S., Boillet, M., Moufflet, J.-F., Kermorvant, C.: SIMARA: a database for key-value information extraction from full-page handwritten documents. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds.) Document Analysis and Recognition - ICDAR 2023: 17th International Conference, San José, CA, USA, August 21–26, 2023, Proceedings, Part III, pp. 421–437. Springer Nature Switzerland, Cham (2023). https://doi.org/10.1007/978-3-031-41682-8_26

    Chapter  Google Scholar 

  33. Tarride, S., Lemaitre, A., Coüasnon, B., Tardivel, S.: A comparative study of information extraction strategies using an attention-based neural network. In: Uchida, S., Barney, E., Eglin, V. (eds.) Document Analysis Systems: 15th IAPR International Workshop, DAS 2022, La Rochelle, France, May 22–25, 2022, Proceedings, pp. 644–658. Springer International Publishing, Cham (2022). https://doi.org/10.1007/978-3-031-06555-2_43

    Chapter  Google Scholar 

  34. Tarride, S., et al.: Large-scale genealogical information extraction from handwritten Quebec parish records. Int. J. Doc. Anal. Recogn. (IJDAR) 26(3), 255–272 (2023). https://doi.org/10.1007/s10032-023-00427-w

    Article  Google Scholar 

  35. Tu, Y., Guo, Y., Chen, H., Tang, J.: LayoutMask: enhance text-layout interaction in multi-modal pre-training for document understanding. In: Annual Meeting of the Association for Computational Linguistics (2023). https://api.semanticscholar.org/CorpusID:258967524

  36. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing System, vol. 30 (2017)

    Google Scholar 

  37. Wigington, C., Tensmeyer, C., Davis, B., Barrett, W., Price, B., Cohen, S.: Start, follow, read: end-to-end full-page handwriting recognition. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11210, pp. 372–388. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01231-1_23

    Chapter  Google Scholar 

Download references

Acknowledgments

The research that originated the results presented in this publication was partly supported by the Agencia Nacional de Investigación e Innovación (ANII) and the France 2030 CollabNext project.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Natalia Bottaioli .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Bottaioli, N. et al. (2024). Normalized vs Diplomatic Annotation: A Case Study of Automatic Information Extraction from Handwritten Uruguayan Birth Certificates. In: Mouchère, H., Zhu, A. (eds) Document Analysis and Recognition – ICDAR 2024 Workshops. ICDAR 2024. Lecture Notes in Computer Science, vol 14935. Springer, Cham. https://doi.org/10.1007/978-3-031-70645-5_4

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-70645-5_4

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70644-8

  • Online ISBN: 978-3-031-70645-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics