Writer Identification in Historical Handwritten Documents: A Latin Dataset and a Benchmark

  • Conference paper
Image Analysis and Processing - ICIAP 2023 Workshops (ICIAP 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14366)

Abstract

Writer identification refers to the process of determining or attributing the authorship of a document to a specific individual through the analysis of various elements such as writing style, linguistic characteristics, and other textual features. This is a relevant task in heterogeneous fields such as cybersecurity, forensics, or linguistics and becomes particularly challenging when considering historical documents. In fact, the latter might present deterioration due to time, often lack signatures, and could be authored by multiple people. Complicating matters further, scribes were trained to mimic handwriting meticulously when copying manuscripts, making author identification of such documents even more difficult. In this context, this paper introduces a curated collection of Latin documents from the Genesis and Gospel of Matthew specifically gathered for the purpose of exploring the writer identification task. In particular, the dataset comprises over 400 pages, written by nine distinct persons. The primary objective is to explore the efficacy of state-of-the-art deep learning architectures in accurately ascribing historical texts to their rightful authors. To this end, this paper conducts extensive experiments, utilizing varying training set sizes and employing diverse pre-processing techniques to assess the performance and capabilities of these renowned models on the writer identification task while also providing the community with a baseline on the introduced collection.
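The abstract mentions experiments with varying training set sizes. As a rough illustration only, and not the authors' protocol, the sketch below shows one common way to build such an experiment: hold out a fixed per-writer test split, then draw nested, writer-stratified training subsets of increasing size. The function name `stratified_subsets`, the fractions, the 20% test share, and the toy page list (9 writers with 45 pages each) are all assumptions made for this example; the actual per-writer page counts are not given in the text above.

```python
import random
from collections import defaultdict


def stratified_subsets(pages, fractions, test_fraction=0.2, seed=0):
    """Split (page_id, writer_id) pairs into a fixed test set and nested
    training subsets, sampling each writer separately.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group page ids by writer so every writer is sampled on its own.
    by_writer = defaultdict(list)
    for page_id, writer in pages:
        by_writer[writer].append(page_id)

    test, pool = [], {}
    for writer, ids in sorted(by_writer.items()):
        rng.shuffle(ids)
        n_test = max(1, round(len(ids) * test_fraction))
        test += [(p, writer) for p in ids[:n_test]]
        pool[writer] = ids[n_test:]  # remaining pages are available for training

    # Taking a prefix of each shuffled pool makes smaller subsets
    # strict subsets of larger ones.
    subsets = {}
    for frac in fractions:
        subset = []
        for writer, ids in pool.items():
            n_train = max(1, round(len(ids) * frac))
            subset += [(p, writer) for p in ids[:n_train]]
        subsets[frac] = subset
    return subsets, test


# Toy stand-in for the dataset: 9 writers, 45 pages each (assumed counts).
pages = [(f"w{w}_p{i}", w) for w in range(9) for i in range(45)]
subsets, test = stratified_subsets(pages, fractions=[0.25, 0.5, 1.0])
```

Keeping the test split fixed across all training sizes means that accuracy differences between runs reflect the amount of training data rather than a change in the evaluation pages.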

Acknowledgements

This work was supported by the Sapienza Research Project “A Brain Computer Interface (BCI) based System for Transferring Human Emotions inside Unmanned Aerial Vehicles (UAVs)” (Protocol number: RM1221816C1CF63B); by the Departmental Strategic Plan (DSP) of the University of Udine - Interdepartmental Project on Artificial Intelligence (2020–25); by the project “A proactive counter-UAV system to protect army tanks and patrols in urban areas (PROACTIVE COUNTER-UAV)” of the Italian Ministry of Defence (Number 2066/16.12.2019); and by the MICS (Made in Italy - Circular and Sustainable) Extended Partnership, and it received funding from Next-Generation EU (Italian PNRR - M4 C2, Invest 1.3 - D.D. 1551.11-10-2022, PE00000004). CUP MICS B53C22004130001.

Author information

Corresponding author

Correspondence to Alessio Fagioli.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Fagioli, A., Avola, D., Cinque, L., Colombi, E., Foresti, G.L. (2024). Writer Identification in Historical Handwritten Documents: A Latin Dataset and a Benchmark. In: Foresti, G.L., Fusiello, A., Hancock, E. (eds) Image Analysis and Processing - ICIAP 2023 Workshops. ICIAP 2023. Lecture Notes in Computer Science, vol 14366. Springer, Cham. https://doi.org/10.1007/978-3-031-51026-7_39

  • DOI: https://doi.org/10.1007/978-3-031-51026-7_39

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-51025-0

  • Online ISBN: 978-3-031-51026-7

  • eBook Packages: Computer Science (R0)
