Automated Dating of Medieval Manuscripts with a New Dataset

  • Conference paper
Document Analysis and Recognition – ICDAR 2024 Workshops (ICDAR 2024)

Abstract

Automated manuscript dating is a long-awaited and valuable tool for scholars researching historical documents. This study presents a new dataset of medieval Hebrew manuscripts annotated with dates. Our initial experiments focus on documents written in the Ashkenazi square script, which lets us refine our methodology in a manageable setting before addressing more complex script types. To reflect the script’s historical evolution accurately, we adopt a novel classification approach based on time periods of varying lengths, which acknowledges the uneven development of the script over time. We experiment extensively with a variety of deep-learning models and show that regression is more appropriate than categorical classification for estimating the date of a manuscript.
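The contrast between period classification and year regression can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the period boundaries in `PERIOD_STARTS`, the helper names, and the example years are hypothetical and are not taken from the paper or its dataset. The sketch maps a year to a class drawn from periods of unequal length, then scores year predictions by mean and median absolute error, the kind of error summary a regression setup allows.

```python
from bisect import bisect_right
from statistics import mean, median

# Hypothetical start years of variable-length periods (NOT from the paper).
# Period i covers [PERIOD_STARTS[i], PERIOD_STARTS[i + 1]); the last one is open-ended.
PERIOD_STARTS = [1100, 1250, 1300, 1350, 1450]


def year_to_period(year: int) -> int:
    """Return the index of the period that contains `year`."""
    idx = bisect_right(PERIOD_STARTS, year) - 1
    if idx < 0:
        raise ValueError(f"year {year} precedes the first period")
    return idx


def absolute_errors(true_years, predicted_years):
    """Per-sample absolute error in years, as used to score a regression model."""
    return [abs(t - p) for t, p in zip(true_years, predicted_years)]


# Made-up ground-truth and predicted years, for illustration only.
true_years = [1240, 1290, 1310, 1400]
predicted_years = [1255, 1285, 1330, 1340]

errors = absolute_errors(true_years, predicted_years)
print("mean error:", mean(errors), "median error:", median(errors))

# A 15-year miss (1240 vs 1255) crosses a class boundary,
# while a 20-year miss (1310 vs 1330) stays inside one class.
print(year_to_period(1240), year_to_period(1255))
print(year_to_period(1310), year_to_period(1330))
```

Under these assumed boundaries, a class label can change on a small error and stay the same on a larger one, whereas the absolute error in years grows smoothly with the size of the miss.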

D. Vasyutinsky-Shapira: The participation of Dr. Daria Vasyutinsky Shapira was funded by the European Union (ERC, MiDRASH, Project No. 101071829). Views and opinions expressed are, however, those of the author only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.

Author information

Corresponding authors

Correspondence to Boraq Madi or Nour Atamni.

A Appendix

Fig. 4. Distribution of manuscripts/pages over the years.

Fig. 5. Examples of the extracted patches. The patches are extracted with an average of 5 lines per patch.

Table 6. Mean and Median Errors for Test and Blind Test Sets

Fig. 6. The difference between the mean and median values on the blind test for each model.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Madi, B., Atamni, N., Tsitrinovich, V., Vasyutinsky-Shapira, D., El-Sana, J., Rabaev, I. (2024). Automated Dating of Medieval Manuscripts with a New Dataset. In: Mouchère, H., Zhu, A. (eds) Document Analysis and Recognition – ICDAR 2024 Workshops. ICDAR 2024. Lecture Notes in Computer Science, vol 14936. Springer, Cham. https://doi.org/10.1007/978-3-031-70642-4_8

  • DOI: https://doi.org/10.1007/978-3-031-70642-4_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70641-7

  • Online ISBN: 978-3-031-70642-4

  • eBook Packages: Computer Science (R0)
