Abstract
Automated manuscript dating is a long-awaited tool for scholars who study historical documents. This study presents a new dataset of medieval Hebrew manuscripts annotated with dates. Our initial experiments focus on documents written in the Ashkenazi square script, which lets us refine our methodology in a manageable setting before addressing more complex script types. To reflect the script’s historical evolution accurately, we adopt a novel classification scheme with time periods of varying lengths, which acknowledges the uneven development of the script over time. We experiment extensively with a variety of deep-learning models and show that regression is more appropriate than categorical classification for estimating the date of a manuscript.
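The contrast between classifying into periods of varying length and regressing a date directly can be illustrated with a minimal sketch. The period boundaries, the example years, and the helper names (`PERIOD_EDGES`, `year_to_period`, `period_midpoint`, `mean_absolute_error`) are hypothetical and are not taken from the dataset or the experiments reported here; the sketch only shows how a year maps to an unequal-width class and how much date precision is lost when a class label is converted back to a year.

```python
from bisect import bisect_right

# Hypothetical period boundaries in years (NOT from the paper).
# The unequal widths stand in for time periods of varying length.
PERIOD_EDGES = [1200, 1250, 1300, 1400, 1500]


def year_to_period(year: int) -> int:
    """Return the index of the period [edge_i, edge_{i+1}) that contains year."""
    if not PERIOD_EDGES[0] <= year < PERIOD_EDGES[-1]:
        raise ValueError(f"year {year} is outside the covered range")
    return bisect_right(PERIOD_EDGES, year) - 1


def period_midpoint(index: int) -> float:
    """Representative year of a period, used to turn a class back into a date."""
    return (PERIOD_EDGES[index] + PERIOD_EDGES[index + 1]) / 2


def mean_absolute_error(true_years, predicted_years) -> float:
    """Average absolute gap, in years, between true and predicted dates."""
    pairs = list(zip(true_years, predicted_years))
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


# Illustrative ground-truth years (made up for this example).
true_years = [1230, 1290, 1350, 1480]

# Classification target: one period label per manuscript.
class_targets = [year_to_period(y) for y in true_years]

# A classifier that picks the right period every time still only recovers
# the period, so its date estimate is the period midpoint.
class_as_years = [period_midpoint(c) for c in class_targets]
quantisation_error = mean_absolute_error(true_years, class_as_years)

# Regression target: the year itself, with no quantisation step.
regression_targets = [float(y) for y in true_years]
```

In this toy setting, wider periods contribute larger residual errors even under perfect classification, whereas a regression target keeps the full year. Whether that effect explains the results reported in this study is not something the sketch can establish.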
D. Vasyutinsky-Shapira: The participation of Dr. Daria Vasyutinsky Shapira was funded by the European Union (ERC, MiDRASH, Project No. 101071829). Views and opinions expressed are, however, those of the author only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Madi, B., Atamni, N., Tsitrinovich, V., Vasyutinsky-Shapira, D., El-Sana, J., Rabaev, I. (2024). Automated Dating of Medieval Manuscripts with a New Dataset. In: Mouchère, H., Zhu, A. (eds) Document Analysis and Recognition – ICDAR 2024 Workshops. ICDAR 2024. Lecture Notes in Computer Science, vol 14936. Springer, Cham. https://doi.org/10.1007/978-3-031-70642-4_8
DOI: https://doi.org/10.1007/978-3-031-70642-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70641-7
Online ISBN: 978-3-031-70642-4
eBook Packages: Computer Science (R0)