Abstract
Identifying scribes from historical manuscripts is a problem of interest to paleographers and pattern classification researchers alike. Although considerable research has addressed writer identification in contemporary handwriting, the problem remains challenging for historical manuscripts, primarily because documents degrade over time. This study targets scribe identification from ancient documents, using Greek handwriting on papyri as a case study. The technique segments the handwriting from the background and extracts keypoints that are likely to carry writer-specific information. Using these keypoints as centers, small fragments (patches) are extracted from the image and serve as the units of feature extraction and subsequent classification. Decisions from the fragments of an image are then combined by majority vote to produce an image-level decision. Features are learned through a two-step fine-tuning of convolutional neural networks, in which the models are first tuned on contemporary handwriting images (a relatively larger dataset) and then tuned on the small set of writing samples under study. The preliminary experimental findings are promising and indicate the potential of the proposed ideas for characterizing writers from a challenging set of writing samples.
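The patch-based decision scheme described in the abstract can be illustrated with a minimal sketch. It assumes that keypoints are already available as (row, col) coordinates and that some classifier has produced one writer label per patch. The function names `extract_patches` and `majority_vote`, the patch size, the border handling, and the tie-breaking rule are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def extract_patches(image, keypoints, size=3):
    """Cut square size x size patches centred on each keypoint.

    `image` is a 2-D list of pixel values and `keypoints` is a list of
    (row, col) tuples. `size` is assumed to be odd. Keypoints whose patch
    would cross the image border are skipped (a simplifying assumption).
    """
    half = size // 2
    height, width = len(image), len(image[0])
    patches = []
    for r, c in keypoints:
        inside = (r - half >= 0 and c - half >= 0
                  and r + half < height and c + half < width)
        if not inside:
            continue
        rows = image[r - half:r + half + 1]
        patches.append([row[c - half:c + half + 1] for row in rows])
    return patches


def majority_vote(patch_labels):
    """Combine patch-level writer labels into one image-level label.

    Ties go to the label seen first (a hypothetical rule; the source
    does not state how ties are resolved).
    """
    if not patch_labels:
        raise ValueError("no patch decisions to combine")
    counts = Counter(patch_labels)
    # max() returns the first maximal key in insertion order.
    return max(counts, key=counts.get)
```

In a full pipeline, the patches would be passed to the fine-tuned network and its per-patch predictions would be fed to `majority_vote`; here the labels are supplied directly so that the voting step can be inspected in isolation.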
Acknowledgements
The authors would like to thank Dr. Isabelle Marthot-Santaniello of the University of Basel, Switzerland, for making the dataset available.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Nasir, S., Siddiqi, I., Moetesum, M. (2021). Writer Characterization from Handwriting on Papyri Using Multi-step Feature Learning. In: Barney Smith, E.H., Pal, U. (eds) Document Analysis and Recognition – ICDAR 2021 Workshops. ICDAR 2021. Lecture Notes in Computer Science, vol 12916. Springer, Cham. https://doi.org/10.1007/978-3-030-86198-8_32
DOI: https://doi.org/10.1007/978-3-030-86198-8_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86197-1
Online ISBN: 978-3-030-86198-8
eBook Packages: Computer Science, Computer Science (R0)