An end-to-end deep learning system for medieval writer identification☆
Introduction
Paleography is the study of ancient and medieval handwriting. An important problem faced by paleographers is identifying the writers, a.k.a. scribes, who contributed to the production of a manuscript. Traditionally, paleographers perform qualitative evaluations to distinguish the writers, and in recent years these techniques have been joined by computer-based tools [1] that automatically measure quantities such as the height and width of letters, distances between characters, inclination angles, and the number and types of abbreviations. Recent approaches in digital paleography combine powerful machine learning algorithms with high-quality digital images of medieval manuscripts. However, traditional techniques require a preliminary feature engineering step that involves an expert in the field, thus increasing the application development cost.
In recent years, deep-learning-based approaches have received increasing attention from researchers thanks to their ability to handle complex and difficult image classification tasks [2]. Deep neural networks are capable of learning hierarchical feature representations directly from data, instead of using handcrafted features based on domain-specific knowledge [3]. Nonetheless, very few studies have applied deep learning techniques to the interpretation of medieval manuscripts, and previous approaches were mainly used to identify various elements of interest inside document pages rather than to recognize writers specifically.
In our previous paper [4], we presented preliminary results of a study in which deep neural networks were employed for the identification of the scribes in ancient documents. To this end, we proposed a deep transfer learning solution for row detection and page classification, obtaining very encouraging results that enabled us to extend the previous approach and develop an end-to-end system for writer recognition. The proposed approach is based on three steps intended (i) to detect the lines (a.k.a. rows) in each page of the manuscript, (ii) to classify them, and (iii) to recognize the writer of the entire page. The first step consists of a deep-learning-based object detector trained with transfer learning on a generic dataset (like MS-COCO [5]) and specialized to solve the task at hand. The second step is a row classifier composed of a fully convolutional feature extractor and a meta-architecture classifier that can be trained either from scratch or by fine-tuning. The third stage is a weighted majority vote row-decision combiner that assigns each page to a writer. We evaluated the performance of our system with several state-of-the-art deep architectures on a digitized manuscript known as the Avila Bible using only about 12.8% of the total pages for training, i.e., of the 749 images available only 96 were completely labeled and used for training.
The remainder of the paper is organized as follows. In Section 2, we report an analysis of the literature in the field; in Section 3, we detail the materials and the structure of the employed dataset; in Section 4, we illustrate the three-step model used to develop the proposed writer identification system; in Section 5, we describe the performed experiments and in Section 6 we show and discuss the results obtained; finally, Section 7 concludes the paper.
Section snippets
Related work
Methods for the analysis of ancient manuscripts can be divided into two main categories: traditional machine learning methods and deep learning techniques.
Within the first category, we can consider approaches based on the analysis of single letters or signs for writer recognition and methods based on the observation of the entire page. Regarding the former, various methodologies have been developed for the identification of the writer. In [6], the proposed algorithm uses form
Materials: The Avila Bible
We used a large dataset of high-quality digital images obtained from a giant Latin copy of the whole Bible, known as the “Avila Bible” (Madrid, Biblioteca Nacional, ms.Vitr. 15.1). It consists of 870 two-column pages handwritten in Italy within the third decade of the XII century, in which palaeographic analysis identified 12 scribal hands. The pages written by each copyist are not equally numerous (they range from 1 to 143), and there are cases (about 2% of the dataset) in which parts of the
A three-step model for writer identification
The proposed system consists of a three-step classification model: (i) an object detector that automatically detects the text lines contained in each page, (ii) a Deep Neural Network (DNN) that extracts the features and classifies each single row, and (iii) a weighted majority vote row-decision combiner for page classification.
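The row-decision combiner in step (iii) can be illustrated with a minimal Python sketch. The text here does not specify how the weights are obtained, so the sketch assumes, purely for illustration, that each row carries the confidence of the row classifier as its weight; the function name `weighted_majority_vote` and the example labels are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def weighted_majority_vote(row_predictions: Iterable[Tuple[str, float]]) -> str:
    """Combine per-row decisions into one page-level writer label.

    Each item is (writer_label, weight). Using the row classifier's
    confidence as the weight is an assumption made for this sketch.
    """
    totals = defaultdict(float)
    for writer, weight in row_predictions:
        totals[writer] += weight
    if not totals:
        raise ValueError("no rows were detected on this page")
    # Ties are broken by label order so the result is deterministic.
    return max(sorted(totals), key=lambda writer: totals[writer])


# Hypothetical rows of one page: (predicted writer, classifier confidence).
rows = [("A", 0.9), ("B", 0.6), ("A", 0.7), ("B", 0.95), ("B", 0.4)]
page_writer = weighted_majority_vote(rows)
```

With uniform weights this reduces to a plain majority vote, so the weighting only changes the outcome when rows disagree and their weights differ.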
During the test phase, the system is used in an end-to-end manner, meaning that the system receives as input the RGB image of a single page and outputs a decision corresponding to the writer
Experiments
To evaluate the effectiveness of our approach, we performed three experiments related to the different steps of the proposed model: (i) to verify how many lines in each page were correctly detected by the row detector, (ii) to evaluate the performance of the row classifier in terms of correctly classified lines, and (iii) to evaluate the performance of the whole system in page writer identification.
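For experiment (i), a detection has to be matched against a manually annotated row. The results section counts a row as correctly detected when the detected row contains more than 90% of the annotated one; the sketch below shows one way such a check could be written in Python. It assumes axis-aligned bounding boxes in `(x_min, y_min, x_max, y_max)` form and measures containment by area, neither of which is stated in the text; the names `covered_fraction` and `is_correct_detection` are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def covered_fraction(annotated: Box, detected: Box) -> float:
    """Fraction of the annotated box's area that lies inside the detected box."""
    ax0, ay0, ax1, ay1 = annotated
    dx0, dy0, dx1, dy1 = detected
    annotated_area = (ax1 - ax0) * (ay1 - ay0)
    if annotated_area <= 0:
        raise ValueError("annotated box must have positive area")
    # Width and height of the intersection, clamped to zero when disjoint.
    overlap_w = max(0.0, min(ax1, dx1) - max(ax0, dx0))
    overlap_h = max(0.0, min(ay1, dy1) - max(ay0, dy0))
    return (overlap_w * overlap_h) / annotated_area


def is_correct_detection(annotated: Box, detected: Box,
                         threshold: float = 0.9) -> bool:
    """Apply the 'more than 90% of the annotated row' rule (area-based here)."""
    return covered_fraction(annotated, detected) > threshold


# Hypothetical annotated row and two candidate detections.
annotated_row = (0.0, 0.0, 100.0, 10.0)
good_detection = (5.0, 0.0, 100.0, 10.0)   # covers 95% of the annotation
poor_detection = (20.0, 0.0, 100.0, 10.0)  # covers 80% of the annotation
```

Unlike intersection over union, this measure does not penalize a detection for being larger than the annotation, which matches a containment-style criterion.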
We trained and tested our models on 749 pages of the Avila Bible. As explained in Section 3, we
Results and discussion
Our first experiment evaluated the performance of the row detector. We tested our approach on the row-labeled dataset, and we considered a row correctly detected if the detected row contained more than 90% of the manually annotated row. The row detector obtained an accuracy of 25%, i.e., 758 of 3,031 rows were correctly detected. When applied to the page dataset, the row detector recognized 21,071 rows in the 653 test pages. To have an idea of the obtained performance, we
Conclusions
In this paper, a three-step model for writer identification in the Avila Bible was proposed. The first two steps were based on deep neural networks trained with transfer learning techniques and specialized to solve the task at hand. The third stage was based on a weighted majority vote row-decision combiner able to assign each page to a writer. The main goal was to understand the applicability of the proposed approach using a relatively small training dataset: of the 749 pages globally
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (39)
- et al. Identifying the writer of ancient inscriptions and Byzantine codices. A novel approach. Comput. Vision Image Underst. (2014)
- et al. A scalable pattern spotting system for historical documents. Pattern Recognit. (2016)
- et al. Reliable writer identification in medieval manuscripts through page layout features: the Avila Bible case. Eng. Appl. Art. Int. (2018)
- et al. Special issue on the analysis of historical documents. IJDAR (2007)
- et al. Deep learning. Nature (2015)
- et al. The effect of mammogram preprocessing on microcalcification detection with convolutional neural networks. (2017)
- et al. A two-step system based on deep transfer learning for writer identification in medieval books. Computer Analysis of Images and Patterns - Proceedings of CAIP (2019)
- et al. Microsoft COCO: Common objects in context.
- Scribe Attribution for Early Medieval Handwriting by Means of Letter Extraction and Classification and a Voting Procedure for Larger Pieces. Proceedings of the 22nd International Conference on Pattern Recognition (2014)
- Quantifying Scribal Behavior: A Novel Approach to Digital Paleography. Ph.D. thesis (2016)
- Text-independent writer identification and verification using textural and allographic features. IEEE Trans. Pattern Anal. Mach. Intell.
- Efficient word retrieval using a multiple ranking combination scheme. Int. Work. on Document Analysis Systems
- ATHENA: Automatic text height extraction for the analysis of text lines in old handwritten manuscripts. ACM J. Comput. Cult. Herit.
- Holistic word recognition for handwritten historical documents. 1st International Workshop on Document Image Analysis for Libraries
- Implementing word retrieval in handwritten documents using a small dataset. Int. Conf. on Frontiers in Handwriting Recognition
- A keyword retrieval system for historical Mongolian document images. IJDAR
- Curvelets based feature extraction of handwritten shapes for ancient manuscripts classification. Document Recognition and Retrieval XIV
- A digital palaeographic approach towards writer identification in the Dead Sea Scrolls. Int. Conf. on Patt. Rec. Appl. and Met.
- Minimizing training data for reliable writer identification in medieval manuscripts. Lecture Notes Comput. Sci.
☆ Handled by Associate Editor: G. Sanniti di Baja, Ph.D.