Elsevier

Pattern Recognition Letters

Volume 129, January 2020, Pages 137-143

An end-to-end deep learning system for medieval writer identification

https://doi.org/10.1016/j.patrec.2019.11.025

Highlights

  • End-to-end writer recognition system on Avila Bible in three steps.

  • Row detection, row classification, page classification.

  • Row detection in transfer learning with MobileNetV2.

  • Transfer learning vs. from scratch for row classification.

  • Five models trained in fine tuning and from scratch with small labelled dataset.

Abstract

This paper presents an end-to-end system to identify writers in medieval manuscripts. The proposed system consists of a three-step model for the detection and classification of lines in the manuscript and for page writer identification. The first two steps are based on deep neural networks trained with transfer learning techniques and specialized to solve the task at hand. The third stage is a weighted majority vote row-decision combiner that assigns a writer to each page. The main goal of this paper is to study the applicability of deep learning in this context when only a relatively small training dataset is available. We tested our system with several state-of-the-art deep architectures on a digitized manuscript known as the Avila Bible, using only 9.6% of the total pages for training. Our approach proves to be very effective in identifying page writers, reaching a peak accuracy of 96.48% and a peak F1 score of 96.56%.

Introduction

Paleography is the study of ancient and medieval handwriting. An important problem faced by paleographers is to identify the writers, a.k.a. scribes, who contributed to the drawing up of a manuscript. Traditionally, paleographers perform qualitative evaluations to distinguish the writers. In recent years, these techniques have been joined by computer-based tools [1] that automatically measure quantities such as the height and width of letters, distances between characters, inclination angles, and the number and types of abbreviations. Recently emerged approaches in digital paleography combine powerful machine learning algorithms with high-quality digital images of medieval manuscripts. However, traditional techniques require a preliminary feature engineering step that involves an expert in the field, thus increasing the application development cost.

In recent years, deep-learning-based approaches have received increasing attention from researchers thanks to their ability to handle complex and difficult image classification tasks [2]. Deep neural networks are capable of learning hierarchical feature representations directly from data, instead of using handcrafted features based on domain-specific knowledge [3]. Nonetheless, very few studies have applied deep learning techniques to the interpretation of medieval manuscripts, and previous approaches were mainly used to identify various elements of interest inside document pages, without a specific focus on writer recognition.

In our previous paper [4], we presented preliminary results of a study in which deep neural networks were employed to identify the scribes in ancient documents. To this end, we proposed a deep transfer learning solution for row detection and page classification. The very encouraging results led us to extend that approach and develop an end-to-end system for writer recognition. The proposed approach is based on three steps intended (i) to detect the lines (a.k.a. rows) in each page of the manuscript, (ii) to classify them, and (iii) to recognize the writer of the entire page. The first step consists of a deep-learning-based object detector pre-trained on a generic dataset (like MS-COCO [5]) and specialized via transfer learning to solve the task at hand. The second step is a row classifier composed of a fully convolutional feature extractor and a meta-architecture classifier that can be trained either from scratch or via fine-tuning. The third stage is a weighted majority vote row-decision combiner that assigns each page to a writer. We evaluated the performance of our system with several state-of-the-art deep architectures on a digitized manuscript known as the Avila Bible using only 9.6% of the total pages for training, i.e., of the 749 images available only 96 were completely labeled and used for training.

The remainder of the paper is organized as follows. In Section 2, we report an analysis of the literature in the field; in Section 3, we detail the materials and the structure of the employed dataset; in Section 4, we illustrate the three-step model used to develop the proposed writer identification system; in Section 5, we describe the performed experiments and in Section 6 we show and discuss the results obtained; finally, Section 7 concludes the paper.

Section snippets

Related work

Methods for the analysis of ancient manuscripts can be divided into two main categories: traditional machine learning methods and deep learning techniques.

Among the first category, we can consider approaches based on the analysis of single letters or signs for writer recognition and methods based on the observation of the entire page. Regarding the first, various methodologies have been developed for the identification of the writer. In [6], the proposed algorithm uses form

Materials: The Avila Bible

We used a large dataset of high-quality digital images obtained from a giant Latin copy of the whole Bible, known as the “Avila Bible” (Madrid, Biblioteca Nacional, ms. Vitr. 15.1). It consists of 870 two-column pages handwritten in Italy within the third decade of the XII century, in which palaeographic analysis has identified 12 scribal hands. The pages written by each copyist are not equally numerous (they range from 1 to 143), and there are cases (about 2% of the dataset) in which parts of the

A three-step model for writer identification

The proposed system consists of a three-step classification model: (i) an object detector that automatically detects the text lines contained in each page, (ii) a Deep Neural Network (DNN) that extracts the features and classifies each single row, and (iii) a weighted majority vote row-decision combiner for page classification.
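The third step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `classify_page` is hypothetical, and the choice of weight (here assumed to be the row classifier's confidence in its predicted label) is an assumption, since this excerpt does not specify how the votes are weighted.

```python
from collections import defaultdict


def classify_page(row_predictions):
    """Assign a writer to a page from its per-row predictions.

    row_predictions: iterable of (writer_label, weight) pairs, one per
    detected row. The weight is ASSUMED to be the row classifier's
    confidence in its predicted label; the source does not specify it.
    """
    totals = defaultdict(float)
    for writer, weight in row_predictions:
        totals[writer] += weight  # accumulate the weighted votes per writer
    if not totals:
        raise ValueError("page has no classified rows")
    # The writer with the largest accumulated weight wins; on a tie,
    # the label that sorts first is returned so the result is deterministic.
    return max(sorted(totals), key=totals.get)


# Two low-confidence rows for "A" are outvoted by one confident row for "B".
rows = [("A", 0.5), ("B", 0.95), ("A", 0.3)]
print(classify_page(rows))
```

With equal weights for every row, the same function reduces to a plain majority vote, which makes the effect of the weighting easy to compare.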

During the test phase, the system is used in an end-to-end manner, meaning that the system receives as input the RGB image of a single page and outputs a decision corresponding to the writer

Experiments

To evaluate the effectiveness of our approach, we performed three experiments related to the different steps of the proposed model: (i) to verify how many lines in each page were correctly detected by the row detector, (ii) to evaluate the performance of the row classifier in terms of correctly classified lines, and (iii) to evaluate the performance of the whole system in page writer identification.
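The page-level evaluation is reported in terms of accuracy and F1 score. The following sketch shows how these two quantities can be computed from true and predicted writer labels. The function names are hypothetical, and macro-averaging of the per-writer F1 scores is an assumption: this excerpt does not state which averaging scheme was used.

```python
def accuracy(y_true, y_pred):
    """Fraction of pages whose predicted writer matches the true writer."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-writer F1 scores (ASSUMED averaging scheme)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with two writers and four pages.
true_writers = ["A", "A", "B", "B"]
predicted = ["A", "B", "B", "B"]
print(accuracy(true_writers, predicted), macro_f1(true_writers, predicted))
```

Because the number of pages per writer is highly unbalanced in this dataset, a macro-averaged F1 would weigh rare writers as much as frequent ones, whereas accuracy is dominated by the frequent ones.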

We trained and tested our models on 749 pages of the Avila Bible. As explained in Section 3, we

Results and discussion

Our first experiment evaluated the performance of the row detector. We tested our approach on the row-labeled dataset and considered a row correctly detected if the detected row contained more than 90% of the manually annotated row. The row detector obtained an accuracy of 25%, i.e., 758 of 3,031 rows were correctly detected. When applied to the page dataset, the row detector recognized 21,071 rows in the 653 test pages. To have an idea of the obtained performance, we
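The detection criterion stated in this section (a detected row counts as correct when it contains more than 90% of the manually annotated row) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: rows are assumed to be axis-aligned bounding boxes, "contains" is interpreted as the fraction of the annotated box area covered by the detected box, and the function names are hypothetical.

```python
def coverage(detected, annotated):
    """Fraction of the annotated box area that lies inside the detected box.

    Boxes are (x_min, y_min, x_max, y_max) tuples in pixel coordinates
    (an ASSUMED representation; the source does not describe the format).
    """
    dx0, dy0, dx1, dy1 = detected
    ax0, ay0, ax1, ay1 = annotated
    annotated_area = (ax1 - ax0) * (ay1 - ay0)
    if annotated_area <= 0:
        raise ValueError("annotated box must have positive area")
    # Width and height of the intersection rectangle (0 if boxes are disjoint).
    inter_w = max(0, min(ax1, dx1) - max(ax0, dx0))
    inter_h = max(0, min(ay1, dy1) - max(ay0, dy0))
    return inter_w * inter_h / annotated_area


def is_correct_detection(detected, annotated, threshold=0.9):
    """True when the detected box covers strictly more than `threshold`."""
    return coverage(detected, annotated) > threshold


annotated_row = (0, 0, 100, 10)
print(is_correct_detection((0, 0, 95, 10), annotated_row))   # covers 95%
print(is_correct_detection((0, 0, 90, 10), annotated_row))   # covers exactly 90%
```

Unlike intersection-over-union, this measure is asymmetric: a detected box that is larger than the annotation is not penalized, as long as it covers enough of the annotated row.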

Conclusions

In this paper, a three-step model for writer identification in the Avila Bible was proposed. The first two steps were based on deep neural networks trained with transfer learning techniques and specialized to solve the task at hand. The third stage was based on a weighted majority vote row-decision combiner able to assign each page to a writer. The main goal was to understand the applicability of the proposed approach using a relatively small training dataset: of the 749 pages globally

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (39)

  • M. Bulacu et al.

    Text-independent writer identification and verification using textural and allographic features

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2007)
  • G. Louloudis et al.

    Efficient word retrieval using a multiple ranking combination scheme

    Int. Work. on Document Analysis Systems

    (2012)
  • R. Pintus et al.

    ATHENA: Automatic text height extraction for the analysis of text lines in old handwritten manuscripts

    ACM J. Comput. Cult. Herit.

    (2015)
  • V. Lavrenko et al.

    Holistic word recognition for handwritten historical documents

    1st International Workshop on Document Image Analysis for Libraries

    (2004)
  • Y. Liang et al.

    Implementing word retrieval in handwritten documents using a small dataset

    Int. Conf. on Frontiers in Handwriting Recognition

    (2012)
  • H. Wei et al.

    A keyword retrieval system for historical Mongolian document images

    IJDAR

    (2014)
  • G. Joutel et al.

    Curvelets based feature extraction of handwritten shapes for ancient manuscripts classification

    Document Recognition and Retrieval XIV

    (2007)
  • M.A. Dhali et al.

    A digital palaeographic approach towards writer identification in the Dead Sea Scrolls

    Int. Conf. on Patt. Rec. Appl. and Met.

    (2017)
  • N. Cilia et al.

    Minimizing training data for reliable writer identification in medieval manuscripts

    Lecture Notes Comput. Sci.

    (2019)

Handled by Associate Editor: G. Sanniti di Baja, Ph.D.