1 Introduction

In legal disputes, the authenticity of documents presented as evidence may be questioned in court. The problem is greater for handwritten documents, since frauds and forgeries are easier to attempt: no advanced technology is required. In most cases, a writer and a pen are sufficient to commit a fraud or a forgery.

Currently, forensic handwriting identification is performed by experts using optical devices and/or chemical methods. According to Sheikholeslami [1], the manual process of feature extraction and observation is tedious and may leave doubts about the identity of the writer. In addition, different Forensic Document Examiners (FDEs) may extract the same features from a document in different ways. Semi-automatic identification systems can therefore be helpful to experts.

According to Sreeraj and Idicula [2], automatic handwriting-based writer identification is an active research area. As one of the most difficult problems in digital image processing and pattern recognition, it involves several subproblems, such as designing algorithms to identify the handwriting of different individuals, identifying and representing relevant handwriting features, and evaluating the performance of automatic methods.

Although different approaches have been presented in works such as [3,4,5,6,7], the principal difference between them is the feature set used to represent the handwriting. In this work, we present a baseline system for automatic handwriting identification based only on graphometric features, i.e., on the same principles used by FDEs during their analysis. Initially, a set of 12 features was defined and its extraction process developed. To evaluate the efficiency of these features, a selection process was applied, and a smaller group of only 4 features presented the best writer identification rates in the experiments performed.

It is worth mentioning that in a previous work [8] the feature selection process was applied to a group of 8 features. In this work, we extend this group with features that capture the loop habits of the writer, and better results were obtained.

In addition, experiments were conducted to determine the number of writers required to validate the baseline system. All experiments were performed considering TOP1, TOP5 and TOP10 choices. In these classifications, the baseline system returns a group of possible writers of a questioned document (one for TOP1, five for TOP5 and ten for TOP10), and the identification counts as correct when the true writer is present in this group.

The paper is organized as follows. Section 2 presents the principles of forensic handwriting analysis used to define the proposed baseline system. Section 3 summarizes the baseline system, including the feature set defined and the feature selection process used to obtain the best feature set. Section 4 presents the experimental results and a brief discussion of them. Finally, Sect. 5 provides some concluding remarks and indicates future investigations.

2 Forensic Handwriting Analysis

This work presents a baseline system for automatic handwriting identification based only on forensic features. This section therefore discusses forensic handwriting analysis.

2.1 Forensic Principles and Concepts

According to Morris [9], forensic handwriting identification is part of criminology, and its analysis covers a great number of elements that affect a person’s writing. This area also recognizes the relevance of writing systems and how they influence writers from childhood to graphic maturity.

According to Schomaker [10], contrary to biometrics with a purely physical or biophysical basis, the biometric analysis of handwriting requires a very broad knowledge at multiple levels of observation. For the identification of a writer in a large collection of known samples of handwriting, multi-level knowledge must be considered. In forensic practice, many aspects are considered, ranging from the physics of ink deposition [11] to knowledge on the cultural influences in a population [12].

Bensefia [4] points out that each writer can be characterized by his or her own handwriting, through the reproduction of details and unconscious practices. Handwriting identification is based on the principle that there are individual features that distinguish one person’s writing from that of another.

According to Bensefia [4], the writer identification task concerns the retrieval of handwritten samples from a database using the handwritten sample under study as a graphical query. It provides a subset of relevant candidate documents, on which the expert then carries out complementary analysis. The writer verification task, in contrast, takes two handwriting samples and determines whether or not they were written by the same writer.

2.2 Graphometry and Other Approaches

According to Sreeraj and Idicula [2], feature extraction approaches for writer identification can be divided into global (extracted from paragraphs, lines, or just pieces of the text image) and local (extracted from characters and words).

Different approaches for handwriting identification have been presented in the literature. Many of them apply features extracted from the document image, such as texture approaches [13,14,15,16,17] or codebook approaches [6, 18]. These features are not considered graphometric, because they rely on complex computational transformations and procedures applied to the document image and do not follow the principles used by FDEs. The approach presented in this work uses only graphometric features, as presented in [19,20,21,22,23]. These features are those observed by FDEs during their analyses.

3 Forensic Handwriting Identification Based on Graphometry

In this work, we propose a baseline system for handwriting-based writer identification. To conduct the experiments that validate the system, we use documents from 200 different writers from the Brazilian Forensic Letter Database [24]. This text-dependent database is composed of three copies of the same letter for each writer.

During the first stage (training), a model must be built for each writer randomly selected from the forensic database. Two letters from each writer were used in this stage. In the second stage (testing), the baseline system compares the third letter of each writer against the models established in the training stage. The next sections describe the preprocessing, feature extraction, classification and feature selection steps.
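The two-stage protocol can be illustrated with a minimal sketch. The function name `split_letters`, the writer identifiers and the letter identifiers are hypothetical placeholders, and the fixed random seed is an assumption made only for reproducibility; the text does not specify how the random selection was implemented.

```python
import random

def split_letters(letters_by_writer, n_writers, seed=0):
    """Pick n_writers at random; letters 1 and 2 train, letter 3 tests."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(letters_by_writer), n_writers)
    train, test = [], []
    for writer in chosen:
        first, second, third = letters_by_writer[writer]
        train += [(writer, first), (writer, second)]
        test.append((writer, third))
    return train, test

# Toy stand-in for the database: 200 writers, three letter identifiers each.
database = {
    f"w{i:03d}": [f"w{i:03d}_letter{k}" for k in (1, 2, 3)]
    for i in range(200)
}
train, test = split_letters(database, n_writers=200)
print(len(train), len(test))
```

With 200 writers this yields 400 training letters and 200 test letters, i.e., 600 letters in total.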

3.1 Preprocessing

The preprocessing consists of five tasks: thresholding, which converts the 256-level grayscale image into a binary image using the Otsu algorithm; line segmentation, which locates and isolates the text lines in the forensic letter; word segmentation, which separates the words of each line for further processing; contour extraction, in which the stroke contours are obtained by applying morphological filters; and document image segmentation, which splits the image into 24 segments (6 × 4).
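Two of these tasks, thresholding and document image segmentation, can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used in the experiments: images are assumed to be lists of rows of integer gray levels (0 to 255), dark pixels are assumed to be ink, and the 6 × 4 grid is assumed to mean 6 rows by 4 columns, which the text does not state. The function names are hypothetical.

```python
def otsu_threshold(gray):
    """Return the gray level that maximizes the between-class variance."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_low, sum_low = 0, 0
    for t in range(256):
        w_low += hist[t]          # pixels with level <= t
        if w_low == 0:
            continue
        w_high = total - w_low    # pixels with level > t
        if w_high == 0:
            break
        sum_low += t * hist[t]
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        between = w_low * w_high * (mean_low - mean_high) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

def binarize(gray):
    """Map dark pixels (assumed to be ink) to 1 and background to 0."""
    t = otsu_threshold(gray)
    return [[1 if value <= t else 0 for value in row] for row in gray]

def split_grid(image, n_rows=6, n_cols=4):
    """Split an image into n_rows * n_cols rectangular segments."""
    height, width = len(image), len(image[0])
    cells = []
    for r in range(n_rows):
        top, bottom = r * height // n_rows, (r + 1) * height // n_rows
        for c in range(n_cols):
            left, right = c * width // n_cols, (c + 1) * width // n_cols
            cells.append([row[left:right] for row in image[top:bottom]])
    return cells

# Toy 12 x 8 image: light background (200) with one dark row (10).
gray = [[200] * 8 for _ in range(12)]
gray[3] = [10] * 8
binary = binarize(gray)
segments = split_grid(binary)
print(len(segments))
```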

3.2 Feature Extraction

Based on the study of graphometry, the feature set used for forensic handwriting identification in this work comprises: relative placement habits (f1, f3, f4, f5, f6), relative relationship between individual word heights (f2 and f7), axial slant (f8) and relative loop habits (f9, f10, f11, f12), as presented in Table 1.

Table 1. Feature description

An important feature related to handwriting individuality is relative placement habits [9]. Writers may make fuller use of the paper sheet and write up to its physical limits.

Another important feature is related to the size of the first word of each handwriting line. To compute this feature, the first word of each line was bounded by a box, and the box height and the proportion of black pixels were computed.
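The bounding-box measurement can be sketched as below, assuming the first word of a line is already available as a binary image (a list of rows, 1 for ink and 0 for background). The function name `word_box_features` is hypothetical, and taking the proportion of black pixels relative to the area of the tight bounding box is an assumption.

```python
def word_box_features(binary):
    """Return (box height, proportion of ink pixels inside the tight box)."""
    rows = [i for i, row in enumerate(binary) if any(row)]
    if not rows:
        return 0, 0.0  # empty image: no word found
    cols = [j for j in range(len(binary[0])) if any(row[j] for row in binary)]
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    height = bottom - top + 1
    width = right - left + 1
    ink = sum(sum(row[left:right + 1]) for row in binary[top:bottom + 1])
    return height, ink / (height * width)

# Toy word image: the ink occupies a 3 x 3 box with 6 ink pixels.
word = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
height, black_ratio = word_box_features(word)
print(height, round(black_ratio, 3))
```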

The axial slant is a graphometric feature extensively used in approaches to automatic writer identification. In fact, it represents the general angle of the handwriting and has the best individual performance in the baseline system.

The relative loop habits are a set of graphometric features extracted from words and characters. These features present information about the upward and downward loops of the words (height, width, number of pixels and axial slant).

Figures 1 and 2 present an overview of the extraction process from a letter image of the Brazilian Forensic Letter Database [24]. The result of the extraction process is a vector containing 85 primitives (as can be observed in Table 1).

Fig. 1. Feature extraction f1 − f7 [8]

Fig. 2. Feature extraction f8 − f12

This vector is submitted to the SVM classifier in the training and testing stages. All features were normalized to improve the classification process.
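The normalization method is not specified here. As one possible illustration, the sketch below uses min-max scaling, with the ranges estimated on the training vectors and reused for the test vectors; both the choice of min-max scaling and the function names are assumptions.

```python
def fit_min_max(vectors):
    """Compute the per-feature minimum and maximum over training vectors."""
    columns = list(zip(*vectors))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(vector, lows, highs):
    """Scale each feature to [0, 1] using the training ranges."""
    return [
        (x - lo) / (hi - lo) if hi > lo else 0.0  # constant feature -> 0
        for x, lo, hi in zip(vector, lows, highs)
    ]

# Toy training vectors with two primitives each (the real vectors have 85).
train_vectors = [[1.0, 10.0], [3.0, 30.0], [2.0, 10.0]]
lows, highs = fit_min_max(train_vectors)
scaled = apply_min_max([2.0, 20.0], lows, highs)
print(scaled)
```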

3.3 Classification

The classification task consists of submitting the vectors of primitives extracted from the forensic letters to the SVM classifier. We selected the SVM classifier based on the literature and on tests with other classifiers. In this stage, the questioned document (forensic letter) is compared with the models generated for each writer (all-against-all), and a confusion matrix is generated as a result. This matrix contains the probability of each writer being the author of the questioned document. These probabilities make it possible to identify not only the top-ranked candidate (TOP1), but also the five and ten most likely candidates (TOP5 and TOP10, respectively) to be the author of the questioned document.
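Given the per-writer probabilities for each questioned document, the TOP-k ranking and the corresponding identification rate can be sketched as follows. The function names and the toy probabilities are hypothetical; the SVM that would produce the probabilities is not modelled here.

```python
def top_k_writers(scores, k):
    """Return the k writers with the highest probability, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

def top_k_rate(all_scores, true_writers, k):
    """Fraction of documents whose true writer is among the k candidates."""
    hits = sum(
        truth in top_k_writers(scores, k)
        for scores, truth in zip(all_scores, true_writers)
    )
    return hits / len(true_writers)

# Toy example: two questioned documents and three enrolled writers.
all_scores = [
    {"A": 0.6, "B": 0.3, "C": 0.1},
    {"A": 0.5, "B": 0.2, "C": 0.3},
]
true_writers = ["A", "C"]
print(top_k_rate(all_scores, true_writers, k=1))
print(top_k_rate(all_scores, true_writers, k=2))
```

In this toy case the second document is missed at k = 1 but recovered at k = 2, which is the effect that the TOP5 and TOP10 choices are meant to capture.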

3.4 Feature Selection

In order to validate our feature set, a feature selection process was applied to the entire set (f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12), and a group composed only of features f1, f6, f8 and f12 presented the best writer identification rates. A selection process was also reported in [8], in which the initial group was composed of features (f1, f2, f3, f4, f5, f6, f7, f8) and the selected group was composed of (f1, f6, f8).

According to Dy and Brodley [25], feature selection is a process that selects a subset of the original features. A general feature selection process comprises four steps: subset generation, subset evaluation, stopping criterion, and result validation.
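The first three steps can be illustrated with a sequential forward search, the strategy used in the experiments of Sect. 4. The sketch below is generic: `evaluate` is a hypothetical scoring function supplied by the caller, the toy additive gains are invented for illustration only, and the dependency-based criterion of the experiments is not reproduced. Result validation, the fourth step, is left to the caller.

```python
def sequential_forward_selection(features, evaluate, max_size=None):
    """Greedily add the feature that most improves evaluate(subset)."""
    selected, best_score = [], float("-inf")
    remaining = list(features)
    while remaining and (max_size is None or len(selected) < max_size):
        # Subset generation and evaluation: try each remaining feature.
        scored = [(evaluate(selected + [f]), f) for f in remaining]
        score, feature = max(scored, key=lambda pair: pair[0])
        # Stopping criterion: no candidate improves the current score.
        if score <= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score

# Toy criterion: each feature contributes a fixed, made-up gain.
gains = {"a": 0.4, "b": -0.1, "c": 0.2, "d": -0.05}

def toy_evaluate(subset):
    return sum(gains[f] for f in subset)

subset, score = sequential_forward_selection(gains, toy_evaluate)
print(subset)
```

The search keeps "a" and "c" and stops when adding either remaining feature would lower the score.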

4 Experimental Results and Discussion

To validate the baseline system, experiments were performed with two goals: analyzing the group of features resulting from the feature selection process, and finding the number of writers beyond which the accuracy of the system is no longer significantly affected.

In the first experiment, the feature selection process was used to find the best group of features, as described in Sect. 3.4. Using a sequential forward search and an evaluation criterion based on dependency, the goodness set (GS) obtained was composed of features f1, f6, f8 and f12. To verify that the feature set resulting from the selection process was good, other empirically defined feature sets were also evaluated (Table 2). TOP5 and TOP10 match classifications were also computed (as shown in Table 3), reaching writer identification rates close to 100%.

Table 2. Comparison between feature set and GS
Table 3. TOP1, TOP5 and TOP10 match classification for GS

It is important to highlight that with TOP5 and TOP10 match classification FDEs gain productivity, since the number of handwriting samples that must be manually analyzed is reduced (to 5 or 10).

As mentioned before, another group of experiments was conducted to determine the number of writers at which the baseline system stabilizes. To this end, writers randomly selected from the Brazilian Forensic Letter Database [24] were added to the group of writers evaluated in the baseline system, from 40 to 300 writers. The experiments were performed with all the features, with the best group of features (GS) and with the ensemble of features (Table 2), and the writer identification performance was computed.

It can be observed that the relation between the number of writers and the accuracy gradually stabilizes, and with 200 writers the results are maintained. It is important to note that using 200 different writers means 400 letters in the training stage and 200 letters in the test stage, totaling 600 letters. Furthermore, the writer identification rate with the largest group (200 writers) was 71%.

Table 4 presents a brief comparison between the results obtained with the baseline system and others in the literature. Considering the number of writers and the accuracy, our results are promising: with 160 writers our identification rate is 76%, while another work with a similar sample size [22] reports 58%.

Table 4. Recent studies comparison.

5 Conclusion and Future Works

This paper discussed the efficiency of a graphometric feature set that can be applied to writer identification. First, we described the main features based on graphometric principles and the research related to them. We then presented the baseline system. We have shown, based on experimental results and a feature selection process, that these features achieve promising results for forensic handwriting analysis. Results improved in TOP1 classification when the GS was applied, and they were comparable to others in the literature (Table 4) that consider graphometric features. Considering TOP5 and TOP10 classifications, the writer identification rates achieved were close to 100%. It is important to highlight the productivity gain obtained for forensic handwriting analysis when the number of handwriting samples that must be manually analyzed is reduced (to 5 or 10).

In addition, experiments were conducted to determine the number of writers at which the baseline system performance stabilizes, and with 200 different writers no significant gain or loss was perceived in the results. As future work, new features will be studied and included in the baseline system in an attempt to improve the results, and tests with other classifiers will be carried out.