
1 Introduction

In recent years, many industrial systems have proven more stable and efficient when a high-performance quality evaluation step is used as pre-processing, and license plate image recognition is such a case. The license plate recognition system (LPRS), which is developing rapidly, is one of the most popular research directions. An LPRS is generally divided into the following stages [1]: license plate location, license plate segmentation, and license plate recognition (LPR), the last being the main part of the system. At present, many LPR algorithms incorporate neural networks, and these algorithms place very high demands on the training samples [2,3,4]. If the license plate training images could be classified by clarity, so that a different network model is trained for each class, the final accuracy of the LPR algorithm could be improved. Based on this, we propose a classification algorithm that divides license plate images into two categories, high clarity and low clarity, to assist the LPRS.

2 Related Work

The clarity classification of license plate images is a brand-new problem in the study of no-reference image quality assessment (NRIQA), with no prior research papers. Compared with general NRIQA algorithms, the clarity classification algorithm for license plate images does not predict a quality score; instead, it classifies images by clarity according to industrial demand. However, both tasks essentially require features that accurately describe image quality. Therefore, recent studies on NRIQA algorithms can give us a lot of help.

Early NRIQA algorithms generally assume that the types of distortion affecting image quality are known [5,6,7,8,9,10,11,12]. Based on the presumed distortion types, these approaches extract distortion-specific features to predict quality. In reality, however, there are many more types of image distortion, and distortions may interact with each other. This assumption therefore limits the application of these methods.

Most recent studies on NRIQA adopt a similar architecture. In the training stage, feature vectors are extracted from distorted images with associated subjective evaluations [13], and a regression model is learned to map these feature vectors to subjective human scores. In the test stage, feature vectors are extracted from test images and fed into the regression model to predict their quality [14,15,16,17,18]. Our strategy for classifying the quality of license plate images is almost the same, except that a classification model replaces the regression model.
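The pipeline above can be sketched as follows; this is a minimal illustration, not the paper's implementation, and `extract_features` is a hypothetical placeholder for any quality-aware feature extractor.

```python
import numpy as np
from sklearn.svm import SVC

def extract_features(image):
    # Placeholder feature extractor: any quality-aware statistics
    # (NSS, DCT statistics, reconstruction errors, ...) could go here.
    return np.array([image.mean(), image.std()])

rng = np.random.default_rng(0)
train_images = [rng.random((20, 60)) for _ in range(40)]   # synthetic stand-ins
train_labels = [i % 2 for i in range(40)]                  # 0 = low, 1 = high clarity

# Training stage: extract features, then fit the model.
X = np.array([extract_features(im) for im in train_images])
clf = SVC(kernel="rbf").fit(X, train_labels)               # classification model

# Test stage: same features, then predict. For score prediction,
# a regressor (e.g. sklearn.svm.SVR) would replace the classifier.
test_image = rng.random((20, 60))
pred = clf.predict(extract_features(test_image).reshape(1, -1))
```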

Moorthy and Bovik proposed a two-step BIQA framework called BIQI [14]. Scene statistics extracted from a given distorted image are first used to decide which distortion type the image belongs to; the same statistics are then used to evaluate quality under that distortion type. Using the same strategy, Moorthy and Bovik proposed DIIVINE, which uses a richer set of natural scene features [15]. However, quite unlike practical applications, both BIQI and DIIVINE assume that the distortion type of the test image is represented in the training data set.

Saad et al. assumed that the statistics of DCT features vary in a predictable manner as image quality changes [16]. On this basis, a probabilistic model called BLIINDS is trained with contrast and structural features extracted in the DCT domain. BLIINDS was extended to BLIINDS-II by using more sophisticated NSS-based DCT features [17]. Another approach extracts not only DCT features but also wavelet and curvelet features [18]. Although this model achieves reasonable results across several distortion types, it still cannot handle every type.

As there is a big difference between general natural images and license plate (LP) images, general NRIQA methods that pursue universality cannot be applied to LP images directly. Statistical information of an LP image, such as DCT-domain, wavelet-domain, and gradient statistics, is affected not only by the distortion but also by the LP characters themselves. This makes it difficult for general NRIQA algorithms to work on LP images. However, LP images have one advantage: they contain nothing but the license plate characters. This very useful prior information can help us solve the clarity classification problem for LP images. Based on this, we can build sufficiently large over-complete dictionaries to represent each type of LP image, and classify LP images by extracting their reconstruction errors under the different dictionaries as effective features.

3 Framework of License Plate Image Classification

Figure 1 shows six gray-scale LP images of uniform size. Even for images whose difference in clarity is obvious, their DCT transforms follow no discernible rules, and their gradient statistics are mixed together indistinguishably. As a result, the vast majority of known NRIQA algorithms cannot extract features that effectively describe LP image quality.

Fig. 1. Different clarity LP images and their DCT transforms: (a) (b) (c) are high-clarity images, (d) (e) (f) are low-clarity images, (g) is their gradient histogram

Here, we propose a sparse-representation-based method to extract an appropriate feature vector, which is then fed into a suitable classification model.

3.1 Classification Principle

Just as general NRIQA algorithms use human subjective scores as the final output, the clarity classification principle for LP images also needs to be defined manually. In this algorithm, LP images are divided into two categories:

  1. High-clarity LP image: an LP image whose last five characters can all be recognized by a human.

  2. Low-clarity LP image: an LP image whose last five characters cannot all be recognized by a human.

Whether all of the last five characters can be recognized is one of the most intuitive manifestations of human visual assessment. It is worth noting that we regard the high/low clarity binary classification as a prerequisite for faithfully reflecting the subjective evaluation of image quality. Likewise, if the extracted feature can accurately distinguish between high- and low-clarity images, it can also be used to describe an image's quality score. The classification principle we propose is therefore not unique; LP images could be divided into different categories under different principles.

3.2 The Algorithm of Feature Extraction

Sparse-representation-based algorithms have been widely applied to computer vision and image processing, such as image denoising [19] and face recognition [20], most famously the sparse representation-based classifier (SRC) [21]. We follow this idea and describe our method in detail below.

Given sufficient training samples of the high-clarity class, \( A_{1} = \left[ {v_{1,1} , \;v_{1,2} , \ldots , \;v_{{1,n_{1} }} } \right] \; \in \;R^{{m*n_{1} }} \), and of the low-clarity class, \( A_{2} = \left[ {v_{2,1} , \;v_{2,2} , \ldots ,\; v_{{2,n_{2} }} } \right] \; \in \;R^{{m*n_{2} }} \), a test sample \( y \; \in \;R^{m} \) from either class approximately lies in the subspace spanned by the training samples of that class:

$$ y = a_{1,1} *v_{1,1} + a_{1,2} *v_{1,2} + \ldots + a_{{1,n_{1} }} *v_{{1,n_{1} }} $$
(1)
$$ y = a_{2,1} *v_{2,1} + a_{2,2} *v_{2,2} + \ldots + a_{{2,n_{2} }} *v_{{2,n_{2} }} $$
(2)

where each \( a_{i,j} \; \in \;{\mathbb{R}},\;j = 1,\;2, \ldots n_{i} \) is a scalar.

Then we form a dictionary \( {\mathcal{D}} \) by grouping all the samples from both classes.

$$ {\mathcal{D}} = \text{ }\left[ {A_{1} , \;A_{2} } \right] = \left[ {v_{1,1} , \;v_{1,2} , \ldots v_{{1,n_{1} }} ,\;v_{2,1} , \ldots ,\;v_{{2,n_{2} }} } \right] $$
(3)

and the linear representation of \( y \) can be written as:

$$ y = {\mathcal{D}} *x_{0} = a_{1,1} *v_{1,1} + a_{1,2} *v_{1,2} + \ldots + a_{{2,n_{2} }} *v_{{2,n_{2} }} $$
(4)

here \( x_{0} \) is a coefficient vector whose entries are all zero except those associated with one of the two classes.

To determine the clarity class of a test LP image, we reconstruct it with the coefficients of each class and extract the reconstruction errors, where \( \alpha_{0} \left( i \right) \) denotes the coefficient vector with all entries not associated with class \( i \) set to zero:

$$ y_{1} = D*\alpha_{0} \left( 1 \right) $$
(5)
$$ y_{2} = D*\alpha_{0} \left( 2 \right) $$
(6)
$$ e\left( 1 \right) = \left\| {y_{test} - y_{1} } \right\|_{2} $$
(7)
$$ e\left( 2 \right) = \left\| {y_{test} - y_{2} } \right\|_{2} $$
(8)

Comparing the two errors, the class with the smaller error is taken as the clarity class of the test image.
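The decision rule of Eqs. 5–8 can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionaries are random stand-ins for real training samples, and the sparse code is obtained with orthogonal matching pursuit.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

m, n1, n2 = 64, 30, 30
rng = np.random.default_rng(0)
A1 = rng.standard_normal((m, n1))          # stand-in high-clarity samples
A2 = rng.standard_normal((m, n2))          # stand-in low-clarity samples
D = np.hstack([A1, A2])                    # D = [A1, A2], Eq. (3)
y = rng.standard_normal(m)                 # stand-in test sample

# Sparse code x0 of y over D (Eq. (4)), via orthogonal matching pursuit.
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=5, fit_intercept=False).fit(D, y)
x0 = omp.coef_

# Keep only the coefficients of each class, reconstruct, and compare errors.
x1 = np.zeros_like(x0); x1[:n1] = x0[:n1]  # class-1 coefficients only
x2 = np.zeros_like(x0); x2[n1:] = x0[n1:]  # class-2 coefficients only
e1 = np.linalg.norm(y - D @ x1)            # Eq. (7)
e2 = np.linalg.norm(y - D @ x2)            # Eq. (8)
label = "high-clarity" if e1 < e2 else "low-clarity"
```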

To learn the over-complete dictionary, the K-SVD algorithm [22] is used to solve the following optimization problem:

$$ \left\langle {D,\alpha } \right\rangle = \mathop {\text{argmin}}\nolimits_{D,\alpha } \left\| {Y - D*\alpha } \right\|_{2} \,\, s.t. \left\| \alpha \right\|_{0} \le L $$
(9)
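A dictionary-learning step of this form can be sketched in Python. Note that K-SVD itself is not available in scikit-learn; as a stand-in, `MiniBatchDictionaryLearning` performs a similar alternating optimization, and coding with OMP under `transform_n_nonzero_coefs=L` enforces the \( \left\| \alpha \right\|_{0} \le L \) constraint.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
Y = rng.standard_normal((200, 64))   # 200 vectorized patches as rows (stand-ins)
L = 2                                # sparsity level, as used in the paper

dico = MiniBatchDictionaryLearning(
    n_components=128,                # 128 atoms > 64 dimensions: over-complete
    transform_algorithm="omp",       # L0-constrained sparse coding step
    transform_n_nonzero_coefs=L,
    random_state=0,
)
D = dico.fit(Y).components_          # learned dictionary, shape (128, 64)
alpha = dico.transform(Y)            # sparse codes, shape (200, 128)
train_err = np.linalg.norm(Y - alpha @ D)   # training reconstruction error
```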

However, the hypothesis of the SRC algorithm is too strong for it to classify LP image clarity directly. One prerequisite of SRC is that the subspaces of the different classes are comparable. To verify this, we build over-complete dictionaries \( D_{1} \) and \( D_{2} \) with the same parameters for the high- and low-clarity LP training samples:

$$ \left\langle {D_{1} ,\alpha_{1} } \right\rangle = \mathop {\text{argmin}}\nolimits_{{D_{1} ,\alpha_{1} }} \left\| {Y_{1} - D_{1} *\alpha_{1} } \right\|_{2} \,\,s.t.\left\| { \alpha_{1} } \right\|_{0} \le L $$
(10)
$$ \left\langle {D_{2} ,\alpha_{2} } \right\rangle = \mathop {\text{argmin}}\nolimits_{{D_{2} ,\alpha_{2} }} \left\| {Y_{2} - D_{2} *\alpha_{2} } \right\|_{2} \,\, s.t. \left\| {\alpha_{2} } \right\|_{0} \le L $$
(11)

In the process of learning the dictionaries, we found that for every image patch size, the training error of the dictionary learned from high-clarity samples, \( E_{1} = \left\| {Y_{1} - D_{1} *\alpha_{1,training} } \right\|_{2} \), is always larger than the error \( E_{2} \) of the low-clarity dictionary, as shown in Fig. 2.

Fig. 2. Training error of different clarity LP images.

This means that high-clarity LP images carry more information and cannot be represented as completely as low-clarity LP images, which carry less information. Although frequency-domain features reflecting the information content of an LP image cannot be extracted effectively, the reconstruction error can likewise represent, indirectly, the amount of information in the LP image.

The experimental results show that high-clarity LP images require more dictionary atoms to represent, and that a dictionary's reconstruction error is related to the amount of information contained in the image. Although the SRC approach of directly comparing reconstruction errors is not effective here, the reconstruction errors can still be extracted as statistical features of LP image quality.

So we build two over-complete dictionaries from high- and low-clarity LP images with the same parameters, and obtain the reconstruction errors \( e\left( 1 \right) \) and \( e\left( 2 \right) \) by Eqs. 5–8 as a two-dimensional feature vector for each LP image.
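The feature extraction step can be sketched as follows: each test image is coded against both dictionaries, and the pair of reconstruction errors forms the feature vector. The dictionaries below are random placeholders for those learned as in Eqs. 10–11.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def reconstruction_error(D, y, L=2):
    """Residual norm of the L-sparse OMP approximation of y over dictionary D."""
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=L, fit_intercept=False).fit(D, y)
    return np.linalg.norm(y - D @ omp.coef_)

rng = np.random.default_rng(1)
D1 = rng.standard_normal((64, 128))   # placeholder high-clarity dictionary
D2 = rng.standard_normal((64, 128))   # placeholder low-clarity dictionary
y = rng.standard_normal(64)           # placeholder vectorized test patch

# Two-dimensional feature vector [e(1), e(2)] for the classifier.
feature = np.array([reconstruction_error(D1, y),
                    reconstruction_error(D2, y)])
```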

In this algorithm, we use a support vector machine (SVM) for classification. Once the feature vector has been extracted, any classifier could be chosen to map it onto the classes; we chose the SVM for its good performance [23] (Fig. 3).

Fig. 3. Algorithm flowchart
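The final classification step can be sketched as below. The synthetic [e(1), e(2)] features are illustrative assumptions: high-clarity images are drawn with a smaller error under the high-clarity dictionary, and vice versa.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Hypothetical 2-D error features: high-clarity images tend to have e(1) < e(2).
hi = rng.normal([1.0, 2.0], 0.2, size=(50, 2))
lo = rng.normal([2.0, 1.0], 0.2, size=(50, 2))
X = np.vstack([hi, lo])
labels = np.array([1] * 50 + [0] * 50)   # 1 = high clarity, 0 = low clarity

# RBF-kernel SVM, as in the paper; any classifier could be swapped in.
clf = SVC(kernel="rbf").fit(X, labels)
pred = clf.predict([[1.0, 2.0], [2.0, 1.0]])   # one point near each cluster
```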

4 Experiments and Analysis

4.1 Database Establishment

First of all, there is no public database of clarity-classified LP images, so we had to build one. LP images from surveillance video are resized to a uniform 20 * 60 pixels, and then evaluated manually according to the principle above. On this basis, we established an LP image database containing 500 high-clarity images and 500 low-clarity images. The low-clarity LP images include a variety of common distortion types such as motion blur, Gaussian blur, defocus blur, compression, and noise.

4.2 Algorithm Performance and Analysis

There are a few parameters in the algorithm. We set L (the sparsity prior in Eq. 9) to 2, so that an obvious reconstruction error is obtained with a small sparsity prior. The kernel used in the SVM is the radial basis function (RBF) kernel, whose parameters are estimated by cross-validation. To prevent over-fitting, we take a random 30% of the LP images of each clarity class, divide them into n * n patches with a sliding window, and use the 60% of patches with the largest variance from each LP image to learn the dictionary. The remaining 70% of the images are used as test samples. The figures reported here are the classification accuracies of the different algorithms.
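The patch-selection step described above can be sketched as follows; the window size and stride here are illustrative choices, not values from the paper.

```python
import numpy as np

def top_variance_patches(img, n=4, step=4, keep=0.6):
    """Slide an n x n window over img and keep the top `keep` fraction
    of patches by variance, as candidates for dictionary learning."""
    patches = []
    for r in range(0, img.shape[0] - n + 1, step):
        for c in range(0, img.shape[1] - n + 1, step):
            patches.append(img[r:r + n, c:c + n].ravel())
    patches = np.array(patches)
    order = np.argsort(patches.var(axis=1))[::-1]   # highest variance first
    k = int(len(patches) * keep)
    return patches[order[:k]]

img = np.random.default_rng(0).random((20, 60))     # stand-in 20 x 60 LP image
P = top_variance_patches(img)                       # rows are selected patches
```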

We test our algorithm on the database described above. Since we are the first to study this problem, there are no existing clarity classification algorithms for LP images to compare against. Instead, we test the performance of no-reference PSNR [24], BIQI [15], NIQE [14] and SSEQ [25]. We record the assessment scores of no-reference PSNR, NIQE, and SSEQ for a random selection of 150 high-clarity and 150 low-clarity LP images from the database (Figs. 4, 5 and 6).

Fig. 4. Assessment score of no-reference PSNR

Fig. 5. Assessment score of NIQE

Fig. 6. Assessment score of SSEQ

It can be seen that the assessment scores for LP images of different clarity are very close, so these methods cannot effectively distinguish high-clarity images from low-clarity ones, and therefore cannot evaluate LP image quality.

For the BIQI algorithm, we tested in a different way: instead of computing assessment scores directly, we sent the feature vector extracted from its 9 wavelet transforms into the SVM classifier and compared this feature with ours (Table 1):

Table 1. Classification performance of the algorithms on the two-class clarity LP image database

It can be seen that the features extracted by BIQI also fail to describe LP image quality.

The results show that our algorithm performs well, especially with larger patch sizes. Due to the lack of related databases and algorithms, we did not carry out further experiments. Based on SRC, our algorithm exploits the different expressive power of the two dictionaries. By extracting the two reconstruction errors, this classification algorithm could also be extended to LP image quality assessment: like information extracted from the DCT domain or gradients, the reconstruction errors could be used as features in a regression model mapped to associated human subjective scores.

5 Conclusion

We have proposed a well-performing clarity classification method, evaluated on the database we built. Unlike traditional methods, this model extracts the two reconstruction errors as features. The method can be widely applied to quality assessment for fixed categories of objects: first determine the image category, then determine the image quality; this process is more in line with human visual perception.