1 Introduction

The production costs of the ceramic tile industry are high because of the cost of raw materials and energy. To reduce the impact of these costs, industrial software for automatic vision inspection has been proposed to classify the quality of the final product. Many elements can directly affect the quality of the final tile in the production line. A failure can occur at the beginning of the process, which is reflected in the structure of the tiles as an irregular geometric composition. The middle and final stages of the process are associated with failures that appear as small pinholes, surface cracks, chromatic discrepancies in the surface tonality, and texture anomalies. For this reason, the last phase of the manufacturing process requires that tiles undergo a quality inspection aimed at identifying any defects. In some plants, the quality inspection of the tiles is done visually, which is prone to wrong classifications; the experience, judgment, fatigue, and visual capacity of the employee therefore play an essential role in the successful identification of failures. In other plants, a computer system separates defective tiles from the rest. Such systems use different mathematical methods, with varying precision and processing time [2].

The tile fabrication process usually produces four kinds of defects related to the RGB color components. The first class refers to spots that affect the Green and Blue components, the second affects the Red and Blue components, the third the Red and Green components, and the fourth affects all three components. Finally, tiles without failures are considered the fifth class.

1.1 Previous Works

In the 1990s, Boukouvalas et al. [3,4,5,6] detected different kinds of failures (multi-class classification) in the final tile using methods such as color histograms, texture, and color gradients. Other works in this area use different algorithms, such as segmentation and the Wavelet transform, to compute features from tile images. Common classification techniques are Bayes functions [7], K-means [3, 4], binary trees [6], K-NN [8], and neural networks [8,9,10].

In this work, the CIE L*ab color space is computed and complemented with common texture attributes obtained from the sum and difference images (the SDH technique). Different combinations of tile image attributes are tested as inputs to an artificial neural network (ANN). The parameter configuration of the ANN is obtained after detailed tests.

Section 2 describes the color and texture feature computation. Section 3 explains the configuration and training process of the neural network. Section 4 presents the obtained results and compares them with related work. The conclusions are included at the end of this document.

2 Feature Extraction

Figure 1 illustrates the different steps for extracting color and texture features.

Fig. 1. Color and texture features strategy

2.1 CIE L*ab Color Space

RGB images are device-dependent and highly affected by illumination changes. To avoid this constraint, the CIE L*ab color space is used, which describes all the colors perceived by the human eye.

As illustrated in the first two levels of Fig. 1, the RGB to \(CIE L*ab\) transformation is performed through an intermediate space known as CIE-XYZ, based on tristimulus values. The response of the three primary-color receptors in the human eye is described by the XYZ parameters of Eq. 1. The numerical matrix in that equation contains the predefined factors used to carry out the transformation.

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} X \\ Y \\ Z \\ \end{array} } \right] = \left[ {\begin{array}{*{20}{c}} {0.4124} &{} {0.3575} &{} {0.1804} \\ {0.2126} &{} {0.7151} &{} {0.0721} \\ {0.0193} &{} {0.1191} &{} {0.9502} \\ \end{array} } \right] \quad \left[ {\begin{array}{*{20}{c}} R \\ G \\ B \\ \end{array} } \right] \end{aligned}$$
(1)

Then, the transformation CIE-XYZ to CIE L*ab is given by the following equations:

$$\begin{aligned} L* = 116f\left( \frac{Y}{Y_w}\right) -16 \end{aligned}$$
(2)
$$\begin{aligned} a=500\left[ f\left( \frac{X}{X_w}\right) -f\left( \frac{Y}{Y_w}\right) \right] \end{aligned}$$
(3)
$$\begin{aligned} b=200\left[ f\left( \frac{Y}{Y_w}\right) -f\left( \frac{Z}{Z_w}\right) \right] \end{aligned}$$
(4)

where \(X_w\), \(Y_w\) and \(Z_w\) are the CIE-XYZ tristimulus values of the reference white point, given by:

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{X_w}} \\ {{Y_w}} \\ {{Z_w}} \\ \end{array} } \right] = \left[ {\begin{array}{*{20}{c}} {0.9504} \\ {1.0000} \\ {1.0887} \\ \end{array} } \right] \end{aligned}$$
(5)

and

$$\begin{aligned} f\left( t\right) =\left\{ \begin{array}{lc} t^\frac{1}{3} &{} if\ t>\left( \frac{6}{29}\right) ^{3} \\ \frac{1}{3}\left( \frac{29}{6}\right) ^2t+\frac{4}{29} &{} otherwise \end{array} \right. \end{aligned}$$
(6)

where f is the piecewise nonlinear function applied to each normalized tristimulus ratio.
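Equations 1 to 6 can be combined into a single conversion routine. The following Python sketch (a minimal illustration using NumPy, with the matrix and white-point values given above) assumes RGB components normalized to [0, 1]:

```python
import numpy as np

# sRGB -> CIE-XYZ matrix, values as given in Eq. 1
M = np.array([[0.4124, 0.3575, 0.1804],
              [0.2126, 0.7151, 0.0721],
              [0.0193, 0.1191, 0.9502]])

# Reference white-point tristimulus values from Eq. 5
Xw, Yw, Zw = 0.9504, 1.0000, 1.0887

def f(t):
    """Piecewise nonlinearity of Eq. 6, applied elementwise."""
    t = np.asarray(t, dtype=float)
    delta3 = (6.0 / 29.0) ** 3
    # Note: (1/3)(29/6)^2 t is written here as t / (3 (6/29)^2)
    return np.where(t > delta3,
                    np.cbrt(t),
                    t / (3.0 * (6.0 / 29.0) ** 2) + 4.0 / 29.0)

def rgb_to_lab(rgb):
    """rgb: (..., 3) array with components in [0, 1]; returns (L*, a, b)."""
    xyz = rgb @ M.T                      # Eq. 1
    fx = f(xyz[..., 0] / Xw)
    fy = f(xyz[..., 1] / Yw)
    fz = f(xyz[..., 2] / Zw)
    L = 116.0 * fy - 16.0                # Eq. 2
    a = 500.0 * (fx - fy)                # Eq. 3
    b = 200.0 * (fy - fz)                # Eq. 4
    return np.stack([L, a, b], axis=-1)
```

For a white pixel (1, 1, 1) the routine returns \(L*\approx 100\) with a and b near zero, as expected for the reference white.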

2.2 Texture Features

The surface of an object is typically characterized by its texture. Texture is defined as a property of a neighborhood and is generally computed on gray-scale or binary images. In this project, the Sum and Difference Histograms (SDH) algorithm proposed by Unser [1] is used to compute texture attributes. A general description of the SDH strategy follows.

Let \(K\times L\) be a rectangular grid containing the analyzed discrete texture of an image denoted by I(k, l), where \((k\in \left[ 0,K-1\right] ; l\in \left[ 0,L-1\right] )\). Suppose that the gray level at each pixel is quantized to \(N_g\) levels, and let \(G\in \left[ 0,...,N_g-1\right] \) be the set of these \(N_g\) levels. Next, for a given pixel \(\left( k,l\right) \), let \(\left( \delta _k, \delta _l\right) =\left\{ \left( \delta _{k_1}, \delta _{l_1}\right) ,\left( \delta _{k_2},\delta _{l_2}\right) ,...,\left( \delta _{k_M}, \delta _{l_M}\right) \right\} \) be the set of M relative displacements. The sum and difference images, \(I_S\) and \(I_D\) respectively, associated with each relative displacement \(\left( \delta _k, \delta _l\right) \), are defined as:

$$\begin{aligned} \begin{array}{c} I_S\left( k,l\right) =I\left( k,l\right) +I\left( k+\delta _k,l+\delta _l\right) \\ I_D\left( k,l\right) =I\left( k,l\right) -I\left( k+\delta _k,l+\delta _l\right) \end{array} \end{aligned}$$
(7)

Thus, the range of the \(I_S\) image is \(\left[ 0,2\left( N_g-1\right) \right] \), and that of the \(I_D\) image is \(\left[ -N_g+1,N_g-1\right] \). Let i and j be any two gray levels in the ranges of \(I_S\) and \(I_D\), respectively. Then, let D be a subset of indexes specifying the region to be analyzed; the SDHs with parameters \(\left( \delta _k,\delta _l\right) \) over the domain \(\left( k,l\right) \in D\) are, respectively, defined as:

$$\begin{aligned} \begin{array}{c} h_S\left( i;\delta _k,\delta _l\right) =h_S\left( i\right) =\#\left\{ \left( k,l\right) \in D,I_S\left( k,l\right) =i\right\} \\ h_D\left( j;\delta _k,\delta _l\right) =h_D\left( j\right) =\#\left\{ \left( k,l\right) \in D,I_D\left( k,l\right) =j\right\} \end{array} \end{aligned}$$
(8)

where the total number of counts is

$$\begin{aligned} N=\#\left\{ D\right\} =K\times L=\sum _i h_S\left( i\right) =\sum _j h_D\left( j\right) \end{aligned}$$
(9)

The normalized SDHs are given by:

$$\begin{aligned} \widehat{P_S}\left( i\right) =\frac{h_S\left( i\right) }{N}\qquad \widehat{P_D}\left( j\right) =\frac{h_D\left( j\right) }{N} \end{aligned}$$
(10)

Unser [1] proposed a variety of features for extracting only the useful texture information from the SDHs. The most frequently used features are: mean, variance, correlation, contrast, homogeneity, cluster shade and cluster prominence. All of these features are computed in this approach.
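As an illustration of Eqs. 7 to 10 and of how features are derived from the normalized SDHs, the following Python sketch computes the histograms for one displacement and four of the listed features. The feature formulas follow Unser [1]; the 256-level quantization (\(N_g = 256\)) is an assumption for 8-bit images:

```python
import numpy as np

def sdh_features(img, dk, dl, Ng=256):
    """Sum and Difference Histograms (Unser) for displacement (dk, dl).
    img: 2-D uint8 array quantized to Ng gray levels."""
    I = img.astype(int)
    # Overlapping region D where both pixels of the pair exist (Eq. 7)
    A = I[:I.shape[0] - dk, :I.shape[1] - dl]
    B = I[dk:, dl:]
    Is, Id = A + B, A - B
    # h_S over [0, 2(Ng-1)]; h_D over [-(Ng-1), Ng-1], shifted to be >= 0 (Eq. 8)
    hS = np.bincount(Is.ravel(), minlength=2 * Ng - 1)
    hD = np.bincount((Id + Ng - 1).ravel(), minlength=2 * Ng - 1)
    N = Is.size                                   # total counts (Eq. 9)
    Ps, Pd = hS / N, hD / N                       # normalized SDHs (Eq. 10)
    i = np.arange(2 * Ng - 1)                     # sum levels
    j = np.arange(2 * Ng - 1) - (Ng - 1)          # difference levels
    mean = 0.5 * np.sum(i * Ps)
    contrast = np.sum(j ** 2 * Pd)
    homogeneity = np.sum(Pd / (1.0 + j ** 2))
    variance = 0.5 * (np.sum((i - 2 * mean) ** 2 * Ps) + contrast)
    return {"mean": mean, "variance": variance,
            "contrast": contrast, "homogeneity": homogeneity}
```

On a constant image the sketch returns the gray value as the mean, zero contrast and variance, and homogeneity equal to one, which is a quick sanity check of the formulas.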

3 Failure Detection Strategy

The design of the proposed classifier is divided into two stages: (1) the configuration of the artificial neural network (ANN), and (2) testing the tile textures with the classifier obtained in stage 1. The classifier is designed considering that there is no single distance and orientation that is optimal for all computed texture features. Therefore, exhaustive tests are performed to find the best parameters to configure the ANN. Figure 2 illustrates the block diagram with each specific task performed to obtain the final classifier.

Fig. 2. Global strategy for tile classification

3.1 Design of the Artificial Neural Network (ANN)

The database for this phase considers 5 different kinds of Brodatz images, with the purpose of training the ANN to classify the 5 classes mentioned in Sect. 1. This number also defines the neurons at the output layer of the neural network, one neuron per class. Furthermore, for each kind of image, 200 points are chosen randomly. The parameters that describe the system are: (1) the size of the rectangular window used to compute texture attributes, (2) the distance and orientation used to compute the sum and difference images, (3) the number of hidden layers in the neural network, and (4) the number of neurons in the hidden layers.

In this work, the size of the rectangular window is tested for 12 different values: 3, 5, 7, 9, ..., 25, whose pairwise combinations yield 144 different window sizes. On the other hand, the most frequently used distances to compute the sum and difference images are 1, 2, 3 and 4 pixels, with orientations of \(0^{\circ }\), \(45^{\circ }\), \(90^{\circ }\) and \(135^{\circ }\). However, it is assumed that the changes in the texture features between \(0^{\circ }\) and \(45^{\circ }\), or between small distances such as 1, 2 and 3, are minimal due to their proximity. Therefore, the distances and orientations used are 1 and 4 pixels and \(0^{\circ }\) and \(90^{\circ }\), respectively.
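The resulting parameter grid can be enumerated directly. The short sketch below reproduces the counts given above; encoding each distance and orientation pair as a `(dk, dl)` pixel offset is our assumption:

```python
from itertools import product

# 12 window side lengths: 3, 5, 7, ..., 25
window_sides = list(range(3, 26, 2))

# All height x width combinations -> 144 rectangular windows
windows = list(product(window_sides, repeat=2))

# Pruned displacement set: distances 1 and 4 pixels at 0 and 90 degrees,
# written as (dk, dl) row/column offsets (illustrative encoding)
displacements = [(0, 1), (1, 0), (0, 4), (4, 0)]
```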

Setting ANN Parameters. In the classifier, the number of hidden layers is limited to one or two, and the number of neurons in the first hidden layer ranges over 2, 3, 4, ..., 40. The number of neurons in the second layer is then chosen as follows: if an unsatisfactory result is reached with a particular number of neurons in the first hidden layer, the number of neurons in the second layer is increased.

After exhaustive tests over the system parameters, the best performance and accuracy are obtained with a squared window of \(13\times 13\) pixels for computing texture features, and sum and difference images computed with a distance of 1 pixel and an orientation of \(0^{\circ }\). The neural network uses one hidden layer of 28 neurons and an output layer of 5 neurons. The selected neural network is validated by evaluating an image-mask containing 5 Brodatz images of different dimensions; the output of the network yields an overall accuracy of 84.5%.
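As an illustration of the selected topology, the sketch below builds a forward pass with one hidden layer of 28 neurons and an output layer of 5 neurons. The random weights, sigmoid activation and softmax output are placeholders for illustration only; the text does not specify the activation functions or the training algorithm, and trained weights would replace the random ones:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative layer shapes: n_features inputs, 28 hidden neurons,
# 5 output neurons (one per class). Weights are random placeholders.
n_features = 9
W1 = rng.normal(size=(n_features, 28))
b1 = np.zeros(28)
W2 = rng.normal(size=(28, 5))
b2 = np.zeros(5)

def classify(x):
    """Forward pass: sigmoid hidden layer, softmax output, argmax class."""
    h = 1.0 / (1.0 + np.exp(-(x @ W1 + b1)))   # hidden activations
    z = h @ W2 + b2                            # output scores
    p = np.exp(z - z.max())                    # numerically stable softmax
    p /= p.sum()
    return int(np.argmax(p))                   # class label in {0, ..., 4}
```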

3.2 Training the ANN

The tile database contains 99 images used as follows: 30 images for training, 14 for validation and 55 for testing. Twelve different inputs were tested: (1) the nine texture attributes; (2) the five most significant texture attributes (mean, variance, contrast, cluster shade and cluster prominence); (3) the color components \(L*\), a and b from the \(CIE-L*ab\) space; (4) test (2) plus the color components a and b; (5) test (2) plus four color differences; (6) test (5) plus four hue differences; and, finally, tests (7) to (12), which follow the same order as the six earlier experiments but use the equalized images.

The output of the neural network delivers a class label for every pixel, and a gray-level value is assigned to each class to generate the resulting image. A tile without defects differs quantitatively from a tile with defects. Therefore, the resulting image can be analyzed by computing the ratio between pixels detected as defect-free and the total number of pixels in the image. This ratio varies depending on the test, and its value defines whether or not the tile has defects. When failures are detected in a tile, a failure class is assigned by analyzing the axis length of the spot. If different classes of failures are found in the same tile, a second test analyzes the largest one in order to assign a unique final class. A particular case occurs when the tile image is labeled as faulty but without spots; in this case, the tile is directly assigned to class 2, corresponding to defects in the color components caused by bad illumination conditions.
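The pass/fail decision described above can be sketched as follows. The threshold value and the label encoding are illustrative assumptions, since the text notes that the ratio varies per test:

```python
import numpy as np

def tile_decision(label_img, ok_class=5, threshold=0.98):
    """Sketch of the post-processing rule: the ratio of pixels labeled
    as defect-free (ok_class) to total pixels decides whether the tile
    passes. `threshold` is an illustrative value, not from the text."""
    ok_rate = np.mean(label_img == ok_class)
    return "no defect" if ok_rate >= threshold else "defect"
```

A tile whose label image is entirely class 5 passes, while one with a sizable region of another class is flagged for the subsequent spot analysis.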

Table 1 shows the confusion matrix for experimental test number 5 (the five most significant texture attributes and the four color differences). Note that tile images without faults (class 5) are correctly classified (100%) in all cases. On the contrary, the worst performance is obtained for class 1 (labeled as class 3) and for class 2 (labeled as class 1). By analyzing these errors, it is found that irregular spots can appear on the tile due to an illumination problem, which affects classes 1 and 2 more than the others. Despite these classification errors, the diagonal of the matrix retains the highest values, as expected.

Table 1. Confusion matrix of the images used in the experimental test 5.

4 Experimental Results

The experimental tests are performed using a tile database provided by the company Daltile in Monterrey, Mexico, consisting of 99 color digital images of tiles taken at the end of the glazing process. Figures 3 and 4 illustrate the classification results; detected failures are highlighted with squares. In particular, the failure in Fig. 3 (second row) is difficult to see even for the human eye. The dark pixels in the resulting images correspond to different tones and non-homogeneous color zones in the tile image. However, such pixels do not belong to any specific class of the ANN; that is, they represent an unknown failure.

Fig. 3. Left column: tiles belonging to classes 1 and 4, respectively. Right column: resulting output images; the True Positives are highlighted.

4.1 Performance Evaluation and Comparative Analysis

The results in Table 2 point out that classes 3, 4 and 5 obtain the highest scores for all metrics. The low precision and sensitivity scores for classes 1 and 2 are mainly due to the small number of images belonging to these classes, one-third of that of the other classes.

Table 2. Performance evaluation of the classifier
Fig. 4. Left: tile belonging to class 3 with several fails (TP). Middle: resulting image classifying one of the spots as class 1. Right: several spots classified as class 3.

This approach can be compared with other similar projects that use different neural network architectures. For instance, in [9, 11] the authors perform tile classification but only for two classes. Another similar work is presented by Kukkonen et al. [8], in which a multi-class task, also with 5 classes, is proposed. That approach uses a K-NN classifier and a SOM neural network with spectral characteristics of the tile, obtaining a 30.0% classification error with a 1-NN and a 2-D SOM, and a 20.0% error in a second test with a 7-NN and a 1-D SOM; the system proposed here also has an error rate of 20.0%.

5 Conclusions

A neural network is used as a classifier of color and texture features from tile images. A database of tiles with and without different kinds of failures is used for the experimental tests, validating the accuracy and feasibility of the proposed approach in comparison with other approaches. This project validates that texture features such as contrast, variance and homogeneity complement the color information provided by the CIE-Lab color space. The classifier achieves high performance for detecting classes 3 to 5. However, some of the tile images present changing illumination conditions that are hard to classify. Finally, even though this approach is a first proposal, it provides good classification results for this typical industrial problem. Future research includes increasing the number of images per class and balancing the number of cases among classes.