Pattern Recognition

Volume 40, Issue 1, January 2007, Pages 19-32

Image classification with the use of radial basis function neural networks and the minimization of the localized generalization error

https://doi.org/10.1016/j.patcog.2006.07.002

Abstract

Image classification arises as an important phase in the overall process of automatic image annotation and image retrieval. In this study, we are concerned with the design of image classifiers developed in the feature space formed by low-level primitives defined in the setting of the MPEG-7 standard. Our objective is to investigate the discriminatory properties of such standard image descriptors and to look at efficient architectures of the classifiers along with their design pursuits. The generalization capabilities of an image classifier are essential to its successful use in image retrieval and annotation. Intuitively, the classifier is expected to achieve high classification accuracy on unseen images that are quite “similar” to those occurring in the training set. On the other hand, its performance cannot be guaranteed on images that are very dissimilar from the elements of the training set. Following this observation, we develop and use a concept of the localized generalization error and show how it guides the design of the classifier. As the image classifier, we consider radial basis function neural networks (RBFNNs). Through intensive experimentation we show that the resulting classifier outperforms other classifiers such as multi-class support vector machines (SVMs) as well as “standard” RBFNNs (viz. those developed without the guidance offered by the optimization of the localized generalization error). The experimental studies also reveal some interesting interpretation abilities of the RBFNN classifiers related to their receptive fields.

Introduction

The vast number of digital images that have become omnipresent these days calls for an intensified effort towards building efficient mechanisms for their automatic annotation and retrieval. Classification of digital images is one of the fundamental activities and can be viewed as a prerequisite for most other image processing pursuits. In image classification we can follow the general paradigm of pattern recognition, in which each object is described by a collection of features that forms a multidimensional space where all discrimination activities take place. Various classifiers, both linear and nonlinear, are available at this stage, including support vector machines (SVMs), linear classifiers, polynomial classifiers, radial basis function neural networks (RBFNNs), fuzzy rule-based systems, etc. No matter which classifier is chosen, the formation of a suitable feature space is of paramount relevance. Forming the feature space in the case of images is even more complicated. On the one hand, there are many different alternatives; on the other hand, the diversity of images contributes to an elevated level of complexity and difficulty: images showing different shapes, colors, textures, etc. may still belong to the same class. An image could be described by the color intensity of each pixel or, even better, by some descriptors. In this study, our objective is to explore and quantify the discriminatory properties of the MPEG-7 image descriptors in classification problems. These properties are explored in conjunction with two main categories of classifiers, namely SVMs and RBFNNs.

Unfortunately, a classifier achieving high training accuracy does not necessarily achieve good generalization capability. Since both the target outputs and the distribution of the unseen samples are unknown, it is impossible to compute the generalization error directly. There are two major approaches to estimating the generalization error, namely analytical models and cross-validation (CV). In general, analytical models bound the generalization error from above for any unseen samples and do not distinguish trained classifiers that have the same number of effective parameters but different parameter values. Thus, the error bounds given by those models are usually loose [1]. The major problem with analytical models is the estimation of the number of effective parameters of the classifier, which can be addressed by using the VC-dimension [2]. The VC-dimension of a classifier is defined as the largest number of samples that can be shattered by this classifier [2]. However, only loose bounds on the VC-dimension can be found for nonlinear classifiers such as neural networks, which puts a severe limitation on the applicability of analytical models to nonlinear classifiers, except for the SVM [3]. Although CV uses true target outputs for unseen samples, it is time-consuming for large datasets: C·L classifiers must be trained for a C-fold CV with L choices of classifier parameters. CV methods estimate the expected generalization error instead of its bound and thus do not guarantee that the finally built classifier has good generalization capability [1].
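
To make the cost argument concrete, here is a minimal sketch of our own (not part of the original study) showing that C-fold CV over L candidate parameter settings trains C·L classifiers; the toy data, the SVM classifier, and the parameter grid are placeholders.

```python
# Sketch: C-fold cross-validation over L parameter choices trains C * L classifiers.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import SVC

X, y = np.random.rand(200, 10), np.random.randint(0, 3, 200)  # toy 3-class data
param_grid = [0.1, 1.0, 10.0]                                  # L = 3 candidate C values
folds = KFold(n_splits=5)                                      # C = 5 folds

trained, scores = 0, {}
for c in param_grid:                           # L parameter choices
    fold_scores = []
    for tr_idx, te_idx in folds.split(X):      # C folds
        clf = SVC(C=c).fit(X[tr_idx], y[tr_idx])
        fold_scores.append(clf.score(X[te_idx], y[te_idx]))
        trained += 1
    scores[c] = np.mean(fold_scores)

print(trained)  # C * L = 15 classifiers trained in total
```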

In image classification, one may not expect a classifier trained using one category of images (say, animals) to correctly classify images coming from some other category (e.g. vegetables). In this case, one may revise the training dataset by adding training samples of vegetables and re-train the classifier to include the new class of images. For example, our dataset contains images of cows but not of airplanes, so we cannot expect the classifier trained on this dataset to correctly recognize airplanes. An image classifier is expected to work well for those classes that have been used to train it, assuming that images belonging to the same class are conceptually similar so that their descriptor values are also similar. That is, unseen samples that are similar to the training samples, in the sense that their sup-type (L∞) distance in the feature space is smaller than a given threshold, are considered to be more important. Thus, in the evaluation of the generalization capabilities of image classifiers, one may ignore those images that are totally dissimilar to those existing in the training set.
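
As a small illustration of this notion of similarity (a sketch of ours, with an arbitrary threshold Q and toy feature vectors), an unseen sample is regarded as relevant when its sup-type distance to at least one training sample does not exceed Q:

```python
# Sketch: membership test for the Q-neighbourhood of the training set
# under the sup-type (L-infinity) distance.
import numpy as np

def in_q_neighbourhood(x_unseen, X_train, Q):
    # sup-norm distance from x_unseen to every training sample
    d = np.max(np.abs(X_train - x_unseen), axis=1)
    return bool(np.any(d <= Q))

X_train = np.array([[0.2, 0.5], [0.8, 0.1]])
print(in_q_neighbourhood(np.array([0.25, 0.45]), X_train, Q=0.1))  # True: "similar" image
print(in_q_neighbourhood(np.array([0.90, 0.90]), X_train, Q=0.1))  # False: "dissimilar" image
```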

In general, image classification problems are multi-class classification problems, for which it is difficult to find a classifier with good generalization properties. In this work, we aim to find an image classifier featuring better generalization capability and interpretability with respect to domain knowledge in image classification. We concentrate on finding an optimal number of receptive fields for the RBFNN so that it classifies unseen images with a lower generalization error.

We organize the study in the following manner. The starting point is a discussion on the formation of the feature space based upon the framework of descriptors available in the MPEG-7 standard; these issues are covered in Section 2. We provide a brief introduction to image classifiers in Section 3. The localized generalization error model (R_SM*) and the corresponding approach to the selection of the architecture of the network are described in Sections 4 and 5, respectively. We present a comprehensive suite of experimental studies in Section 6. Concluding comments are covered in Section 7.

Section snippets

MPEG-7 feature space

In this section, we elaborate on the feature space arising within the framework of MPEG-7. The MPEG-7 descriptors are useful for low-level matching and provide great flexibility for a wide range of applications.
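
As an illustration only (the descriptor names and dimensionalities below are common MPEG-7 defaults used as placeholders, not necessarily the exact set adopted in this study), a per-image feature vector can be formed by concatenating the individual descriptor vectors:

```python
# Sketch: building one feature vector per image from low-level MPEG-7 descriptors.
import numpy as np

def image_feature_vector(descriptors):
    """descriptors: dict mapping descriptor name -> 1-D numpy array."""
    order = ["ColorLayout", "ColorStructure", "EdgeHistogram", "HomogeneousTexture"]
    return np.concatenate([np.asarray(descriptors[name], dtype=float) for name in order])

demo = {
    "ColorLayout": np.zeros(12),         # placeholder dimensionalities
    "ColorStructure": np.zeros(32),
    "EdgeHistogram": np.zeros(80),
    "HomogeneousTexture": np.zeros(62),
}
print(image_feature_vector(demo).shape)  # (186,)
```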

Classifiers for image classification

In this section, we discuss several selected architectures of classifiers that are quite often encountered in image classification. It is of interest to investigate their properties in this setting and review some related development strategies.

A concept and realization of the localized generalization error

Given the anticipated diversity of images to be classified, one could easily envision that there is no classification algorithm capable of carrying out zero-error classification. This straightforward and very intuitive observation holds in particular for images that are very different from those the classifier was exposed to during the training phase. In other words, we acknowledge that any classifier comes with some limited generalization capabilities. In terms of the …
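
One rough way to read this idea (a Monte-Carlo sketch of ours, not the closed-form R_SM* expression developed in this section; `model.predict`, the squared-error measure, and the uniform perturbation scheme are assumptions) is to combine the training error with a stochastic sensitivity term measured only within the Q-neighbourhood of the training samples:

```python
# Sketch: localized generalization error estimate = training error + stochastic
# sensitivity of the outputs for inputs perturbed within the Q-neighbourhood.
import numpy as np

def localized_error_estimate(model, X_train, y_train, Q, n_perturb=50, rng=None):
    rng = np.random.default_rng(rng)
    y_hat = model.predict(X_train)
    train_err = np.mean((y_hat - y_train) ** 2)   # empirical (training) error

    # stochastic sensitivity: mean squared output change under perturbations
    # drawn uniformly from [-Q, Q] per feature (i.e. staying in the Q-neighbourhood)
    diffs = []
    for _ in range(n_perturb):
        dx = rng.uniform(-Q, Q, size=X_train.shape)
        diffs.append(np.mean((model.predict(X_train + dx) - y_hat) ** 2))
    sensitivity = np.mean(diffs)

    # loose additive combination, used here only to compare candidate models
    return train_err + sensitivity
```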

The architecture design of RBFNNs

In the sequel, we confine ourselves to RBFNNs with Gaussian receptive fields. We apply a standard clustering algorithm (say, k-means, self-organizing maps, etc.) to find the locations of the receptive fields of the network. Typically, this is done once the number of receptive fields has been fixed. The choice of this number is not a trivial task, and its suitable selection impacts the generalization abilities of the network. To address this issue, we discuss a new algorithm which will lead to …
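
The overall selection loop can be sketched as follows (our simplified reading, not the paper's algorithm verbatim; `fit_rbfnn` and `score_fn` are user-supplied placeholders, e.g. the localized error estimator sketched above):

```python
# Sketch: pick the number of receptive fields M that minimizes an estimated
# localized generalization error, using k-means to locate the centres.
import numpy as np
from sklearn.cluster import KMeans

def select_num_receptive_fields(X, y, candidate_M, fit_rbfnn, score_fn):
    best = (None, np.inf, None)                        # (M, score, model)
    for M in candidate_M:
        centres = KMeans(n_clusters=M, n_init=10).fit(X).cluster_centers_
        model = fit_rbfnn(X, y, centres)               # fit widths and output weights
        score = score_fn(model, X, y)                  # lower = better estimate
        if score < best[1]:
            best = (M, score, model)
    return best
```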

Experiments

In this section, we report on a series of experiments. First, we describe the experimental setup. Next, we report the experimental results and focus on the interpretation of the network.

Conclusions

In this study, being motivated by the concept that semantically similar images should exhibit similarity in the feature space, we proposed an application of the localized generalization error model to image classification. This model captures the generalization error for unseen samples that are similar to the training samples. Experimental results show that the RBFNN trained by minimizing the localized generalization error outperforms the “standard” RBFNN and a multi-class SVM. Moreover, …

Acknowledgments

This work is supported by a Hong Kong Polytechnic University Interfaculty Research Grant No. G-T891 and Canada Research Chair (W. Pedrycz).

References (24)

  • T. Hastie et al., The Elements of Statistical Learning, 2001.
  • V. Vapnik, Statistical Learning Theory, 1998.
  • V. Cherkassky et al., Model complexity control for regression using VC generalization bounds, IEEE Trans. Neural Networks, 1999.
  • B.S. Manjunath et al., Introduction to MPEG-7 Multimedia Content Description Interface, 2002.
  • E. Izquierdo, I. Damnjanovic, P. Villegas, X. Li-Qun, S. Herrmann, Bringing user satisfaction to media access: the 1st...
  • V. Mezaris, H. Doulaverakis, R.M.B. de Otalora, S. Herrmann, I. Kompatsiaris, M.G. Strintzis, A test-bed for...
  • S.-F. Chang et al., Overview of the MPEG-7 standard, IEEE Trans. Circuits Systems Video Technol., 2001.
  • A. Dorado, W. Pedrycz, E. Izquierdo, An MPEG-7 learning space for semantic image classification, Proceedings of the...
  • B.S. Manjunath et al., Color and texture descriptors, IEEE Trans. Circuits Systems Video Technol., 2001.
  • A. Barla, F. Odone, A. Verri, Old fashioned state-of-the-art image classification, IEEE Proceedings of International...
  • A.K. Jain et al., Statistical pattern recognition: a review, IEEE Trans. Pattern Anal. Mach. Intell., 2000.
  • O. Chapelle et al., Support vector machines for histogram-based image classification, IEEE Trans. Neural Networks, 1999.

Cited by (97)

    • Maximizing minority accuracy for imbalanced pattern classification problems using cost-sensitive Localized Generalization Error Model

      2021, Applied Soft Computing
      Citation Excerpt:

      This is a challenging task if both the data distribution and the costs may change over time. For instance, image classification problems [47] are usually a multi-class imbalanced problems but most of current methods ignore the imbalance issue in different classes. The application of the c-LGEM to multi-class image classification problem may focus on the very large number of classes issue which leads to a very imbalanced classification problem for each class.

    • Design methodology for Radial Basis Function Neural Networks classifier based on locally linear reconstruction and Conditional Fuzzy C-Means clustering

      2019, International Journal of Approximate Reasoning
      Citation Excerpt:

      Fuzzy radial basis function neural networks (FRBFNNs) form fuzzy neural networks. FRBFNNs have been used widely in various areas such as system modeling, control, and classification [2,5–13]. FRBFNNs are another type of hybrid system, which stems from the fuzzy inference system and neural networks.

    • Modeling of CO₂ solubility in MEA, DEA, TEA, and MDEA aqueous solutions using AdaBoost-Decision Tree and Artificial Neural Network

      2017, International Journal of Greenhouse Gas Control
      Citation Excerpt:

      SVM is categorized as a supervised method of machine learning. In cases associated with estimation of function, regression analysis, and classification, SVMs are attractive approaches (Jeng, 2006; Wing et al., 2007; Tsai and Sun, 2007; Acevedo-Rodríguez et al., 2009; Ceperic et al., 2012; Huang et al., 2004; Li et al., 2009; Stoean and Stoean, 2013; Subasi and Ismail Gursoy, 2010). Meyer et al. (2003) have compared the SVM to 16 classifiers and 9 regression approaches.
