A novel biologically inspired ELM-based network for image recognition

doi:10.1016/j.neucom.2015.03.117

Neurocomputing

Volume 174, Part A, 22 January 2016, Pages 286-298


Abstract

In this paper, a novel biologically inspired network for image recognition is introduced. The Hierarchical Model and X (HMAX) and the extreme learning machine (ELM) are combined to construct a five-layer feed-forward network: S1–C1–S2–C2–H. The first four layers, originating from HMAX, provide a robust feature representation of a specific object, and the feature classification stage in the H layer is implemented with ELM. The HMAX model simulates the hierarchical processing mechanism in the primate visual cortex to compute complex feature representations. As a biologically inspired learning algorithm for generalized single-hidden-layer feed-forward networks (SLFNs), ELM learns very quickly with good generalization performance, and it performs well in classification applications. Four groups of experiments are performed on three datasets, and the results are compared with state-of-the-art techniques. Experimental results show that the proposed network achieves good performance with fast learning speed.

Introduction

Object recognition has long been an area of intense research and remains a very challenging task in computer vision, whereas human vision, with its unique processing mechanism, recognizes objects rapidly, accurately, and effortlessly. Recognizing objects in images is difficult because of variations in illumination, viewpoint, occlusion, scale, and shift. Object categorization is difficult because it must capture the variability in appearance and shape of different objects belonging to the same class while avoiding confusion between objects from different classes. Overcoming these obstacles to achieve robust object recognition would benefit many fields and applications, such as security surveillance, manufacturing, robot navigation, character recognition, and clinical image understanding.

Much research has been done on object recognition. In his book [1], Treiber gives a good overview of object recognition algorithms used in various applications, including global approaches, transformation-search-based methods, geometrical model driven methods, 3D object recognition schemes, flexible contour fitting algorithms, and descriptor-based methods. In the work of Belongie [2], a set of discrete points sampled from the contour of a shape is used as a shape descriptor, and K nearest neighbors (KNN) is then used for classification. Mohan [3] built a parts-based detector that represents the image with Haar wavelets and then uses a support vector machine (SVM) for classification. Lowe [4] developed an image feature called the scale-invariant feature transform (SIFT), which became the basis for features in many object recognition algorithms, although it is not an object recognition algorithm by itself. Laptev [5] uses histograms as features, a weighted Fisher linear discriminant as a weak classifier, and AdaBoost for classification. Biologically motivated features based on Gabor filters and MAX operations have also been developed [6], [7]. Although these algorithms have improved object recognition performance, none of the algorithms available today can surpass the performance of the human brain. This suggests that more work is needed to solve the many problems in robust intelligence at which the human brain excels, and thereby to enhance the performance of object recognition.

Object recognition in the human brain is largely invariant to changes in the size, position, and viewpoint of the object. It is perhaps not too surprising that the human brain has achieved, through millions of years of evolution, a remarkable ability to recognize objects in a robust, selective, and fast manner. It is likely that, once we understand how the neuronal circuitry achieves these remarkable properties, it will be possible to translate the biological circuits into algorithms for computer vision and pattern recognition. A hierarchical cortex-based model, named Hierarchical Model and X (HMAX) [8], [9], has attracted much attention for two reasons: it focuses on designing simple and complex operations inspired by the visual cortex, and it has been shown in [10] to provide robust representations of specific images, outperforming state-of-the-art features such as SIFT in various invariance tasks on synthetic images. More recently, Huang et al. [11] proposed a novel learning algorithm for single-hidden-layer feed-forward networks (SLFNs), namely the extreme learning machine (ELM), which can be applied to regression and classification problems [12]. In [13], it was successfully applied to face recognition, where it improved the recognition accuracy.
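As a rough illustration of why ELM training is fast, the following is a minimal sketch of the basic ELM formulation as it is commonly described: the hidden-layer weights and biases are drawn at random and left untrained, and only the output weights are obtained in closed form as the least-squares solution β = H†T, where H is the hidden-layer output matrix, H† its Moore-Penrose pseudoinverse, and T the target matrix. The function names, the sigmoid activation, the number of hidden nodes, and the toy two-class data are assumptions made for this example; they are not taken from the paper's experiments.

```python
import numpy as np


def hidden_output(X, W, b):
    """Hidden-layer output matrix H for inputs X (sigmoid activation, an assumption)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def train_elm(X, T, n_hidden=20, seed=0):
    """Fit a basic ELM: random hidden layer, closed-form output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = hidden_output(X, W, b)
    beta = np.linalg.pinv(H) @ T                 # least-squares solution of H @ beta = T
    return W, b, beta


def predict_elm(X, W, b, beta):
    """Network output for inputs X; the largest output gives the predicted class."""
    return hidden_output(X, W, b) @ beta


# Hypothetical toy data: two well-separated Gaussian clusters in 2-D.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=-2.0, size=(40, 2)),
               rng.normal(loc=2.0, size=(40, 2))])
labels = np.array([0] * 40 + [1] * 40)
T = np.eye(2)[labels]                            # one-hot target matrix

W, b, beta = train_elm(X, T, n_hidden=20)
pred = predict_elm(X, W, b, beta).argmax(axis=1)
accuracy = (pred == labels).mean()
print("training accuracy:", accuracy)
```

Because no iterative tuning of the hidden layer is involved, training reduces to one matrix pseudoinverse, which is the property the text refers to as fast learning speed.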

To improve the performance of image-based object recognition, this paper brings together two biologically inspired algorithms, HMAX and ELM, to construct a novel biologically inspired network for image recognition. Because HMAX features have good scale and translation invariance, the four-layer HMAX model is employed for feature construction, feature selection, and feature extraction, and it provides a robust feature representation of a specific object image. ELM is introduced to classify the feature representation because it performs better than conventional methods such as SVM and has an extremely fast learning speed, which is akin to the fast learning mechanism of the higher cortical areas. Four groups of experiments are performed on three datasets to demonstrate the novelty and superiority of the proposed network over existing algorithms.
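The translation invariance attributed to HMAX features comes from alternating filtering (simple, S layers) with MAX pooling (complex, C layers). The sketch below is a deliberately reduced illustration of that idea, not the four-layer model used in the paper: it applies one Gabor filter (an S1-like step), takes local maxima (a C1-like step), and then a global maximum as a stand-in for the position pooling done in later layers. The function names, filter parameters, pooling size, and test image are all assumptions chosen for the example; a full HMAX model uses several orientations and scales as well as an S2 prototype-matching stage.

```python
import numpy as np


def gabor_kernel(size=7, wavelength=4.0, sigma=2.0, theta=0.0, gamma=0.5):
    """Zero-mean, unit-norm Gabor filter (illustrative parameter values)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    g = g * np.cos(2 * np.pi * xr / wavelength)
    g -= g.mean()
    return g / np.linalg.norm(g)


def s1_response(image, kernel):
    """S1-like step: absolute filter response at every valid position."""
    k = kernel.shape[0]
    out = np.empty((image.shape[0] - k + 1, image.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = abs(np.sum(image[i:i + k, j:j + k] * kernel))
    return out


def c1_pool(s1, pool=4):
    """C1-like step: maximum over non-overlapping pool x pool neighbourhoods."""
    h = (s1.shape[0] // pool) * pool
    w = (s1.shape[1] // pool) * pool
    blocks = s1[:h, :w].reshape(h // pool, pool, w // pool, pool)
    return blocks.max(axis=(1, 3))


# Hypothetical test image: a vertical bar, and the same bar shifted 2 pixels right.
image = np.zeros((32, 32))
image[8:24, 12:14] = 1.0
shifted = np.roll(image, 2, axis=1)

kernel = gabor_kernel()
s1_a, s1_b = s1_response(image, kernel), s1_response(shifted, kernel)
c1_a, c1_b = c1_pool(s1_a), c1_pool(s1_b)
peak_a, peak_b = c1_a.max(), c1_b.max()  # global max over all positions

print("S1 maps identical:", np.allclose(s1_a, s1_b))
print("pooled peak unchanged by the shift:", np.isclose(peak_a, peak_b))
```

The raw filter responses move with the object, whereas the pooled maximum does not, which is the kind of feature vector that can then be handed to a classifier such as ELM.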

The rest of the paper is organized as follows. Section 2 states the problem and strategy of object recognition. In Section 3, preliminary information about HMAX and ELM is presented. Section 4 details the proposed biologically inspired ELM-based image recognition algorithm. In Section 5, several experiments are performed, followed by results and discussion. The paper is concluded in Section 6.

Section snippets

Problem statement

Drawing on ideas from neurophysiology [14], object recognition is defined as the ability to accurately discriminate each named object (“identification”) or set of objects (“categorization”) from all other possible objects, materials, textures, and other visual stimuli, and to do so over a range of identity-preserving transformations of the retinal image of that object (e.g., image transformations resulting from changes in object position, distance, and pose).

An image is a visual representation of

Brief of the HMAX

A long-time goal for computer vision has been to build a system that achieves human-level recognition performance. Riesenhuber and Poggio summarized the basic facts about the ventral visual stream, a hierarchy of brain areas thought to mediate object recognition in cortex, and then proposed the HMAX model [8], which is a natural extension of the model of simple to complex cells of Hubel and Wiesel. Serre et al. improved the original HMAX model by adding multi-scale representations as well as

Design inspiration

This section will show the design inspiration of our object recognition network. Because humans and primates outperform the best machine vision systems with respect to almost any measure, building a system that emulates object recognition in cortex or matches with human vision as closely as possible has always been an attractive but elusive goal.

As introduced in Section 1, there exist many image recognition schemes. However, the trade-off between acquired accuracy and computational time poses a

Experiments

In this section, we investigate the performance of the proposed ELM-based recognition algorithm through experiments on image recognition tasks. We select three image datasets: Fifteen Scenes, the DARPA LAGR datasets, and Still Action images. Some image samples from the three datasets are shown in Fig. 5.

1. Fifteen Scenes [26]: The Fifteen Scenes dataset is composed of 15 natural categories of urban and rural scenes for a total of 4885 images.
2. DARPA LAGR datasets [27]: There are six datasets

Conclusion

In this paper, we have proposed a novel biologically inspired image recognition network based on HMAX and the extreme learning machine. The network consists of five layers, S1–C1–S2–C2–H, which together complete the whole object recognition task. The first four layers focus on the design of the feature representation structure and build simple and complex features based on physiological data about the mammalian visual pathways. Finally, the H layer focuses on the learning mechanism of the higher

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant no. 61005085) and Fundamental Research Funds for the Central Universities (2012QNA4024).

Yu Zhang received the B.S. degree in information engineering from Xi'an Jiaotong University, Xi'an, China in 2003, and the M.S. and Ph.D. degrees in computer science from Tsinghua University, Beijing, China in 2009. He was a post-doctor at Tsinghua University from 2009 to 2011 and a visiting scholar at Carnegie Mellon University from 2013 to 2014. He is now a lecturer in the School of Aeronautics and Astronautics at Zhejiang University. His research interests include artificial intelligence,

References (32)

I. Laptev, Improving object detection with boosted histograms, Image Vis. Comput. (2009)

W. Zong et al., Face recognition based on extreme learning machine, Neurocomputing (2011)

J.J. DiCarlo et al., Untangling invariant object recognition, Trends Cognit. Sci. (2007)

B. Xu et al., Discrete-time hypersonic flight control based on extreme learning machine, Neurocomputing (2014)

J.W. Lee et al., An extensive comparison of recent classification tools applied to microarray data, Comput. Stat. Data Anal. (2005)

G.-B. Huang et al., Enhanced random search based incremental extreme learning machine, Neurocomputing (2008)

G.-B. Huang et al., Extreme learning machine: theory and applications, Neurocomputing (2006)

J.J. DiCarlo et al., How does the brain solve visual object recognition?, Neuron (2012)

M.A. Treiber, An Introduction to Object Recognition, Springer, London,...

S. Belongie et al., Shape matching and object recognition using shape contexts, IEEE Trans. Pattern Anal. Mach. Intell. (2002)

A. Mohan et al., Example-based object detection in images by components, IEEE Trans. Pattern Anal. Mach. Intell. (2001)

D.G. Lowe, Distinctive image features from scale-invariant keypoints, Int. J. Comput. Vis. (2004)

T. Serre, L. Wolf, T. Poggio, Object recognition with features inspired by visual cortex, in: 2005 IEEE Computer...

J. Mutch et al., Object class recognition and localization using sparse features with limited receptive fields, Int. J. Comput. Vis. (2008)

M. Riesenhuber et al., Hierarchical models of object recognition in cortex, Nat. Neurosci. (1999)

T. Serre et al., Robust object recognition with cortex-like mechanisms, IEEE Trans. Pattern Anal. Mach. Intell. (2007)


Lin Zhang received the B.S. degree in information and communication engineering from Zhejiang University, China, in 2012. He is currently working toward the M.S. degree in the School of Aeronautics and Astronautics, Zhejiang University, China. His research interests include artificial intelligence, computer vision, and visual navigation.

Ping Li received the Ph.D. in industrial automation from Zhejiang University, China, in 1988. He was a post-doctor at Zhejiang University from 1988 to 1990. He is now a Professor of the School of Aeronautics and Astronautics and the Department of Control Science and Engineering, Zhejiang University, China. His research interests cover process control, UAV projects and intelligent transportation systems. He focuses on solving practical problems in scientific research work.
