Evolutionary multi-objective visual cortex for object classification in natural images

doi:10.1016/j.jocs.2015.10.011

Journal of Computational Science

Volume 17, Part 1, November 2016, Pages 216-233

https://doi.org/10.1016/j.jocs.2015.10.011

Highlights

• We proposed a new methodology for image description.
• We present a multi-objective approach for brain programming.
• We match the state-of-the-art in classifying GRAZ-01 and outperform it in GRAZ-02.

Abstract

In recent years, computer vision systems have used the human visual system as inspiration for solving different tasks such as object detection and classification. Computational models such as the artificial visual cortex (AVC) have shown promising results in solving such problems. Thus, this paper proposes a new methodology for creating an image descriptor vector for classification and, at the same time, finding the objects’ location within the image. Also, this work implements the brain programming paradigm from a multi-objective perspective in order to improve performance in the object classification task. This methodology is used to train the proposed model to classify the images from the GRAZ-01 and GRAZ-02 databases. The solutions found in this research match, and in some cases outperform, other state-of-the-art techniques for classifying the aforementioned databases.

Introduction

Numerous natural systems (brains, immune systems and societies) and artificial systems (parallel and distributed computing, artificial neural networks and evolutionary programs) are generally characterized by behaviors that emerge from non-trivial interactions between a large number of components, often based on hierarchical structures [1]. The complexity of understanding and designing such systems while approaching difficult tasks resides in finding the best interactions. Accordingly, some research communities have focused their efforts on analyzing and creating such systems.

Holland describes a complex adaptive system as the integration of several interdependent entities that collaborate to solve a given task and are able to adapt to environmental changes or variations among the parts [2]. The elements that compose such systems are often called agents [3]. A well-known example of an adaptive system for classification is the simple pattern recognition device presented in Holland's seminal work [4]. In that example, the complexity resides in the large number of possible configurations of an array of binary sensors of size a × b, that is 2^(ab), and in the system's need to find the right configuration for recognizing a given pattern. Today there exist examples of this kind of system that focus on solving computer vision problems.
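To make the size of that configuration space concrete, the following minimal Python sketch counts and, for a very small array, enumerates the on/off states of an a × b binary sensor array. The function names are hypothetical and are not taken from Holland's work or from this paper; the sketch only illustrates the 2^(ab) count.

```python
from itertools import product


def count_configurations(a: int, b: int) -> int:
    """Number of distinct on/off states of an a x b binary sensor array."""
    return 2 ** (a * b)


def enumerate_configurations(a: int, b: int):
    """Yield every configuration as a tuple of a rows, each holding b bits.

    Only practical for tiny arrays, since the count grows as 2^(ab).
    """
    for bits in product((0, 1), repeat=a * b):
        yield tuple(bits[r * b:(r + 1) * b] for r in range(a))


# A 2 x 2 array is small enough to enumerate exhaustively.
small = list(enumerate_configurations(2, 2))
```

Exhaustive enumeration stops being feasible very quickly: an 8 × 8 array already has 2^64 configurations, which is why such a device must search for the right configuration rather than list them all.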

Sight is one of the most important senses for human beings, since it contributes approximately 70% of the information received by the brain. This information helps in the decision-making process performed during interactions with the environment. Several scientific communities have focused their research on understanding the organization of the brain with the goal of emulating it. There are several computational models [5], [6], [7], [8], [9], [10], [11], [12], [13], [14] inspired by the hierarchical structure of the human visual system, its neurophysiological characteristics and neuropsychological theories such as the feature integration theory [15], the biased competition theory [16], the recognition-by-components paradigm [17], the simple and complex cells model [18] and the two path cortical model [19]. These systems approach different visual tasks such as object recognition, detection and classification.

Nowadays there are several evolutionary systems that focus on solving the aforementioned visual tasks. For example, Olague and Trujillo describe a Multi-objective Genetic Programming (MOGP) system for synthesizing interest point detectors for view-based object recognition [20]. Shao et al. proposed a feature learning system for classifying the objects in the Caltech-101 database. They evolve a set of two-dimensional operators using MOGP for creating what they call a “near-optimal” image descriptor [21]. Similarly, Al-Sahaf et al. present two GP-based methods, the One-shot GP and Compound-GP systems, that aim to evolve a program for the task of binary classification of texture images [22].

A computational model of particular interest for this work is known as the artificial visual cortex (AVC) [23], which in turn is derived from the HMAX model [9], since it approaches the object recognition problem based on the human visual cortex. This model proposed by Olague et al. shows great performance at solving the absence/presence problem for object recognition. The AVC is based primarily on two models: a psychological model called feature integration theory and a neurophysiological model called the two pathway cortical model.

The first theory states that the visual attention task in human beings is performed in two stages. The first one is called the pre-attentive stage, where visual information is processed in parallel over the different feature dimensions that compose the scene: shape, color, orientation, spatial frequency, brightness and motion direction. The second stage, called focal attention, integrates the features extracted in the previous stage in order to highlight a region of the scene. Thus, visual attention is the capability of a creature, living or artificial, to focus on an object of interest in a visual environment [24]. Visual attention can be formally defined as “the process that establishes a relationship between the different properties in the scene, perceived through the visual system, with the objective of finding the best aspect for solving the task at hand” [25].
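The two stages can be illustrated with a toy sketch. It assumes that each feature dimension has already been reduced to a small two-dimensional map of responses, that the maps are integrated by rescaling each one to [0, 1] and summing them, and that the highlighted region is simply the cell with the largest sum. These choices, the function names and the sample values are illustrative assumptions, not the operators used by the AVC model.

```python
def normalize(feature_map):
    """Rescale a 2-D list of numbers to the range [0, 1]."""
    flat = [v for row in feature_map for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat map carries no information about where to look.
        return [[0.0 for _ in row] for row in feature_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in feature_map]


def integrate(feature_maps):
    """Assumed focal-attention step: sum the normalized maps cell by cell."""
    normalized = [normalize(m) for m in feature_maps]
    rows, cols = len(normalized[0]), len(normalized[0][0])
    return [[sum(m[r][c] for m in normalized) for c in range(cols)]
            for r in range(rows)]


def most_salient(saliency):
    """Return the (row, col) position of the largest saliency value."""
    cells = ((r, c) for r in range(len(saliency))
             for c in range(len(saliency[0])))
    return max(cells, key=lambda rc: saliency[rc[0]][rc[1]])


# Hypothetical pre-attentive responses for two feature dimensions.
brightness = [[0, 1, 0], [0, 9, 2], [0, 1, 0]]
orientation = [[1, 0, 0], [0, 4, 0], [0, 0, 3]]

saliency = integrate([brightness, orientation])
focus = most_salient(saliency)  # position highlighted by the combined maps
```

Normalizing each map before summing keeps a dimension with a large numeric range, such as brightness here, from dominating the combined map.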

The second theory is the two pathway cortical paradigm. This neurophysiological model states that there are two information routes within the visual cortex, the dorsal and ventral streams. Both subsystems receive the same visual information as input; nevertheless, they differ in the information transformations performed in each of them [27]. The dorsal stream is mainly related to the spatial detection of objects and visual attention [16], [26], whereas the ventral stream is linked to object recognition and shape representation [19].

The natural visual system generally performs the detection process as part of solving the classification task. However, there is no evidence on which process has greater impact at the moment of processing the visual information. In this work, we focus primarily on the classification task; nevertheless, the idea is to optimize the model using a multi-objective perspective by emulating both processes of the natural system, as a strategy for obtaining better classification results.

The main contribution of this work is the study of object classification from a multi-objective perspective, based on the integration of the single objective approaches developed in [23], [30], while extending the preliminary results published in [31]. In this article the system's implementation is detailed following the SPEA2 algorithm.
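As a rough illustration of how SPEA2 ranks candidate solutions, the sketch below computes the strength and raw fitness values defined by that algorithm for a handful of two-objective points. It assumes both objectives are minimized, and the sample values are hypothetical; this excerpt does not state which objectives the authors optimize. The sketch also omits the density estimate and archive truncation that complete SPEA2, so it should not be read as the authors' implementation.

```python
def dominates(p, q):
    """True if p is no worse than q in every objective and strictly
    better in at least one (all objectives minimized)."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def spea2_raw_fitness(objectives):
    """Return (strength, raw) lists for a combined population and archive.

    strength[i]: number of solutions that solution i dominates.
    raw[i]: sum of the strengths of all solutions that dominate i;
            a value of 0 marks a non-dominated solution.
    """
    n = len(objectives)
    strength = [sum(dominates(objectives[i], objectives[j]) for j in range(n))
                for i in range(n)]
    raw = [sum(strength[j] for j in range(n)
               if dominates(objectives[j], objectives[i]))
           for i in range(n)]
    return strength, raw


# Hypothetical (objective_1, objective_2) pairs, both to be minimized.
points = [(0.10, 0.40), (0.20, 0.20), (0.40, 0.10), (0.30, 0.30), (0.50, 0.50)]
strength, raw = spea2_raw_fitness(points)
nondominated = [p for p, r in zip(points, raw) if r == 0]
```

In this toy population the first three points trade one objective against the other and none dominates another, so all three receive a raw fitness of 0 and form the current Pareto front.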

This methodology has been implemented for classifying the Person class from the GRAZ-02 database [31], which is part of the European network PASCAL's Visual Objects Classes challenge. Moreover, in this work we optimize the system for classifying four other classes: Bike and Person from GRAZ-01, and Bike and Cars from GRAZ-02. We opted for these five classes since they are used as a testing standard for image classification [29], [11], [14], [13], and also because each evolutionary run requires considerable computational time, as will be explained in Section 3.

As part of the contributions, this work presents a cross-validation process, since GP has been criticized for over-training; this test shows the performance of the methodology and validates the results of the learning strategy explained in the following section. In addition, a frequency-of-use analysis of the results shows how often the functions and terminals are applied in the proposed model while processing the GRAZ databases.

The remainder of this paper is organized as follows. Section 2 details the stages of our approach using a multi-objective evolutionary framework, where we describe the AVCMO model focusing on the proposed methodology for building the image descriptor and the brain programming algorithm under a multi-objective approach. Then, Section 3 provides the performance of the AVCMO model for image classification of GRAZ-01 and GRAZ-02 classes. Finally, the conclusions for this work are given in Section 4.

Section snippets

Methodology

The AVC model was designed for classifying images regardless of color, orientation, illumination conditions, scale or position of the object of interest [28]. One of its innovations is the way it selects prominent image features in order to build an abstract representation of the object. Hence, the system seeks prominent points in the image in order to build an image descriptor which is later used for classification. For this reason, when processing images where the object of interest occupies

Experiments and results

In this work, we approach the classification problem from a presence/absence perspective. We follow a protocol composed of three steps; the first two define the training stage of the model, while the third one corresponds to the testing phase. Therefore, we need three image sets for the experiments, one per step. This protocol is described next:

1. Training: this step starts by evaluating each solution with an image set called training; one image descriptor is created per image. Then, these

Conclusions and future work

This paper proposed a methodology for creating an image descriptor vector using the AVCMO model for classification purposes. The system builds the descriptor using visual information taken from images of the object of interest by extracting information exclusively from the image region where the object is located; hence, implicitly finding its location. The AVCMO models were optimized through the evolutionary system called BP following a multi-objective design. The proposed strategy was applied

Acknowledgments

This work was funded by CONACyT México through the research project 155045 – “Evolución de Cerebros Artificiales en Visión por Computadora”. First author supported by scholarship 267339/220773.

Daniel E. Hernández is a Ph.D. candidate in Computer Science at Centro de Investigación Científica y de Educación Superior de Ensenada, B.C., (CICESE), Mexico. He received the M.Sc. degree in Computer Science in 2011 from CICESE and he holds a Bachelor's degree in Computer Engineering from Universidad Autónoma de Baja California (UABC), Mexico. He is working in the EvoVision research team. His research interests include computer vision, robotics, evolutionary computation and bio inspired

References (37)

A.M. Treisman et al.
A feature-integration theory of attention
Cogn. Psychol.
(1980)
M. Mishkin et al.
Object vision and spatial vision: two cortical pathways
TINS
(1983)
G. Olague et al.
Evolutionary-computer-assisted design of image operators that detect interest points using genetic programming
Image Vis. Comput.
(2011)
R. Farivar
Dorsal-ventral integration in object recognition
Brain Res. Rev.
(2009)
G. Olague et al.
Interest point detection through multiobjective genetic programming
Appl. Soft Comput.
(2012)
J. Behnamian
A parallel competitive colonial algorithm for JIT flowshop scheduling
J. Comput. Sci.
(2014)
S. Chan
Complex Adaptive Systems, ESD.83 Research Seminar in Engineering Systems
(2001)
J.H. Holland
Studying Complex Adaptive Systems
J. Syst. Sci. Complex.
(2006)
M. Wagner et al.
Evolving agent-based models using self-adaptive complexification
J. Computat. Sci.
(2015)
J.H. Holland
Complex Adaptive Systems
Daedalus Res. Libr.
(1992)

K. Fukushima
Neural network model for selective attention in visual pattern recognition and associative recall
Appl. Opt.
(1987)
B. Olshausen et al.
A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information
J. Neurosci.
(1993)
D. Walther et al.
Attentional selection for object recognition – a gentle way
Biol. Motiv. Comput. Vis.
(2002)
L. Itti et al.
A model of saliency-based visual attention for rapid scene analysis
IEEE Trans. Pattern Anal. Mach. Intell.
(1998)
M. Riesenhuber et al.
Hierarchical models of object recognition in cortex
Nat. Neurosci.
(1999)
T. Serre et al.
Theory of object recognition: computations and circuits in the feedforward path of the ventral stream in primate visual cortex. Technical report
(2005)
J. Mutch et al.
Object class recognition and localization using sparse features with limited receptive fields
Int. J. Comput. Vis.
(2008)
H. Wersing et al.
Learning optimized features for hierarchical models of invariant object recognition
Neural Comput.
(2003)


Eddie Clemente received the Ph.D. degree in Computer Science in 2015 and the M.Sc. degree in Computer Science in 2006, both from the Centro de Investigación Científica y de Educación Superior de Ensenada, B.C., (CICESE), Mexico. He holds a Bachelor's degree in Mechatronics Engineering from UPIITA-IPN, Mexico. He is working as a member of the Robotics and Control research team at the Instituto Tecnológico de Ensenada. His research interests include evolutionary computer vision, robotics and evolutionary computation.

Gustavo Olague received the Ph.D. degree in Computer Vision, Graphics and Robotics from INPG and INRIA. He is currently a Professor in the Computer Science Department at CICESE in Ensenada. Professor Olague has written over a hundred conference and journal papers and co-edited two special issues in Pattern Recognition Letters and Evolutionary Computation, and has served as co-chair of the Real-World Application track at the Genetic and Evolutionary Computation Conference. Dr. Olague has received numerous distinctions such as the Talbert Abrams award offered by the ASPRS; best paper awards at major conferences like GECCO, EvoIASP, and EvoHOT; and has twice received the Bronze Medal at the Human-Competitive awards at GECCO. He is the author of the book Evolutionary Computer Vision published by Springer.

José L. Briseño received the M.Sc. degree in Electronic Instrumentation and Telecommunications in 1983 from the Scientific Research Center and Higher Education of Ensenada (CICESE), Mexico. He holds a bachelor's degree in Communications and Electronics from the Guadalajara University, Mexico, graduated in 1977. Since 2000 he has been a full professor of artificial intelligence at CICESE, where he is an associate researcher at the EvoVision laboratory from the Computer Science Department. His research focus is machine learning and knowledge processing by ontologies.
