A novel ant colony optimization based single path hierarchical classification algorithm for predicting gene ontology

doi:10.1016/j.asoc.2013.11.012

Applied Soft Computing

Volume 16, March 2014, Pages 34-49

https://doi.org/10.1016/j.asoc.2013.11.012

Highlights

• hAntMiner-C is a hierarchical classifier that handles tree and DAG topologies.
• Our classifier can handle single path hierarchical classification.
• A detailed review of hierarchical single-label and multi-label classification is included.
• hAntMiner-C is tested on ion-channel datasets.
• Our classifier performs statistically significantly better than its competitors.

Abstract

Numerous state-of-the-art classification algorithms are designed to handle data with nominal or binary class labels. Less attention has been given to classification problems where the classes are organized as a structured hierarchy, such as protein function prediction (the target area of this work), test scores, gene ontology, web page categorization and text categorization. The structured hierarchy is usually represented as a tree or a directed acyclic graph (DAG) in which an IS-A relationship holds among the class labels. Class labels at the upper levels of the hierarchy are more abstract and easier to predict, whereas class labels at deeper levels are more specific and harder to predict correctly. It is helpful to consider this class hierarchy when designing a hypothesis that can handle the tradeoff between prediction accuracy and prediction specificity. In this paper, a novel ant colony optimization (ACO) based single path hierarchical classification algorithm is proposed that incorporates the given class hierarchy during its learning phase. The algorithm produces an ordered IF–THEN rule list and thus offers a comprehensible classification model. A detailed discussion of the architecture and design of the proposed technique is provided, followed by an empirical evaluation on six ion-channel data sets (related to protein function prediction) and two publicly available data sets. Compared with existing methods, the performance of the algorithm is encouraging: the differences in prediction accuracy and specificity are statistically significant according to Student's t-test, which confirms the promise of the proposed technique for hierarchical classification tasks.

Introduction

Advanced sensing, capturing and computing technologies enable us to collect large amounts of complex (possibly raw) data in many fields of life, but how do we know which portion of the data is important and offers insight that helps our decision making? Moreover, when this problem arises in the medical domain and the decision becomes a matter of life and death for a patient, finding the correct set of information becomes even more crucial. Data mining techniques can be used to extract implicit, previously unknown and potentially useful [1] patterns and knowledge of interest from these vast data stores for varied purposes. One such important data mining technique, known as classification, is successfully used in a myriad of applications, e.g. decision making, fraud detection, medical diagnosis, credit scoring, customer relationship management, character recognition, speech recognition and protein function prediction.

Numerous classification techniques have been proposed over the decades, such as Decision Trees, Neural Networks (NN), k-Nearest Neighbors (k-NN), Logistic Regression and Support Vector Machines (SVM) [2], [3], [4]. Some of these techniques (e.g. NN and SVM) produce classification models that are opaque to common users, while others, e.g. the IF–THEN rule lists produced with decision trees, are easily comprehensible to experts working in different domains. These techniques are reported to perform well in various domains and are considered computationally efficient (e.g. SVM), robust to noisy data (e.g. decision trees) and easy to learn (e.g. k-NN). However, most of these classification techniques are designed to handle data with binary or nominal class labels (where class labels are independent). These classical strategies lack the ability to handle problems where the class labels are related and are organized in a class hierarchical structure (CHS). The latter case, known as hierarchical classification, is a more complex instance of the classification problem than one-level flat classification [5]. Hierarchical classification has applications in various domains.

This research work focuses on hierarchical protein function prediction defined under the scheme of the Gene Ontology (GO), specifically the “molecular function” domain. The GO structure [6] represents the relationships among protein functions using a directed acyclic graph (DAG) CHS. It is well known that a protein can perform multiple functions (considered as class labels) and that these functions are usually related (modeled as a tree or DAG based IS-A relationship), which makes protein function prediction a suitable problem for the proposed algorithm. A large amount of uncharacterized protein data is available for analysis, which has led to increased interest in computational methods that support the investigation of the role of proteins in an organism [7], [8]. Analyzing the functions of proteins in different organisms is crucial for improving biological knowledge and the diagnosis and treatment of diseases. It is not possible to conduct biological experiments for the functional assay analysis of every uncharacterized protein, owing to the high cost and the human effort involved [9]. This raises the need for computational methods (especially from the data mining domain, like the one proposed in this paper) that can be used for this purpose.

In hierarchical classification, the class labels to be predicted are naturally organized as a class hierarchy (taxonomy), typically represented as a tree or DAG; see e.g. Fig. 1a and b. The class labels in the hierarchy are represented as nodes and the relationships between the class labels are shown with undirected edges. In a tree structure, a node can have only one parent, whereas no such restriction is imposed on a DAG CHS. Predicting a single class label in the hierarchy implies that all of its ancestor class labels are also predicted. In other words, a single class label corresponds to a path from the root node to the predicted node (explained later) that is consistent with the IS-A relationship.
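The implication from a predicted class to its ancestors can be illustrated with a minimal sketch. The hierarchy below (`PARENTS`) and the function names are hypothetical and chosen only for illustration; they are not taken from the paper. Class "c" has two parents, which is allowed in a DAG but not in a tree.

```python
from typing import Dict, List, Set

# Hypothetical DAG class hierarchy: each class maps to its list of parents.
# "c" has two parents ("a" and "b"), so this structure is a DAG, not a tree.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "c": ["a", "b"],
    "d": ["c"],
}


def ancestors(label: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every class reachable from `label` by following parent links."""
    found: Set[str] = set()
    stack = list(parents[label])
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(parents[node])
    return found


def implied_labels(label: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Predicting `label` implicitly predicts all of its ancestors (IS-A)."""
    return {label} | ancestors(label, parents)


if __name__ == "__main__":
    print(sorted(implied_labels("d", PARENTS)))
```

In this sketch, predicting "d" implies "c", both of its parents "a" and "b", and the root.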

In the class hierarchy, nodes at the upper levels represent more general class labels, whereas nodes at the lower levels represent more specific class labels. General class labels are easy to predict because numerous examples related to them are available to the hypothesis learner. Classes at the deeper levels of the hierarchy (i.e. specific classes) are difficult to predict because less information is available to discriminate among them. There is always a tradeoff between generality and specificity in hierarchical classification. In Fig. 2, a dataset is given with the corresponding CHS for different animals. Given a test example from this dataset, if we predict the class label ‘Animal’, the prediction is 100% accurate but gives no valuable information about the specific class of animal. Predicting the specific class of animal is more informative, but the chance of an erroneous prediction is higher.
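The tradeoff can be made concrete with a small sketch. Fig. 2 is not reproduced here, so the animal hierarchy below is a hypothetical stand-in, and the helper `assess` is an illustrative name rather than anything defined in the paper. It reports two things for a prediction: whether it is consistent with the true class (the predicted node lies on the true class's path from the root) and how deep, i.e. how specific, it is.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical tree hierarchy (child -> parent); not the one shown in Fig. 2.
PARENT: Dict[str, Optional[str]] = {
    "Animal": None,
    "Mammal": "Animal",
    "Bird": "Animal",
    "Dog": "Mammal",
    "Cat": "Mammal",
    "Eagle": "Bird",
}


def path_to_root(label: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the path from the root down to `label`."""
    path: List[str] = []
    node: Optional[str] = label
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def assess(predicted: str, true: str,
           parent: Dict[str, Optional[str]]) -> Tuple[bool, int]:
    """Return (consistent with the true class?, depth of the prediction)."""
    consistent = predicted in path_to_root(true, parent)
    depth = len(path_to_root(predicted, parent)) - 1
    return consistent, depth


if __name__ == "__main__":
    for guess in ("Animal", "Mammal", "Dog", "Cat"):
        print(guess, assess(guess, "Dog", PARENT))
```

For a true class of "Dog", the guess "Animal" is always consistent but has depth 0, while "Cat" is as specific as "Dog" (depth 2) yet wrong.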

To identify the types of problems that our proposed algorithm can handle, more information is provided below. Based on the class label(s) associated with an example, hierarchical classification has two further categories [9]:

(1) Hierarchical Single Label (path) Classification: In this type of classification, an example is associated with only one class label at any level of the class hierarchy.
(2) Hierarchical Multi-label (path) Classification: In this type of classification, an example can be associated with more than one class label at any level of the hierarchy (multi-paths in the hierarchy).
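The distinction between the two categories can be checked mechanically. The following is a minimal sketch for a tree-shaped hierarchy only (a DAG would need a different treatment); the hierarchy `PARENT` and the function names are hypothetical. A label set is treated as single path when, after adding all implied ancestors, no level of the hierarchy holds more than one label.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Set

# Hypothetical tree hierarchy (child -> parent), for illustration only.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "1": "root",
    "2": "root",
    "1.1": "1",
    "1.2": "1",
    "2.1": "2",
}


def depth(label: str, parent: Dict[str, Optional[str]]) -> int:
    """Number of edges between `label` and the root."""
    d = 0
    while parent[label] is not None:
        label = parent[label]
        d += 1
    return d


def close_under_ancestors(labels: Iterable[str],
                          parent: Dict[str, Optional[str]]) -> Set[str]:
    """Add every ancestor implied by the IS-A relationship."""
    closed: Set[str] = set()
    for label in labels:
        node: Optional[str] = label
        while node is not None and node not in closed:
            closed.add(node)
            node = parent[node]
    return closed


def is_single_path(labels: Iterable[str],
                   parent: Dict[str, Optional[str]]) -> bool:
    """True if at most one label occurs at each level of the hierarchy."""
    closed = close_under_ancestors(labels, parent)
    per_level = Counter(depth(label, parent) for label in closed)
    return all(count == 1 for count in per_level.values())


if __name__ == "__main__":
    print(is_single_path({"1.1"}, PARENT))
    print(is_single_path({"1.1", "1.2"}, PARENT))
```

Here {"1.1"} is a single path (root, 1, 1.1), whereas {"1.1", "1.2"} places two labels at the same level and is therefore a multi-label (multi-path) case.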

Hierarchical classification problems can further be divided into two categories based on the level (depth) of the predicted class label [9], [10]: (1) Mandatory Leaf Node Prediction (MLNP), and (2) Optional Leaf Node Prediction (OLNP), also called Non-Mandatory Leaf Node Prediction (NMLNP). Under MLNP, a classifier must predict at least one of the leaf class labels (from the CHS) when classifying a test example. OLNP problems are more flexible: the classifier can predict class label(s) for a test example at any level of the class hierarchy. The proposed algorithm handles only hierarchical single path classification problems, considering only the OLNP case.
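The difference between MLNP and OLNP amounts to a constraint on which nodes may be returned as a prediction. The sketch below illustrates that constraint for a single predicted label; the hierarchy `CHILDREN`, the function names and the `mandatory_leaf` flag are assumptions made for illustration, not part of the proposed algorithm.

```python
from typing import Dict, List

# Hypothetical hierarchy (parent -> children); leaves have no children.
CHILDREN: Dict[str, List[str]] = {
    "root": ["1", "2"],
    "1": ["1.1", "1.2"],
    "2": [],
    "1.1": [],
    "1.2": [],
}


def is_leaf(label: str, children: Dict[str, List[str]]) -> bool:
    """A leaf is a class with no child classes."""
    return not children[label]


def is_valid_prediction(label: str, children: Dict[str, List[str]],
                        mandatory_leaf: bool) -> bool:
    """Check a predicted label against the MLNP or OLNP constraint.

    mandatory_leaf=True  -> MLNP: only leaf classes are acceptable.
    mandatory_leaf=False -> OLNP: any class in the hierarchy is acceptable.
    """
    if label not in children:
        return False  # unknown class
    if mandatory_leaf:
        return is_leaf(label, children)
    return True


if __name__ == "__main__":
    print(is_valid_prediction("1", CHILDREN, mandatory_leaf=True))
    print(is_valid_prediction("1", CHILDREN, mandatory_leaf=False))
```

The internal node "1" is rejected under MLNP but accepted under OLNP, while a leaf such as "1.1" is acceptable in both settings.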

The remainder of this paper is organized as follows. In the next section, we review related research on the hierarchical classification task. In Section 3, we briefly present the basics and background of the ACO meta-heuristic. In Section 4, we discuss the architecture of the proposed solution. In Section 5, we present simulation results that show the promise of our technique. Finally, Section 6 concludes this work.

Section snippets

Related work

One simple approach to hierarchical classification problems is to completely ignore the given CHS by using a flat classification algorithm (e.g. decision tree or SVM), predicting only leaf class nodes. This approach provides an indirect solution to the hierarchical classification problem because, if a class at a leaf node is predicted, all the ancestor classes (considering the IS-A relationship) are also implicitly assigned to the instance being classified. However, this approach

Ant colony optimization

Swarm Intelligence [16], [17], [18], which deals with the collective behavior of small and simple entities, has been used in many application domains. ACO, proposed in the early 1990s [19], [20], [21], [22], is one of the most famous meta-heuristics under the umbrella of Swarm Intelligence. Since its inception, ACO has been used to solve many complex problems, including those related to data mining [23], [24], [25] as well as other combinatorial optimization problems. ACO is inspired by the food

Method

In this section, we discuss the different stages of the proposed ACO based hierarchical classification algorithm. We begin with the definition of the problem tackled in this work, followed by a brief general description of the proposed algorithm. Afterwards, each stage of the approach is discussed in detail. The stages are: search space design, rule construction based on pheromone and a correlation based heuristic function, rule evaluation, rule pruning, pheromone

Results and discussion

In this section, we present the simulation results of our proposed method (hAM-C) in comparison with another ACO based hierarchical single path classification algorithm (hAM), proposed in the work of Otero et al. [7]. The proposed algorithm is implemented in C# using the Microsoft Visual Studio (2008) development environment. For hAM [7], a Java based implementation was kindly made available by Otero. All the experiments are conducted on an Intel Core i3 Processor

Conclusion

In this article, we have presented a novel ant colony optimization based single path hierarchical classification algorithm, named hAM-C. A detailed review of the different types of hierarchical classification problems and the corresponding categories of solutions is also provided to help readers understand the target problem. Extending the ideas of our previous flat classification algorithm AntMiner-C, hAM-C is tailored to handle the hierarchical

References (37)

Y.-L. Chen et al., Constructing a decision tree from data with hierarchical class labels, Expert Systems with Applications (2009)
J.R. Quinlan, C4.5: Programs for Machine Learning (1993)
J.R. Quinlan, Generating production rules from decision trees
V.N. Vapnik, The Nature of Statistical Learning Theory (1995)
A. Freitas et al., A tutorial on hierarchical classification with applications in bioinformatics
M. Ashburner, The gene ontology: tool for the unification of biology, Nature Genetics (2000)
F.E.B. Otero et al., A hierarchical classification ant colony algorithm for predicting gene ontology terms
F.E.B. Otero et al., A hierarchical multi-label classification ant colony algorithm for protein function prediction, Memetic Computing (2010)
F.E.B. Otero, New Ant colony optimization algorithms for hierarchical classification of protein functions (2010)
C.N. Silla et al., A survey of hierarchical classification across different application domains, Data Mining & Knowledge Discovery (2010)
D. Koller et al., Hierarchically classifying documents using very few words
C. Vens et al., Decision trees for hierarchical multi-label classification, Machine Learning (2008)
H. Blockeel et al., Top-down induction of clustering trees
R.S. Parpinelli et al., Data mining with an ant colony optimization algorithm, IEEE Transactions on Evolutionary Computation (2002)
F. Otero et al., cAnt-Miner: an ant colony classification algorithm to cope with continuous attributes
A.P. Engelbrecht, Computational Intelligence, An Introduction (2007)
A.P. Engelbrecht, Fundamentals of Computational Swarm Intelligence (2005)
J. Kennedy et al., Swarm Intelligence (2001)

