A novel ant colony optimization based single path hierarchical classification algorithm for predicting gene ontology
Introduction
Advanced sensing, capturing and computing technologies enable us to collect large amounts of complex (possibly raw) data in many fields of life, but how do we know which portion of the data is important and gives us insight to support our decision making process? Moreover, when this problem arises in the medical domain and the decision becomes a matter of life and death for a patient, finding the correct set of information becomes even more crucial. Data mining techniques can be used to extract implicit, previously unknown and potentially useful [1] patterns and knowledge of interest from these vast data stores for varied purposes. One such important data mining technique, known as classification, is successfully used in a myriad of applications, e.g. decision making, fraud detection, medical diagnosis, credit scoring, customer relationship management, character recognition, speech recognition and protein function prediction.
Numerous classification techniques have been proposed over the decades, such as Decision Trees, Neural Networks (NN), k-Nearest Neighbors (k-NN), Logistic Regression and Support Vector Machines (SVM) [2], [3], [4]. Some of these techniques (e.g. NN and SVM) produce classification models that are usually opaque to common users, while others, e.g. IF–THEN rule lists produced from decision trees, are easily comprehensible to experts working in different domains. These techniques are reported to perform well in various domains and are considered computationally efficient (e.g. SVM), robust to noisy data (e.g. decision trees) and easy to learn (e.g. k-NN). However, most of these classification techniques are designed to handle data with binary or nominal class labels (where the class labels are independent). These classical strategies lack the ability to handle problems where the class labels are related and are organized in a class hierarchical structure (CHS). The latter is a more complex instance of the classification problem, known as hierarchical classification, than the one-level flat classification problem [5]. Hierarchical classification has applications in various domains.
This research work focuses on hierarchical protein function prediction defined under the scheme of the Gene Ontology (GO), specifically the “molecular function” domain. The GO structure [6] represents the relationships among protein functions using a directed acyclic graph (DAG) CHS. It is well known that a protein can perform multiple functions (considered as class labels) and that these functions are usually related (modeled as a tree or DAG based IS-A relationship), which makes protein function prediction a suitable problem for the proposed algorithm. The large amount of uncharacterized protein data available for analysis has led to an increased interest in computational methods to support the investigation of the role of proteins in an organism [7], [8]. Analyzing the functions of proteins in different organisms is crucial to improving biological knowledge and the diagnosis and treatment of diseases. It is not possible to conduct biological experiments for the functional assay analysis of every uncharacterized protein because of the high cost and the human effort involved [9]. This raises the need for computational methods (especially from the data mining domain, like the one proposed in this paper) that can be used for this purpose.
In hierarchical classification, the class labels to be predicted are naturally organized as a class hierarchy/taxonomy, typically represented as a tree or DAG, see e.g. Fig. 1a and b. The class labels in the hierarchy are represented as nodes and the relationships between the class labels are shown with undirected edges. In a tree structure, a node can have only one parent, whereas no such restriction is imposed on a DAG CHS. Predicting a single class label in the hierarchy implies that all of its ancestor class labels are also predicted. In other words, a single class label corresponds to a path from the root node to the predicted child node (explained later) that is consistent with the IS-A relationship.
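The implication from a predicted class label to its ancestors can be illustrated with a minimal Python sketch. The hierarchy below is a hypothetical toy example (the labels `root`, `A` to `E` are assumptions and are not taken from Fig. 1), and the function names are illustrative only; this is not the implementation used in this work.

```python
# Hypothetical toy class hierarchy (not taken from the paper's figures).
# Each class label maps to the list of its parents; a DAG allows several.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A"],
    "D": ["A", "B"],  # two parents: legal in a DAG, not in a tree
    "E": ["D"],
}


def ancestors(label, parents=PARENTS):
    """Return every class label reachable by following IS-A edges upward."""
    seen = set()
    stack = list(parents[label])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def implied_labels(label, parents=PARENTS):
    """Predicting `label` implies the label itself plus all of its ancestors."""
    return {label} | ancestors(label, parents)


def is_tree(parents=PARENTS):
    """A hierarchy is tree-shaped only if no node has more than one parent."""
    return all(len(p) <= 1 for p in parents.values())
```

In this toy hierarchy, predicting `E` also assigns `D`, `A`, `B` and `root`, because `D` has two parents; in a tree-shaped CHS the implied labels would always form a single chain.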
Within the class hierarchy, nodes at the upper levels represent more general class labels, whereas nodes at the lower levels represent more specific class labels. General class labels are easy to predict because numerous examples related to them are available (to the hypothesis learner). On the other hand, classes at the deeper levels of the hierarchy (i.e. specific classes) are difficult to predict because less information is available to discriminate among them. There is always a tradeoff between generality and specificity in hierarchical classification. In Fig. 2, a dataset is given with the corresponding CHS for different animals. Given a test example (from this dataset), if we predict the class label ‘Animal’, the prediction is 100% accurate but we gain no valuable information about the specific class of animal. Predicting the specific class of animal is more important, but the chance of an erroneous prediction is higher.
To identify the types of problems that our proposed algorithm can handle, more information is provided as follows. Based on the class label(s) associated with an example, hierarchical classification has two further categories [9]:
- (1) Hierarchical Single Label (path) Classification: In this type of classification, an example is associated with only one class label at any level of the class hierarchy.
- (2) Hierarchical Multi-label (path) Classification: In this type of classification, an example can be associated with more than one class label at any level of the hierarchy (multi-paths in the hierarchy).
Hierarchical classification problems can further be divided into two categories based on the level (depth) of the predicted class label [9], [10]: (1) Mandatory Leaf Node Prediction (MLNP), and (2) Optional Leaf Node Prediction (OLNP), also called Non-Mandatory Leaf Node Prediction (NMLNP). Under MLNP, it is mandatory for a classifier to predict at least one of the leaf class labels (from the CHS) when classifying a test example. OLNP problems are more flexible: the classifier can predict class label(s) for a test example at any level of the class hierarchy. The proposed algorithm handles only hierarchical single path classification problems, and only the OLNP case.
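The two distinctions above (single path versus multi-path, and MLNP versus OLNP) can be made concrete with a short Python sketch. It assumes a hypothetical tree-shaped hierarchy with made-up labels (`root`, `A`, `B`, `A1`, `A2`); the helper names are illustrative assumptions and do not come from this work.

```python
# Hypothetical tree-shaped hierarchy: each non-root label maps to its parent.
ROOT = "root"
PARENT = {"A": ROOT, "B": ROOT, "A1": "A", "A2": "A"}


def is_leaf(label):
    """A label is a leaf if no other label names it as its parent."""
    return label not in PARENT.values()


def path_from_root(label):
    """List the labels on the path from just below the root down to `label`."""
    path = []
    while label != ROOT:
        path.append(label)
        label = PARENT[label]
    return path[::-1]


def is_single_path(labels):
    """True if the label set is exactly one root-to-node path (single path case)."""
    labels = set(labels)
    return any(set(path_from_root(label)) == labels for label in labels)


def is_valid_prediction(label, mode):
    """Check a predicted label against the MLNP or OLNP requirement."""
    if label not in PARENT:  # unknown label, or the root itself
        return False
    if mode == "MLNP":
        return is_leaf(label)  # must stop at a leaf
    if mode == "OLNP":
        return True  # may stop at any level
    raise ValueError("mode must be 'MLNP' or 'OLNP'")
```

Here the label set {`A`, `A1`} is a single path, whereas {`A`, `B`} has two labels at the same level and is therefore a multi-path case. Predicting `A` is acceptable under OLNP but not under MLNP, since `A` is an internal node.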
The remainder of this paper is organized as follows. In the next section, we review related research on the hierarchical classification task. In Section 3, we briefly present the basics and background of the ant colony optimization (ACO) meta-heuristic. In Section 4, the architecture of the proposed solution is discussed. Subsequently, in Section 5, we present simulation results to show the promising ability of our technique. Finally, Section 6 concludes this work.
Section snippets
Related work
One simple approach to hierarchical classification problems is to completely ignore the given CHS and use a flat classification algorithm (e.g. decision tree or SVM), predicting only leaf class nodes. This approach provides an indirect solution to the hierarchical classification problem because, if a class at a leaf node is predicted, all the ancestor classes (considering the IS-A relationship) are also implicitly assigned to the instance being classified. However, this approach
Ant colony optimization
Swarm Intelligence [16], [17], [18], which deals with the collective behavior of small and simple entities, has been used in many application domains. ACO, proposed in the early 1990s [19], [20], [21], [22], is one of the most famous meta-heuristics under the umbrella of Swarm Intelligence. Since its inception, ACO has been used to solve many complex problems including those related to data mining [23], [24], [25] as well as other combinatorial optimization problems. ACO is inspired by the food
Method
In this section, we discuss the different stages of the proposed ACO based hierarchical classification algorithm. We begin with the definition of the problem tackled in this work, followed by a brief general description of the proposed algorithm. Afterwards, each stage of the approach is discussed in a fair amount of detail. The stages are: search space design, rule construction based on pheromone and a correlation based heuristic function, rule evaluation, rule pruning, pheromone
Results and discussion
In this section, we present the simulation results of our proposed method (hAM-C) in comparison with another ACO based hierarchical single path classification algorithm (hAM), proposed in the work of Otero et al. [7]. The proposed algorithm is implemented in the Microsoft Visual Studio (2008) development environment using the C# language. For hAM [7], a Java based implementation was kindly made available by Otero. All the experiments are conducted on an Intel Core i3 Processor
Conclusion
In this article, we have presented a novel ant colony optimization based single path hierarchical classification algorithm, named hAM-C. A detailed review of the different types of hierarchical classification problems and the different categories of corresponding solutions is also provided to help readers understand the target problem. Extending the ideas of our previous flat classification algorithm AntMiner-C, hAM-C is tailored to handle the hierarchical
References (37)
- et al. Constructing a decision tree from data with hierarchical class labels. Expert Systems with Applications (2009)
- C4.5: Programs for Machine Learning (1993)
- Generating production rules from decision trees
- The Nature of Statistical Learning Theory (1995)
- et al. A tutorial on hierarchical classification with applications in bioinformatics
- The gene ontology: tool for the unification of biology. Nature Genetics (2000)
- et al. A hierarchical classification ant colony algorithm for predicting gene ontology terms
- et al. A hierarchical multi-label classification ant colony algorithm for protein function prediction. Memetic Computing (2010)
- New Ant colony optimization algorithms for hierarchical classification of protein functions (2010)
- et al. A survey of hierarchical classification across different application domains. Data Mining & Knowledge Discovery (2010)
- Hierarchically classifying documents using very few words
- Decision trees for hierarchical multi-label classification. Machine Learning
- Top-down induction of clustering trees
- Data mining with an ant colony optimization algorithm. IEEE Transactions on Evolutionary Computation
- cAnt-Miner: an ant colony classification algorithm to cope with continuous attributes
- Computational Intelligence, An Introduction
- Fundamentals of Computational Swarm Intelligence
- Swarm Intelligence