Combination of classification and regression in decision tree for multi-labeling image annotation and retrieval

doi:10.1016/j.asoc.2012.10.019

Applied Soft Computing

Volume 13, Issue 2, February 2013, Pages 1292-1302

https://doi.org/10.1016/j.asoc.2012.10.019

Abstract

This paper proposes a semantic-based image retrieval approach, that is, one that lets users search image datasets by keyword. This is made possible by attaching textual metadata to images, a process called image annotation. A combination of classification and regression in a decision tree (DT) is employed for multi-labeling image annotation, in which more than one label is considered for every single tuple. In the proposed approach, all concepts and their corresponding ranks are stored in each DT leaf node, instead of only a single concept or a single rank. We use a hierarchical network of semantics to achieve better performance. The main idea behind our approach is that, in each leaf node, the system should give a higher rank to the concepts with the highest degree of purity and detail according to the prepared hierarchical semantic network. A segmented, feature-extracted and annotated image dataset, SAIAPR-TC12, is used for evaluation. Its hierarchy of 256 semantic concepts, used in the annotation process, makes it very suitable for testing the approach. Experimental results confirm that our approach performs better than single-labeling approaches, which assign only one class to every single tuple and support only linear relationships among concepts.

Highlights

► The evaluation method has been changed to 4-fold cross validation. ► F-Measure has been added as another measure for performance evaluation. ► Some grammar errors have been fixed.

Introduction

With the rapid growth in the amount of multimedia information such as digital images, systems that organize it for search and retrieval have become necessary. In the last decade, image retrieval (IR) has attracted a great deal of research aimed at simplifying the organization of huge numbers of images [5]. There are three main generations of IR systems [1]. Text-based image retrieval systems came first and act only on text metadata provided by humans: people assigned tags to images, and the system retrieved images based on those labels. Newer systems employ web mining as their metadata provider [11]. Today's image search engines such as Google¹ and Yahoo² can be mentioned as modern text-based image retrieval systems that act only on the text surrounding images in web pages. With manual annotation, a large collection of images took a great deal of time to annotate, which was exhausting for humans. Another problem was the subjectivity of human annotation: different people may infer different things from the same image.

These two problems made text-based systems deficient, and so a system for automatic processing and retrieval of images was sought. In the second generation, content-based image retrieval (CBIR) systems, or visual information retrieval systems (VIRS), which process image features automatically, became prevalent [5], [6]. The classical paradigm for content-based image retrieval is query by visual example [24]. The main difference between these two kinds of system is that a human is the main part of the former [4]. CBIR systems return the images in the database that are most similar to the query image provided by the user [18]. One of their biggest shortcomings was that they did not look for concepts within an image and ranked image similarity only by visual content such as color, texture and shape [21]. For example, two images with different concepts, such as Sunset and Orange, might be considered similar because they have similar color histograms. Another problem was that a query image was always required [1].

These issues left users unsatisfied, and semantic-based image retrieval (SBIR) systems appeared as a solution. Semantic-based image retrieval systems can detect the concepts in images and enable users to look for high-level semantics within images regardless of their low-level features. The development of image retrieval systems led to the emergence of Automatic Image Annotation (AIA) systems, in which a machine takes the role that humans play in text-based systems and provides textual metadata for images based on their low-level features [8], [12], [13], [14], [25]. Using AIA, a user can retrieve images by looking for their semantics. The main goal of image annotation is to make searching for images by keyword feasible [2], [22]. The main idea of AIA is to automatically learn a model from a large number of image samples [1]. In this paper we use a decision tree (DT) to learn that model.

The decision tree, a tool that selects the most discriminatory features, is comprehensible to humans and can deal with noisy and incomplete data, has been widely applied in classification and data mining problems [15], [16], [26], [27]. The similarity between the way humans interpret images and the way a decision tree induces concepts makes this tool very applicable to image classification and retrieval [9].

Among the different DT construction algorithms, ID3, C4.5 and CART are the best known [1]. These algorithms differ from each other in three aspects: (1) the feature type they support (continuous vs. discrete), (2) the feature selection criterion and (3) the final node insertion process.

ID3 is known as the first and simplest DT construction algorithm. Although it shares some features with the others, such as selecting the most discriminatory features, it has some special characteristics. Simplicity and comprehensibility are its advantages. On the other hand, it supports only discrete values as input, and no matter how efficient the discretization algorithm is, ID3 has to work with data that no longer reflect the real values when continuous values, such as image features, are presented. C4.5 [19] was developed to address this problem, but it can be used only for data classification. CART [20] is another well-known DT construction algorithm; it creates a binary tree and can also be used in regression problems.
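To make the difference in feature selection criteria concrete, the following minimal sketch computes the split measures usually associated with these algorithms: information gain (ID3), gain ratio (C4.5) and Gini impurity reduction (CART). The function names and the toy data are illustrative assumptions and are not taken from the paper; the feature is assumed to be already discrete, and the Gini example uses a two-valued feature so that the split is binary, as in CART.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def partition(values, labels):
    """Group the class labels by the value of a discrete feature."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())


def information_gain(values, labels):
    """ID3-style criterion: entropy before the split minus weighted entropy after."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in partition(values, labels))
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """C4.5-style criterion: information gain normalized by the split information."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


def gini_gain(values, labels):
    """CART-style criterion: reduction in Gini impurity achieved by the split."""
    n = len(labels)
    remainder = sum(len(g) / n * gini(g) for g in partition(values, labels))
    return gini(labels) - remainder


# Toy, hypothetical data: one discretized color feature and four labelled tuples.
color = ["orange", "orange", "blue", "blue"]
concept = ["Sunset", "Orange", "Sky", "Sky"]
print(information_gain(color, concept), gain_ratio(color, concept), gini_gain(color, concept))
```

In this toy case the color feature separates Sky from the rest but cannot tell Sunset from Orange, which mirrors the color-histogram example given earlier.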

Regardless of their individual advantages and disadvantages, all of these algorithms generate a tree with only one class (or one value, for regression trees) in each leaf node. In some situations, for example True/False classification problems, assigning only one class to each tuple is adequate, but for other problems it is not. This is especially challenging in SBIR because an image can cover more than one concept simultaneously. For example, assigning just one class among Truck, Road and Sunset to Fig. 1 would be unfair; single-labeling, that is, assigning only one class to each feature vector, is not satisfactory. The problem can be handled to some extent by Region-Based Image Retrieval (RBIR) [7], but we still cannot determine how much a region belongs to a particular class. We can only say crisply whether or not an image or a region covers a concept. Furthermore, segmentation is very hard to do.

Chen et al. [12] proposed an approach for constructing a DT that is able to deal with hierarchical class labels. They named it the Hierarchical Decision Tree (HDT) and achieved a higher accuracy rate for data classification. In their approach, the final concepts are arranged hierarchically, and by using a new measure for selecting features and splitting the DT, they obtained a more accurate DT. Further details are given in Section 2.

Although they used a hierarchical organization of semantic concepts and achieved better performance than with a linear (flat) organization, their work is still a single-labeling approach.

In this paper, we combine classification and regression in the DT construction process, based on HDT, to achieve multi-labeling image annotation. Multi-labeling image annotation provides more than one class label for every single tuple, where a tuple corresponds to an image or a region. In our approach, instead of choosing only one concept in each DT leaf, we consider all concepts and their ranks. What we do is classification, because we have discrete semantic classes in the final nodes, and it is also regression, because we assign continuous values to these discrete classes as their ranks. The rank of a concept is influenced by its level of detail, according to its location in the hierarchical network of semantic concepts, and by the degree of purity it provides. Thus, every tuple is linked to all concepts through their corresponding ranks, and the system can determine how much an image (or a region) covers a concept.
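The idea of keeping every concept and its rank in a leaf can be illustrated with a small sketch. The rank formula here is an assumption made only for illustration, since the text above states that rank depends on purity and on the level of detail in the hierarchy but does not give the formula: rank is taken as purity (the share of the leaf's training tuples that fall under a concept) multiplied by the concept's relative depth. The hierarchy, the names `PARENT` and `leaf_ranks`, and the data are hypothetical.

```python
from collections import Counter

# Hypothetical child -> parent hierarchy (not the SAIAPR-TC12 ontology).
PARENT = {
    "Vehicle": "Entity",
    "Truck": "Vehicle",
    "Landscape": "Entity",
    "Road": "Landscape",
    "Sunset": "Landscape",
}


def ancestors_and_self(concept):
    """Return the concept followed by its ancestors up to the root."""
    chain = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def depth(concept):
    """Number of edges between the concept and the root of the hierarchy."""
    return len(ancestors_and_self(concept)) - 1


def leaf_ranks(leaf_labels):
    """Return {concept: rank} for every concept supported by the leaf's tuples.

    Assumed formula (illustrative only): rank = purity * depth / max_depth.
    """
    n = len(leaf_labels)
    max_depth = max(depth(c) for c in PARENT)
    support = Counter()
    for label in leaf_labels:
        # A tuple labelled Truck also counts as evidence for Vehicle and Entity.
        for concept in ancestors_and_self(label):
            support[concept] += 1
    return {c: (count / n) * (depth(c) / max_depth) for c, count in support.items()}


# Labels of the training tuples that reached one hypothetical leaf.
ranks = leaf_ranks(["Truck", "Truck", "Road", "Sunset"])
for concept, rank in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{concept}: {rank:.2f}")
```

A single-labeling tree would keep only the top-ranked concept of this leaf, whereas the multi-labeling leaf keeps the whole mapping, so Road and Sunset remain retrievable with lower ranks.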

The rest of this paper is organized as follows. Section 2 describes how to build a DT from data with hierarchical structures. Section 3 discusses the proposed system and its components. Section 4 compares the results of the multi-labeling and single-labeling approaches, and finally Section 5 briefly concludes the paper.

Section snippets

DT building process

To achieve multi-labeling annotation, an algorithm able to determine more than one class for an image (or a region) is needed. This can be considered from two points of view: either the labels are totally separate, or there is a relationship among them. In the second case, when they are not completely unrelated, a hierarchical structure can be used to represent them and their relationships.
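As a minimal sketch of the second case, a hierarchy of labels can be stored as a child-to-parent map, and the relationship between two labels can be read off as their lowest common ancestor. The concept names and function names below are hypothetical and serve only to illustrate the data structure.

```python
# Hypothetical label hierarchy stored as child -> parent.
PARENT = {
    "Truck": "Vehicle",
    "Car": "Vehicle",
    "Vehicle": "Entity",
    "Sunset": "Landscape",
    "Landscape": "Entity",
}


def path_to_root(concept):
    """List the concept and all of its ancestors, ending at the root."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def lowest_common_ancestor(a, b):
    """Return the most specific concept that covers both labels, or None."""
    seen = set(path_to_root(a))
    for node in path_to_root(b):
        if node in seen:
            return node
    return None


# Truck and Car are closely related; Truck and Sunset meet only at the root.
print(lowest_common_ancestor("Truck", "Car"))
print(lowest_common_ancestor("Truck", "Sunset"))
```

A flat (linear) organization would treat Truck, Car and Sunset as equally unrelated, which is the limitation the hierarchical structure is meant to remove.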

System description

The overall diagram of the proposed system can be found in Fig. 5. The system can be divided into two parts. The first part is responsible for building a DT and comprises, in order, data normalization, data discretization, preparing the ontology and building the DT according to the proposed approach. Calculating the rank of each tuple for each concept, creating indices for each concept and returning results according to the query (a semantic concept) constitute the second part which is the retrieval

Performance evaluation

The system works offline, so the annotation and retrieval processes are completely separate. In the first step, some images are used as the training set; after the DT is mined, the annotation process is carried out and the semantic ranks for all images are stored in a database. When a user looks for a particular concept, they only have to select it from a list. The system then returns the images with the highest ranks for the selected concept.
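The offline/online separation described above can be sketched as a per-concept index that is built once from the stored ranks and then queried by concept. This is only an illustrative sketch: the in-memory dictionary stands in for the database, and the names `build_index` and `retrieve`, the image identifiers and the rank values are assumptions.

```python
from collections import defaultdict


def build_index(image_ranks):
    """Offline step: turn {image_id: {concept: rank}} into
    {concept: [(image_id, rank), ...]} sorted by descending rank."""
    index = defaultdict(list)
    for image_id, ranks in image_ranks.items():
        for concept, rank in ranks.items():
            index[concept].append((image_id, rank))
    for postings in index.values():
        postings.sort(key=lambda pair: pair[1], reverse=True)
    return dict(index)


def retrieve(index, concept, top_k=10):
    """Online step: return the ids of the top_k images for the selected concept."""
    return [image_id for image_id, _ in index.get(concept, [])[:top_k]]


# Hypothetical semantic ranks produced by the annotation step.
stored_ranks = {
    "img1": {"Truck": 0.5, "Road": 0.25},
    "img2": {"Truck": 0.1, "Sunset": 0.9},
    "img3": {"Truck": 0.8},
}
index = build_index(stored_ranks)
print(retrieve(index, "Truck", top_k=2))
```

Because the sorting is done when the index is built, answering a query reduces to a dictionary lookup and a slice.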

One of the biggest challenges in image retrieval systems,

Conclusion

In this paper, multi-labeling image annotation was achieved by combining classification and regression in a DT. To do so, instead of selecting just the one class with the highest rank, all concepts and their ranks were kept in the final leaf nodes. A hierarchical network was employed to represent the relationships among semantic classes. The proposed system overcame the weakness of the single-labeling approach in establishing a trade-off between accuracy and detail. The multi-labeling approach considers all

References (27)

D. Zhang et al., A review on automatic image annotation techniques, Pattern Recognition (2012)
A.M. Tousch et al., Semantic hierarchies for image annotation, Pattern Recognition (2012)
A. Hanbury, A survey of methods for image annotation, Journal of Visual Languages and Computing (2008)
Y. Liu et al., Semantic clustering for region-based image retrieval, Journal of Visual Communication and Image Representation (2009)
Z. Li et al., Fusing semantic aspects for image annotation and retrieval, Journal of Visual Communication and Image Representation (2010)
Y. Liu et al., Region-based image retrieval with high-level semantics using decision tree learning, Pattern Recognition (2008)
L. Hollink et al., Patterns of semantic relations to improve image content search, Web Semantics: Science Services and Agents on the World Wide Web (2007)
H.C. Yang et al., Image semantics discovery from web pages for semantic-based image retrieval using self-organizing maps, Expert Systems with Applications (2008)
Y. Chen et al., Constructing a decision tree from data with hierarchical class labels, Expert Systems with Applications (2009)
H.J. Escalante, The segmented and annotated IAPR TC-12 benchmark, Computer Vision and Image Understanding (2010)
H. Müller et al., Performance evaluation in content-based image retrieval: overview and proposals, Pattern Recognition Letters (2001)
Z. Hong et al., Query expansion by text and image features in image retrieval, Journal of Visual Communication and Image Representation (1998)
Y. Liu et al., A survey of content-based image retrieval with high-level semantics, Pattern Recognition (2007)

