BY-NC-ND 3.0 license Open Access Published by De Gruyter December 7, 2015

Parallel Multi-Class Contour Preserving Classification

  • Piyabute Fuangkhon

Abstract

Serial multi-class contour preserving classification can improve the representation of the contour of the data to improve the levels of classification accuracy for feed-forward neural networks (FFNNs). The algorithm synthesizes fundamental multi-class outpost vectors (FMCOVs) and additional multi-class outpost vectors (AMCOVs) at the decision boundary between consecutive classes of data to narrow the space between the classes. Both FMCOVs and AMCOVs assist the FFNN in placing the hyper-planes in such a way that the data are classified more accurately. However, the technique was designed to utilize only one processor; as a result, its execution time is considerably long. This article presents an improved version of the serial multi-class contour preserving classification that overcomes this time deficiency by utilizing thread-level parallelism to support parallel computing on multi-processor or multi-core systems. The parallel algorithm distributes the data set and the processing of the FMCOV and AMCOV generators over the available threads to increase the CPU utilization and the speedup factors of both generators. The technique has been carefully designed to avoid data dependency issues. The experiments were conducted on both synthetic and real-world data sets. The experimental results confirm that the parallel multi-class contour preserving classification clearly outperforms its serial counterpart in terms of CPU utilization and speedup factor.

MSC 2010: 68T01

1 Introduction

Feed-forward neural network (FFNN) [8, 11, 12] is a widely used artificial neural network that maps a set of input instances onto a set of outputs. It functions by placing hyper-planes at the decision boundary between consecutive classes of data to partition the data belonging to different classes. It is known that the instances located at the decision boundary between consecutive classes of data generally have more impact on the placement of the hyper-planes than the instances located far away from it. However, the placement of the hyper-planes occasionally fails to preserve the concave surfaces (curving inward) and convex surfaces (bulging outward) of the data correctly when the space between consecutive classes of data is large. This can adversely affect the levels of classification accuracy.

Consider the problem in Figure 1A: a two-dimensional three-class synthetic data set comprising three classes of instances that are separable by a two-neuron non-linear classifier. When a two-layer FFNN having two hidden neurons is applied, it may learn to classify this problem as shown in Figure 1B. Lines 1 and 2 are the classifying hyper-planes that represent the two hidden neurons. The shape or distribution model of the problem is not correctly preserved. When a two-layer FFNN having four hidden neurons is applied, a possible solution may be as shown in Figure 1C. Lines 3, 4, 5, and 6 are the classifying hyper-planes that represent the four hidden neurons. However, applying a typical variation of the back-propagation learning algorithm to learn those instances will normally converge to a solution similar to the one shown in Figure 1D. The classifier may classify the data inaccurately, especially for instances located at the decision boundary between consecutive classes of data, because it does not recognize the concave and convex surfaces of the data correctly.

Figure 1: A comparison among possible classifications: (A) A two-dimensional three-class problem. (B) Possible classification with a two-layer perceptron with two hidden neurons. (C) Possible classification with a two-layer perceptron with four hidden neurons. (D) Typical result of a four-hidden-neuron network trained by a variation of the back-propagation algorithm [5].

Serial multi-class contour preserving classification [5] is a technique that can improve the representation of the contour of multi-class data, a set of data having more than two classes, to improve the levels of classification accuracy for FFNN. The algorithm generates two types of multi-class outpost vectors (MCOVs), fundamental multi-class outpost vectors (FMCOVs) and additional multi-class outpost vectors (AMCOVs), from the instances located halfway between consecutive classes of data. Both types of MCOVs are inserted into the data set to narrow the space between consecutive classes of data, thereby improving the representation of the contour of the data and assisting the FFNN to classify the data more accurately. The technique was significantly improved in [7], which reduces the number of FMCOVs and AMCOVs by maintaining only those located at the decision boundary between consecutive classes of data based on the Fuangkhon boundary [6].

This article presents an augmentation of the serial multi-class contour preserving classification [5] that overcomes its time deficiency by utilizing thread-level parallelism to support parallel computing on multi-processor or multi-core systems. The parallel multi-class contour preserving classification distributes the data set and the processing of the FMCOV and AMCOV generators over the available threads to increase the CPU utilization and the speedup factors of both generators. The technique has been carefully designed to avoid data dependency issues. The experiments were conducted on both synthetic and real-world data sets. The experimental results confirm that the speedup factor of the parallel multi-class contour preserving classification over the serial version is proportional to the number of available processors or processor cores. In other words, the speedup factor increases as the number of available processors or processor cores increases.

The article is organized as follows. Section 2 describes research works related to the serial multi-class contour preserving classification. Section 3 briefly introduces the serial multi-class contour preserving classification. Section 4 presents the implementation of the parallel multi-class contour preserving classification. Section 5 shows the runtime complexity of the parallel multi-class contour preserving classification. Section 6 presents the experimental results. Section 7 presents the conclusion.

2 Related Works

Tanprasert et al. [14] proposed a contour preserving classification technique, called the outpost vector model, to preserve the shape or distribution model of two-class data in order to improve the robustness and weight fault tolerance of a neural network applied to a linearly separable problem. The technique augments a set of original instances with a set of outpost vectors that are generated from the original instances and are located at the boundary between the two classes of data. As a result, the learning process is indirectly biased toward distributing the classification workload around the set of hidden neurons, thereby forcing the network to perform nonlinear classification. Outpost vectors have some similarities to support vectors [3]; nevertheless, they are synthesized rather than selected from a set of original instances. The technique was found in [4] to improve the level of classification accuracy on two-class data.

Mongkonsirivatana [10] proposed an improved version of the contour preserving classification technique [14], based on boundary detection, that can reduce the number of instances used to generate outpost vectors. However, the determination of the appropriate value of the key parameter is not defined. As a result, the outpost vectors generated from the selected instances only correspond to the boundary of the shape or distribution model of those selected instances.

Tanprasert and Kripruksawan [13] proposed an alternative way to preserve the contour of old data for a multi-layer perceptron network, called decayed prior sampling (DPS), that uses subtractive clustering [2]. The technique was designed for adaptive learning. However, the bounded range defined by the average distance of each cluster center preserves the contour of the old data in a circular shape only. In addition, the output value of the synthesized vectors is defined by the output value of the cluster center, rather than by the classification knowledge of the supervised neural network from the previous training.

Kaitikunkajorn and Tanprasert [9] proposed an improved synthesis process for the DPS algorithm, called new decayed prior sampling (NDPS), that solves the drawbacks in [13] by calculating the bounded range of each feature in the cluster from the average error of each feature in the cluster, instead of the average distance of each cluster center, to maintain the contour of old data. In addition, the technique feeds the synthesized vector into the neural network to obtain the output value that will be used as the output value of the synthesized vector, instead of using the output value of the cluster center.

Tanprasert et al. [15] proposed an improved version of the DPS algorithm for a multi-layer perceptron network, called modified decayed prior sampling (MDPS). The algorithm allows the existing knowledge to age out at a slow, adaptive rate as a neural network is gradually retrained with consecutive sets of new instances, resembling the change of application locality under a consistent environment. It utilizes the outpost vector model [14] rather than subtractive clustering [13] to clarify the boundary between two classes of instances and assist the neural network to classify the data more accurately. The experimental results confirm that MDPS yields higher levels of generalization than NDPS [9]. However, the technique only maintains the contour of new data because it generates outpost vectors from new instances only. As a result, the contour of old data is not maintained accurately. It is beneficial to include outpost vectors generated from old data in the final data set to maintain the contour of old data.

3 Serial Multi-class Contour Preserving Classification

The serial multi-class contour preserving classification [5] is a technique that can improve the representation of the contour of multi-class data, a set of data having more than two classes, to improve the levels of classification accuracy for FFNN. It generates two types of MCOVs, FMCOVs and AMCOVs, from all original instances at the decision boundary between two consecutive classes of data to improve the representation of the contour of the data and to assist the FFNN to classify a linearly separable problem “nonlinearly”. It is an augmented version of the original contour preserving classification [14] that supports multi-class data. The characteristics of FMCOV and AMCOV are as follows.

FMCOV is a synthesized vector that is used to declare the decision boundary of the territory of an instance of one class against the instances of all other classes. Let us presume an instance i of class A (denoted by A_i) and the instance j of class X (denoted by X_j) that has the smallest Euclidean distance to A_i among all instances of other classes. The FMCOV is placed at the boundary of A_i's territory in the direction of X_j. X_j is designated as the paired vector of A_i (denoted by ϕ(i)) and is mathematically described in Eq. (1).

(1) $\phi(i) = \{\, j \mid \min(d(i, j)),\; i \in A,\; j \in X,\; A \cap X = \emptyset \,\}$,

where i and j are instances and A and X are sets of instances.

A space (denoted by κ) between the boundary of an instance and the center of its paired vector can be inserted to define the gap between an FMCOV and an AMCOV of a different class. It is intended to provide a small clearance for placing the hyper-planes. It is mathematically described in Eq. (2).

(2) $\kappa = \{\, x \mid 0 < x \leq 1 \,\}$.

The location of dimension s of an FMCOV of instance i (denoted by o(i)_s) is mathematically described in Eq. (3).

(3) $o(i)_s = \begin{cases} i_s - \frac{|i_s - \phi(i)_s|}{2} \times (1 - \kappa) \mid s \in \{1, 2, 3, \ldots, d\}, & \text{if } i_s \geq \phi(i)_s \\ i_s + \frac{|i_s - \phi(i)_s|}{2} \times (1 - \kappa) \mid s \in \{1, 2, 3, \ldots, d\}, & \text{if } i_s < \phi(i)_s \end{cases}$,

where i is an instance, i_s is the value of the dimension s of an instance i, ϕ(i)_s is the value of the dimension s of the paired vector of an instance i, and d is the number of dimensions of an instance.

An FMCOV of instance i (denoted by o(i)) is mathematically described in Eq. (4). It is placed at the boundary of A_i's territory in the direction of X_j.

(4) $o(i) = \{\, o(i)_s \mid s \in \{1, 2, 3, \ldots, d\} \,\}$,

where i is an instance and d is the number of dimensions of an instance.
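
To make the pair search of Eq. (1) and the FMCOV placement of Eqs. (3) and (4) concrete, the following is a minimal sketch in C#, the implementation language reported in Section 6. All identifiers (Instance, Mcov, SquaredDistance, PairedVector, GenerateFmcov) are illustrative assumptions, not names from the authors' implementation.

using System;
using System.Collections.Generic;
using System.Linq;

public class Instance
{
    public double[] Features;  // the d feature values of the instance
    public int Label;          // the class of the instance

    public Instance(double[] features, int label)
    {
        Features = features;
        Label = label;
    }
}

public static class Mcov
{
    // Squared Euclidean distance; the square root of Eq. (8) is omitted
    // because it does not change which instance is nearest.
    public static double SquaredDistance(Instance p, Instance q)
    {
        double sum = 0.0;
        for (int s = 0; s < p.Features.Length; s++)
        {
            double diff = p.Features[s] - q.Features[s];
            sum += diff * diff;
        }
        return sum;
    }

    // Eq. (1): the paired vector phi(i) is the nearest instance of any other class.
    public static Instance PairedVector(Instance i, IList<Instance> data)
    {
        return data.Where(j => j.Label != i.Label)
                   .OrderBy(j => SquaredDistance(i, j))
                   .First();
    }

    // Eqs. (3)-(4): place the FMCOV of i almost halfway toward phi(i),
    // pulled back by the clearance parameter kappa (0 < kappa <= 1).
    public static Instance GenerateFmcov(Instance i, Instance phi, double kappa)
    {
        int d = i.Features.Length;
        double[] o = new double[d];
        for (int s = 0; s < d; s++)
        {
            double step = Math.Abs(i.Features[s] - phi.Features[s]) / 2.0 * (1.0 - kappa);
            // Move toward phi(i): subtract when i_s >= phi(i)_s, add otherwise.
            o[s] = i.Features[s] >= phi.Features[s] ? i.Features[s] - step
                                                    : i.Features[s] + step;
        }
        return new Instance(o, i.Label);  // an FMCOV belongs to the class of i
    }
}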

AMCOV is a synthesized vector that is used to declare the decision boundary of the paired vector of an instance. Let us presume the paired vector of an instance i of class A (ϕ(i)) against that instance i (A_i). The AMCOV is placed at the boundary of ϕ(i)'s territory, called the counter-boundary, in the direction of A_i. The radius of each dimension of the paired vector of instance i (denoted by r̄(i)_s) used to generate an AMCOV is mathematically computed by Eq. (5).

(5) $\bar{r}(i)_s = \left\{\, \frac{|\phi(i)_s - \phi(\phi(i))_s|}{2} \,\middle|\, s \in \{1, 2, 3, \ldots, d\} \,\right\}$,

where i is an instance and d is the number of dimensions of an instance.

Hence, an AMCOV of instance i (denoted by o′(i)) is mathematically described in Eq. (6). It is placed at the boundary of ϕ(i)’s territory in the direction of Ai . A space (κ) in Eq. (2) is also used to leave some space between an FMCOV and the AMCOV being generated. Many paired vectors may be generated at the same point in the problem space. To reduce the number of duplicated AMCOVs, an AMCOV of an instance will be generated only when a paired vector of a paired vector of that instance is different from that instance.

(6) $o'(i) = \begin{cases} \phi(i)_s + \bar{r}(i)_s - \kappa \mid s \in \{1, 2, 3, \ldots, d\}, & \text{if } i_s \geq \phi(i)_s \text{ and } \phi(\phi(i)) \neq i \\ \phi(i)_s - \bar{r}(i)_s + \kappa \mid s \in \{1, 2, 3, \ldots, d\}, & \text{if } i_s < \phi(i)_s \text{ and } \phi(\phi(i)) \neq i \end{cases}$,

where i is an instance, i_s is the value of the dimension s of i, ϕ(i)_s is the value of the dimension s of the paired vector of i, r̄(i)_s is the value of the dimension s of the radius of the territory of the paired vector ϕ(i), ϕ(ϕ(i)) is the paired vector of the paired vector of i, and d is the number of dimensions of an instance.
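
Continuing the sketch above, the AMCOV of Eqs. (5) and (6) can be generated as follows. The helper names remain illustrative, and the signs of the κ clearance terms follow the reconstruction of Eq. (6) given above, so they should be read as an assumption rather than the authors' exact formula.

using System;

public static class Amcov
{
    // Eq. (5): per-dimension radius of phi(i)'s territory, i.e. half the
    // distance between phi(i) and its own paired vector phi(phi(i)).
    public static double[] Radius(Instance phi, Instance phiOfPhi)
    {
        int d = phi.Features.Length;
        double[] r = new double[d];
        for (int s = 0; s < d; s++)
            r[s] = Math.Abs(phi.Features[s] - phiOfPhi.Features[s]) / 2.0;
        return r;
    }

    // Eq. (6): place the AMCOV on phi(i)'s boundary in the direction of i,
    // with kappa leaving a small clearance; returns null in the duplicate
    // case phi(phi(i)) == i, which Eq. (6) excludes.
    public static Instance GenerateAmcov(Instance i, Instance phi, Instance phiOfPhi, double kappa)
    {
        if (ReferenceEquals(phiOfPhi, i)) return null;
        double[] r = Radius(phi, phiOfPhi);
        int d = phi.Features.Length;
        double[] o = new double[d];
        for (int s = 0; s < d; s++)
            o[s] = i.Features[s] >= phi.Features[s] ? phi.Features[s] + r[s] - kappa
                                                    : phi.Features[s] - r[s] + kappa;
        return new Instance(o, phi.Label);  // an AMCOV belongs to the class of phi(i)
    }
}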

The concept of the three-class outpost vector is illustrated in Figure 2. There are three classes of data designated as classes A, B, and C. To find the territory of each instance, each instance is modeled to span its territory as a circle (a sphere in three-dimensional space or a hyper-sphere in higher-dimensional space) until the territories collide with one another. The territory of instance k of class A, denoted by A_k, is found by locating the instance in any other class that is nearest to A_k. In this case, B*(A_k) of class B is nearest to A_k and is referred to as A_k's pair. Then, the territory of A_k is declared as halfway between A_k and B*(A_k). Consequently, the radius of A_k's territory is set at half of the distance between A_k and B*(A_k). This guarantees that if B*(A_k) sets its territory using the same radius, the distance from the hyper-plane to either A_k or B*(A_k) will be at maximum. A_k then places its FMCOV against B*(A_k) at the decision boundary of A_k's territory. The territories of B*(A_k) of class B, C*(B*(A_k)) of class C, A_j of class A, and B*(A_j) of class B are found by the same method used for A_k of class A. After that, AMCOVs are generated from all instances as well. The AMCOV of B*(A_k) of class B against A_k of class A is placed at the decision boundary of B*(A_k)'s territory in the direction of A_k. The AMCOV of C*(B*(A_k)) of class C against B*(A_k) of class B is placed at the decision boundary of C*(B*(A_k))'s territory in the direction of B*(A_k). The AMCOV of A_j of class A against C*(B*(A_k)) of class C is placed at the decision boundary of A_j's territory in the direction of C*(B*(A_k)). The AMCOV of B*(A_j) of class B against A_j of class A is placed at the decision boundary of B*(A_j)'s territory in the direction of A_j.

Figure 2: FMCOVs, AMCOVs, and Instance's Territory in a Two-Dimensional Three-Class Problem.

In Figure 2, it is necessary to clarify the differences between the boundary of an instance and the counter-boundary of an instance.

  • The boundary of an instance is a circle (a sphere in three-dimensional space or a hyper-sphere in higher-dimensional space) that identifies the outermost territory of that instance for placing its FMCOV and AMCOV.

    • An FMCOV will be synthesized on this boundary between A_i and ϕ(i) on the territory of A_i.

    • An AMCOV will be synthesized on this boundary between A_i and ϕ(i) on the territory of ϕ(i).

  • The counter-boundary of a paired vector identifies the outermost territory of the paired vector against the outermost territory of an instance in the opposite direction. It is not used to place any MCOV but to visualize the boundary of a paired vector against the boundary of an instance in the opposite direction.

Figure 3 illustrates how FMCOVs and AMCOVs are integrated to produce the final class boundary of the data set. Figure 3A illustrates a non-overlapping four-class synthetic data set comprising four sine waves representing four classes of data, designated as red, green, blue, and magenta points in a two-dimensional Euclidean space and determined by two coordinates, x and y. Figure 3B illustrates the synthesized FMCOVs in solid color at the new class boundary between consecutive classes of data. It is noticeable that the FMCOVs shift the original class boundary of one class toward the opposite class. They reduce the space between consecutive classes of data to assist the FFNN to place the hyper-plane in such a way that the contour of the data is better preserved. However, some areas on the new class boundary may have a very low population of FMCOVs. Such an area still allows the FFNN to place a hyper-plane that might not preserve the contour of the data accurately, leading to a higher misclassification rate. This is where the AMCOVs come into play. Figure 3C illustrates the synthesized AMCOVs in solid color at the new class boundary between consecutive classes of data. AMCOVs also shift the original class boundary of one class toward the opposite class, and they also reduce the space between consecutive classes of data. AMCOVs occasionally fill the space where the population of FMCOVs is low, and vice versa. As a result, the integration of FMCOVs and AMCOVs helps strengthen the new class boundary to ensure the maximum assistance toward the hyper-plane placement process.

Figure 3: A comparison among various final training sets: (A) An original data set. (B) An original data set with synthesized FMCOVs. (C) An original data set with synthesized AMCOVs. (D) An original data set with synthesized FMCOVs and AMCOVs [6].

4 Parallel Multi-Class Contour Preserving Classification

This section presents the parallel multi-class contour preserving classification, an improved version of the serial multi-class contour preserving classification [5] that utilizes thread-level parallelism to offload the processing of the FMCOV generation algorithm and the AMCOV generation algorithm from a single processor to all available processors or processor cores. This significantly reduces the total execution time, especially on a shared-memory multi-core computer. The parallel FMCOV generation algorithm (FMCOV generator) is presented in Algorithm 1.

  • In Line 1, two global inputs are defined: the original data set (denoted by T) and the number of available threads (denoted by Tr). Tr should be no larger than the number of threads that the computer can execute concurrently; otherwise, some threads will be idle.

  • In Line 2, a global output is defined: a set of FMCOVs (denoted by TF).

  • In Line 3, this parallel for-loop offloads the processing of the FMCOV generator from a single thread to all available threads as defined by Tr in Line 1.

  • In Line 4, the original data set (T) is divided into Tr partial data sets. Each partial data set (T_t) is exclusively processed by one and only one thread (t). This for-loop generates FMCOVs (denoted by o(i)) from the partial data set (T_t) of thread t and stores them in thread t's unshared local subset (S_tf).

  • In Line 5, the algorithm identifies the paired instance of i (denoted by ϕ(i)), which is the instance in any class other than the class of i that has the minimum Euclidean distance to i.

  • In Line 6, the FMCOV of i (o(i)) is generated at almost halfway between i and ϕ(i), determined by κ, at the boundary of i on the territory of i in the direction of ϕ(i).

  • In Line 7, the FMCOV (o(i)) is added into thread t's unshared local subset (S_tf).

  • In Line 8, this is the end of the for-loop begun in Line 4. All FMCOVs (o(i)) of a partial data set (T_t) have been added into thread t's unshared local subset (S_tf).

  • In Line 9, all FMCOVs (o(i)) in thread t's unshared local subset (S_tf) are added into the set of FMCOVs (TF).

  • In Line 10, this is the end of the parallel for-loop begun in Line 3. The FMCOVs in the unshared local subsets (S_tf) from all threads have been added into the set of FMCOVs (TF).

The parallel AMCOV generation algorithm (AMCOV generator) is presented in Algorithm 2.

  • In Line 1, two global inputs are defined: the original data set (denoted by T) and the number of available threads (denoted by Tr). Tr should be no larger than the number of threads that the computer can execute concurrently; otherwise, some threads will be idle.

  • In Line 2, a global output is defined: a set of AMCOVs (denoted by TA).

  • In Line 3, this parallel for-loop offloads the processing of the AMCOV generator from a single thread to all available threads as defined by Tr in Line 1.

  • In Line 4, the original data set (T) is divided into Tr partial data sets. Each partial data set (T_t) is exclusively processed by one and only one thread (t). This for-loop generates AMCOVs (denoted by o′(i)) from the partial data set (T_t) of thread t and stores them in thread t's unshared local subset (S_ta).

  • In Line 5, the algorithm verifies that the paired instance of the paired instance of i (denoted by ϕ(ϕ(i))) is not the same as i before generating the AMCOV of i (o′(i)).

  • In Line 6, the AMCOV of i (o′(i)) is generated at almost halfway between i and ϕ(i), determined by κ, at the boundary of ϕ(i) on the territory of ϕ(i) in the direction of i.

  • In Line 7, the AMCOV (o′(i)) is added into thread t's unshared local subset (S_ta).

  • In Line 8, this is the end of the paired instance verification begun in Line 5.

  • In Line 9, this is the end of the for-loop begun in Line 4. All AMCOVs (o′(i)) of a partial data set (T_t) have been added into thread t's unshared local subset (S_ta).

  • In Line 10, all AMCOVs (o′(i)) in thread t's unshared local subset (S_ta) are added into the set of AMCOVs (TA).

  • In Line 11, this is the end of the parallel for-loop begun in Line 3. The AMCOVs in the unshared local subsets (S_ta) from all threads have been added into the set of AMCOVs (TA).

After a set of FMCOVs (TF) from Algorithm 1 and a set of AMCOVs (TA) from Algorithm 2 are completely generated, they are combined by UNION to form a final training set. In practice, a set of FMCOVs (TF) must be generated prior to a set of AMCOVs (TA). The MCOV generation algorithm is divided into two algorithms, FMCOV generation algorithm and AMCOV generation algorithm (as shown in Algorithms 1 and 2), for ease of understanding.

Algorithm 1

Parallel FMCOV generation algorithm.

1: {input: the original data set (T), the available threads (Tr)}
2: {output: a set of FMCOVs (TF)}
3: for (parallel) each thread t ∈ Tr do
4:  for each instance i ∈ T_t, where T_t ⊂ T, do
5:   find the paired instance ϕ(i) ∉ class(i) that has the minimum Euclidean distance to i.
6:   generate an FMCOV o(i) ∈ class(i) at almost halfway between i and ϕ(i), determined by κ, on the territory of i in the direction of ϕ(i).
7:   add o(i) into an unshared local subset (S_tf).
8:  end for
9:  add S_tf into TF.
10: end for
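
For concreteness, the following is a minimal C# sketch of Algorithm 1 built on Parallel.For from System.Threading.Tasks, the library used in Section 6. The contiguous partitioning of T into T_t, the ParallelOptions setting, and the reuse of the hypothetical Instance and Mcov helpers from Section 3 are illustrative choices, not details given in the paper.

using System.Collections.Generic;
using System.Threading.Tasks;

public static class FmcovGenerator
{
    public static List<Instance> ParallelFmcov(IList<Instance> T, int Tr, double kappa)
    {
        var TF = new List<Instance>();        // the shared output set of FMCOVs
        object mergeLock = new object();
        var options = new ParallelOptions { MaxDegreeOfParallelism = Tr };

        // Line 3: one parallel iteration per thread t in 0..Tr-1.
        Parallel.For(0, Tr, options, t =>
        {
            var Stf = new List<Instance>();   // thread t's unshared local subset S_tf

            // Line 4: thread t reads only its own contiguous partition T_t of T.
            int begin = t * T.Count / Tr;
            int end = (t + 1) * T.Count / Tr;
            for (int k = begin; k < end; k++)
            {
                Instance i = T[k];
                Instance phi = Mcov.PairedVector(i, T);      // Line 5: pair search
                Stf.Add(Mcov.GenerateFmcov(i, phi, kappa));  // Lines 6-7: FMCOV
            }

            // Line 9: merge the local subset into TF. The lock is the only
            // shared-state synchronization; the loop above writes no shared data.
            lock (mergeLock) { TF.AddRange(Stf); }
        });
        return TF;
    }
}
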
Algorithm 2

Parallel AMCOV generation algorithm.

1: {input: the original data set (T), the available threads (Tr)}
2: {output: a set of AMCOVs (TA)}
3: for (parallel) each thread t ∈ Tr do
4:  for each instance i ∈ T_t, where T_t ⊂ T, do
5:   if ϕ(ϕ(i)) ≠ i, then
6:    generate an AMCOV o′(i) ∈ class(ϕ(i)) at almost halfway between i and ϕ(i), determined by κ, on the territory of ϕ(i) in the direction of i.
7:    add o′(i) into an unshared local subset (S_ta).
8:   end if
9:  end for
10:  add S_ta into TA.
11: end for
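
A matching sketch of Algorithm 2, under the same illustrative assumptions, differs only in the per-instance body (the ϕ(ϕ(i)) ≠ i guard of Line 5) and the merge target TA:

using System.Collections.Generic;
using System.Threading.Tasks;

public static class AmcovGenerator
{
    public static List<Instance> ParallelAmcov(IList<Instance> T, int Tr, double kappa)
    {
        var TA = new List<Instance>();        // the shared output set of AMCOVs
        object mergeLock = new object();
        var options = new ParallelOptions { MaxDegreeOfParallelism = Tr };

        Parallel.For(0, Tr, options, t =>     // Line 3
        {
            var Sta = new List<Instance>();   // thread t's unshared local subset S_ta
            int begin = t * T.Count / Tr;
            int end = (t + 1) * T.Count / Tr;
            for (int k = begin; k < end; k++) // Line 4: partition T_t
            {
                Instance i = T[k];
                Instance phi = Mcov.PairedVector(i, T);
                Instance phiOfPhi = Mcov.PairedVector(phi, T);
                if (!ReferenceEquals(phiOfPhi, i))                         // Line 5
                    Sta.Add(Amcov.GenerateAmcov(i, phi, phiOfPhi, kappa)); // Lines 6-7
            }
            lock (mergeLock) { TA.AddRange(Sta); }  // Line 10: merge into TA
        });
        return TA;
    }
}

The final training set is then the union of the original instances and both MCOV sets, for example T.Concat(TF).Concat(TA).ToList() with System.Linq, with the FMCOVs generated before the AMCOVs as noted above.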

5 Runtime Complexity

The runtime complexity of parallel MCOV generation algorithm (Tpmcov(n)) comprises the runtime complexity of parallel FMCOV generation algorithm (Tpfmcov(n)) in Algorithm 1 and the runtime complexity of parallel AMCOV generation algorithm (Tpamcov(n)) in Algorithm 2 as described in Eq. (7).

(7) $T_{pmcov}(n) = T_{pfmcov}(n) + T_{pamcov}(n)$,

where n is the number of instances in all classes.

The FMCOVs and AMCOVs are generated from the algorithms presented in Algorithms 1 and 2 using the Euclidean distance function as described in Eq. (8) on a set of instances. The runtime complexity of the Euclidean distance is described in Eq. (9).

(8) $d(p, q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + \cdots + (p_d - q_d)^2} = \sqrt{\sum_{i=1}^{d} (p_i - q_i)^2}$,

where p and q are two points in Euclidean space, pi and qi are the value of the feature i of instances p and q, respectively, and d is the number of features of the point.

(9) $T_{d}(p, q) = O\left(\sqrt{\sum_{i=1}^{d} (p_i - q_i)^2}\right) = O(d \times z) = O(d)$,

where p and q are two points in Euclidean space, pi and qi are the value of the feature i of instances p and q, respectively, d is the number of features of the point, and z is a constant.

The runtime complexity of the parallel FMCOV generation algorithm (Tpfmcov(n)) is described in Eq. (10).

(10) $T_{pfmcov}(n) = O\left(\sum_{z=1}^{c} \left(n_z \times \left(\sum_{z=1}^{c}(n_z) - n_z\right)\right) \times T_d(p, q) \times e\right) = O\left(\sum_{z=1}^{c} \left(n_z \times (cn - n_z)\right) \times d \times e\right) = O(c(c-1)n^2 \times d \times e) = O(c^2 n^2) = O(n^2)$,

where nz is the number of instances in class z, n is the number of instances in all classes, c is the number of classes of data, d is the number of features of an instance, and e is the number of processors or processor cores.

The runtime complexity of the parallel AMCOV generation algorithm (Tpamcov(n)) is described in Eq. (11).

(11) $T_{pamcov}(n) = O\left(\sum_{z=1}^{c} n_z \times T_d(p, q) \times e\right) = O(cn \times d \times e) = O(n)$,

where nz is the number of instances in class z, n is the number of instances in all classes, c is the number of classes of data, d is the number of features of an instance, and e is the number of processors or processor cores.

As a result, the runtime complexity of the parallel MCOV generation algorithm (Tpmcov(n)) described in Eq. (7) is formulated as Eq. (12).

(12) $T_{pmcov}(n) = T_{pfmcov}(n) + T_{pamcov}(n) = O(n^2 + n) = O(n^2)$,

where n is the number of instances in all classes.

Although the runtime complexity of the parallel MCOV generation algorithm is quadratic in the number of instances, the actual amount of execution time spent to generate all MCOVs is inversely proportional to the number of available processors or processor cores, as demonstrated in Section 6. In other words, the execution time decreases as the number of available processors or processor cores increases.

6 Experimental Results

The experiments were conducted to present an improvement of the CPU utilization and the speedup factors of the parallel FMCOV generation algorithm over the serial FMCOV generation algorithm and the parallel AMCOV generation algorithm over the serial AMCOV generation algorithm. All algorithms were implemented in Microsoft Visual C# 2013 using the following libraries:

  • System.Threading.Tasks was required for the parallel for-loop (Parallel.For).

  • System.Diagnostics was required for execution time measurement with Stopwatch.

A 1-thread process was used to evaluate the CPU utilization and the execution time of the original serial MCOV generation algorithm. A 2-thread process, a 4-thread process, an 8-thread process, and a 16-thread process were used to evaluate the CPU utilization and the execution time of the parallel MCOV generation algorithm. The machine used to conduct the experiments was powered by Intel® Core i7 4770k processor having four processor cores with Intel® Hyper-Threading technology.
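
As an illustration of how these measurements can be taken, the following is a minimal timing-harness sketch using Stopwatch from System.Diagnostics; the FmcovGenerator.ParallelFmcov method, the data set T, and kappa are the hypothetical ones sketched in Section 4 and are assumed to be in scope.

using System;
using System.Diagnostics;

// Hypothetical harness: T (the data set) and kappa are assumed to be defined.
foreach (int tr in new[] { 1, 2, 4, 8, 16 })
{
    Stopwatch sw = Stopwatch.StartNew();   // high-resolution timer from System.Diagnostics
    var TF = FmcovGenerator.ParallelFmcov(T, tr, kappa);
    sw.Stop();
    Console.WriteLine("FMCOV with " + tr + " thread(s): "
                      + sw.Elapsed.TotalMilliseconds.ToString("F2") + " ms");
}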

The following groups of data sets were used to conduct the experiments. Their characteristics are presented in Table 1.

  • A non-overlapping four-class synthetic data set comprises four sine waves representing four classes of data, designated as red, green, blue, and magenta points in a two-dimensional Euclidean space and determined by two coordinates, x and y. This data set consists of 3200 instances from each class of data, which constitutes a total of 12,800 instances.

  • Eight highly overlapping two-class synthetic data sets, selected from the ELENA project [16], have a heavy intersection of the class distributions, a high degree of nonlinearity of the class boundaries, and various dimensions of the vectors. These data sets are Clouds, Gaussian 2D, Gaussian 3D, Gaussian 4D, Gaussian 5D, Gaussian 6D, Gaussian 7D, and Gaussian 8D.

  • Six real-world data sets, selected from the UCI machine learning repository [1], have a heavy intersection of the class distributions, a high degree of nonlinearity of the class boundaries, and various dimensions of the vectors. These data sets are Adult Income, Statlog (Landsat Satellite), Statlog (Shuttle Landing Control), Forest Cover Type, Pen-Based Recognition of Handwritten Digits, and Optical Recognition of Handwritten Digits.

Table 1

Characteristics of All Data Sets.

Data set                     Attribute characteristics   Classes   Dimensions   Instances
Sine 3200                    Integer                     4         2            12,800
Clouds                       Float                       2         2            3750
Gaussian 2D                  Float                       2         2            3750
Gaussian 3D                  Float                       2         3            3750
Gaussian 4D                  Float                       2         4            3750
Gaussian 5D                  Float                       2         5            3750
Gaussian 6D                  Float                       2         6            3750
Gaussian 7D                  Float                       2         7            3750
Gaussian 8D                  Float                       2         8            3750
Adult Income                 Integer/categorical         2         14           32,561
Forest Cover Type            Integer/categorical         7         54           58,104
Statlog (Landsat Satellite)  Integer                     6         36           4435
Statlog (Shuttle Landing)    Integer                     7         9            43,500
Pen-Based Recognition        Integer                     10        16           7494
Optical Recognition          Integer                     10        64           3823

In terms of CPU utilization, the CPU utilization was recorded during the execution of the parallel MCOV generation algorithm. The 1-thread process, the 2-thread process, the 4-thread process, the 8-thread process, and the 16-thread process utilized 13%, 25%, 50%, 99%, and 99% of the CPU time on all data sets, respectively. As mentioned earlier, the Intel® Core i7-4770K processor with Hyper-Threading technology has only four processor cores and can execute up to 8 threads simultaneously. When the 8-thread process and the 16-thread process ran, the CPU utilization reached its maximum (99% of the CPU utilized). It can be concluded that the CPU utilization increases as the number of threads increases.

In terms of execution time, Tables 2 and 3 present the execution time in milliseconds of the parallel FMCOV generation algorithm and the parallel AMCOV generation algorithm on all data sets, respectively. It is noticeable that the reduction of the execution time of both algorithms is proportional to the number of threads in the process. In other words, the execution time decreases as the number of available processors or processor cores increases. However, the execution time is only slightly reduced when the number of threads is increased from 8 to 16. As mentioned earlier, the Intel® Core i7-4770K processor has only four processor cores that can operate 4 threads simultaneously. When there are more than 4 threads, Intel® Hyper-Threading technology comes into play. This technology allows each processor core to interleave 2 threads operating on the same processor core; as a result, 8 threads can be operated simultaneously. However, this technology shares the floating-point unit (FPU) between the two threads operating on the same processor core. When the FPU of a processor core is being used by one thread to execute the Euclidean distance function or another floating-point instruction, the other thread assigned to that processor core is stalled waiting for the FPU to become available. Moreover, when the number of threads exceeds the maximum number of threads that the CPU can handle simultaneously, the overhead of thread switching can adversely affect the execution time of both the parallel FMCOV generation algorithm and the parallel AMCOV generation algorithm. For clearer comparison, Tables 4 and 5 present the speedup factors of the parallel FMCOV generation algorithm over the serial FMCOV generation algorithm and of the parallel AMCOV generation algorithm over the serial AMCOV generation algorithm, respectively, on all data sets.
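
The speedup factors in Tables 4 and 5 are consistent with the standard definition, the ratio of the single-thread execution time to the p-thread execution time. For example, for the FMCOV generator on the Sine 3200 data set with eight threads (Tables 2 and 4):

$S_p = \frac{T_1}{T_p}, \qquad S_8 = \frac{20{,}381.83 \text{ ms}}{3823.13 \text{ ms}} \approx 5.33.$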

Table 2

Execution Time, in Milliseconds, of the Serial and the Parallel FMCOV Generation Algorithms.

Data set                Number of threads
                        1               2               4               8               16
Sine 3200               20,381.83       10,267.99       5817.63         3823.13         3850.56
Clouds                  1284.29         647.73          369.83          278.52          238.42
Gaussian 2D             1303.44         688.03          361.48          242.45          242.82
Gaussian 3D             1673.13         833.37          485.37          301.28          346.50
Gaussian 4D             2077.87         1071.87         570.49          427.87          407.12
Gaussian 5D             2523.14         1227.23         673.88          527.30          456.70
Gaussian 6D             2849.34         1485.77         823.27          614.42          549.34
Gaussian 7D             3248.84         1666.25         930.75          619.61          619.92
Gaussian 8D             3645.36         1845.91         1047.36         700.43          678.43
Adult Income            268,833.89      135,875.36      74,396.34       50,077.98       49,797.64
Forest Cover Type       3,833,784.29    1,917,245.83    1,076,208.35    754,112.12      756,466.40
Statlog (Landsat)       30,666.56       15,797.63       8700.96         6107.81         5531.93
Statlog (Shuttle)       371,805.28      185,928.46      103,920.99      67,404.82       67,413.37
Pen-Based Recognition   38,905.12       19,477.59       10,882.39       7077.87         7730.72
Optical Recognition     46,749.63       23,381.19       13,256.23       8997.68         8343.39
Table 3

Execution Time, in Milliseconds, of the Serial and the Parallel AMCOV Generation Algorithms.

Data set                Number of threads
                        1        2        4        8        16
Sine 3200               4.22     1.95     1.21     0.83     0.91
Clouds                  1.25     0.64     0.35     0.23     0.26
Gaussian 2D             1.44     0.65     0.42     0.27     0.29
Gaussian 3D             1.86     0.84     0.51     0.38     0.42
Gaussian 4D             2.26     1.07     0.55     0.47     0.45
Gaussian 5D             2.61     1.24     0.76     0.55     0.53
Gaussian 6D             2.94     1.44     0.74     0.58     0.61
Gaussian 7D             3.36     1.64     0.94     0.75     0.70
Gaussian 8D             3.72     1.79     1.11     0.80     0.74
Adult Income            46.41    23.64    12.26    9.73     9.82
Forest Cover Type       236.49   123.22   68.12    61.39    49.43
Statlog (Landsat)       15.04    7.56     4.67     3.63     3.22
Statlog (Shuttle)       39.68    19.82    9.95     8.59     8.85
Pen-Based Recognition   20.70    10.24    6.25     4.48     4.53
Optical Recognition     12.27    6.10     3.65     2.64     2.61
Table 4

Speedup Factors of the Parallel FMCOV Generation Algorithm over the Serial FMCOV Generation Algorithm.

Data set                Number of threads
                        2       4       8       16
Sine 3200               1.98    3.50    5.33    5.29
Clouds                  1.98    3.47    4.61    5.39
Gaussian 2D             1.89    3.61    5.38    5.37
Gaussian 3D             2.01    3.45    5.55    4.83
Gaussian 4D             1.94    3.64    4.86    5.10
Gaussian 5D             2.06    3.74    4.78    5.52
Gaussian 6D             1.92    3.46    4.64    5.19
Gaussian 7D             1.95    3.49    5.24    5.24
Gaussian 8D             1.97    3.48    5.20    5.37
Adult Income            1.98    3.61    5.37    5.40
Forest Cover Type       2.00    3.56    5.08    5.07
Statlog (Landsat)       1.94    3.52    5.02    5.54
Statlog (Shuttle)       2.00    3.58    5.52    5.52
Pen-Based Recognition   2.00    3.58    5.50    5.03
Optical Recognition     2.00    3.53    5.20    5.60
Table 5

Speedup Factors of the Parallel AMCOV Generation Algorithm over the Serial AMCOV Generation Algorithm.

Data set                Number of threads
                        2       4       8       16
Sine 3200               2.16    3.50    5.09    4.66
Clouds                  1.96    3.60    5.44    4.81
Gaussian 2D             2.20    3.43    5.42    4.96
Gaussian 3D             2.21    3.62    4.96    4.49
Gaussian 4D             2.11    4.09    4.83    5.02
Gaussian 5D             2.09    3.45    4.71    4.90
Gaussian 6D             2.04    3.97    5.10    4.83
Gaussian 7D             2.05    3.59    4.48    4.83
Gaussian 8D             2.09    3.36    4.67    5.06
Adult Income            1.96    3.79    4.77    4.72
Forest Cover Type       1.92    3.47    3.85    4.78
Statlog (Landsat)       1.99    3.22    4.15    4.67
Statlog (Shuttle)       2.00    3.99    4.62    4.49
Pen-Based Recognition   2.02    3.31    4.62    4.57
Optical Recognition     2.01    3.36    4.65    4.70

From the experimental results, it can be concluded that the parallel FMCOV generation algorithm and the parallel AMCOV generation algorithm can utilize thread-level parallelism to enable parallel computing on multi-processor or multi-core systems effectively. The speedup factors of the parallel FMCOV generation algorithm over the serial FMCOV generation algorithm and of the parallel AMCOV generation algorithm over the serial AMCOV generation algorithm are proportional to the number of available processors or processor cores. In other words, the speedup factors and the CPU utilization increase as the number of available processors or processor cores increases. The parallel MCOV generation algorithm clearly outperforms the original serial algorithm in terms of CPU utilization and speedup factor.

7 Conclusion

The serial multi-class contour preserving classification helps improve the representation of the contour of the data to improve the levels of classification accuracy for FFNN. The algorithm generates FMCOVs and AMCOVs to narrow the space between consecutive classes of data, to preserve the contour of the data, and to assist the FFNN to classify the data more accurately. The algorithm has been proven able to increase the levels of classification accuracy. However, it was designed to support only uniprocessor systems. This article presents a parallel implementation of the serial multi-class contour preserving classification that overcomes its time deficiency by utilizing thread-level parallelism to support parallel computing on multi-processor or multi-core systems. The technique distributes the data set and the processing of the FMCOV and AMCOV generators over the available threads to increase the CPU utilization and the speedup factors of both generators. The technique has been carefully designed to avoid data dependency issues. The experiments were conducted on both synthetic and real-world data sets. The experimental results confirm that the speedup factors of the parallel FMCOV generation algorithm over the serial FMCOV generation algorithm and of the parallel AMCOV generation algorithm over the serial AMCOV generation algorithm are proportional to the number of available processors or processor cores. In other words, the speedup factor increases as the number of available processors or processor cores increases.


Corresponding author: Piyabute Fuangkhon, Department of Business Information Systems, Assumption University, Samut Prakan 10540, Thailand, e-mail: , .

Bibliography

[1] K. Bache and M. Lichman, UCI machine learning repository, 2014. http://archive.ics.uci.edu/ml. Accessed July 2014.

[2] S. Chiu, Fuzzy model identification based on cluster estimation, J. Intell. Fuzzy Syst. 2 (1994), 267–278. doi:10.3233/IFS-1994-2306.

[3] C. Cortes and V. Vapnik, Support-vector networks, Mach. Learn. 20 (1995), 273–297. doi:10.1007/BF00994018.

[4] P. Fuangkhon, An incremental learning preprocessor for feed-forward neural network, Artif. Intell. Rev. 41 (2014), 183–210. doi:10.1007/s10462-011-9304-0.

[5] P. Fuangkhon and T. Tanprasert, Multi-class contour preserving classification, in: Intelligent Data Engineering and Automated Learning, Lect. Notes Comput. Sci. 7435, pp. 35–42, Springer, Berlin, 2012.

[6] P. Fuangkhon and T. Tanprasert, A training set reduction algorithm for feed-forward neural network using minimum boundary vector distance selection, in: International Conference on Information Science, Electronics and Electrical Engineering, IEEE, Sapporo, Japan, pp. 71–78, 2014. doi:10.1109/InfoSEEE.2014.6948071.

[7] P. Fuangkhon and T. Tanprasert, Reduced multi-class contour preserving classification, Neural Process. Lett. (2015), 1–46 (online). doi:10.1007/s11063-015-9446-1.

[8] S. Haykin, Neural networks: a comprehensive foundation, 2nd ed., Prentice Hall, Upper Saddle River, NJ, 1999.

[9] S. Kaitikunkajorn and T. Tanprasert, Improving synthesis process of decayed prior sampling technique, in: International Conference on Intelligent Technology, Assumption University Press, Bangkok, Thailand, pp. 240–244, 2005.

[10] J. Mongkonsirivatana, Neural network reliability enhancement with boundary detection contour preserving training, in: International Conference on Intelligent Technology, Assumption University Press, Bangkok, Thailand, pp. 234–239, 2005.

[11] M. Negnevitsky, Artificial intelligence: a guide to intelligent systems, 2nd ed., Addison Wesley, Essex, UK, 2005.

[12] S. Russell and P. Norvig, Artificial intelligence: a modern approach, 2nd ed., Pearson Education, Delhi, India, 2004.

[13] T. Tanprasert and T. Kripruksawan, An approach to control aging rate of neural networks under adaptation to gradually changing context, in: International Conference on Neural Information Processing, IEEE, Singapore, pp. 174–178, 2002. doi:10.1109/ICONIP.2002.1202154.

[14] T. Tanprasert, C. Tanprasert and C. Lursinsap, Contour preserving classification for maximal reliability, in: International Joint Conference on Neural Networks, IEEE, Anchorage, AK, USA, pp. 1125–1130, 1998. doi:10.1109/IJCNN.1998.685930.

[15] T. Tanprasert, P. Fuangkhon and C. Tanprasert, An improved technique for retraining neural networks in adaptive environment, in: International Conference on Intelligent Technology, Assumption University Press, Bangkok, Thailand, pp. 77–80, 2008.

[16] M. Verleysen, E. Bodt and V. Wertz, UCL enhanced learning for evolutive neural architectures, 2014. https://www.elen.ucl.ac.be/neural-nets/Research/Projects/ELENA/elena.htm. Accessed July 2014.

Received: 2015-4-21
Published Online: 2015-12-7
Published in Print: 2017-1-1

©2017 Walter de Gruyter GmbH, Berlin/Boston

This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
