Neural Networks

Volume 86, February 2017, Pages 69-79

A new hyperbox selection rule and a pruning strategy for the enhanced fuzzy min–max neural network

https://doi.org/10.1016/j.neunet.2016.10.012

Abstract

In this paper, we extend our previous work on the Enhanced Fuzzy Min–Max (EFMM) neural network by introducing a new hyperbox selection rule and a pruning strategy to reduce network complexity and improve classification performance. Specifically, a new k-nearest hyperbox expansion rule (for selection of a new winning hyperbox) is first introduced to reduce the network complexity by avoiding the creation of too many small hyperboxes within the vicinity of the winning hyperbox. A pruning strategy is then deployed to further reduce the network complexity in the presence of noisy data. The effectiveness of the proposed network is evaluated using a number of benchmark data sets. The results compare favorably with those from other related models. The findings indicate that the newly introduced hyperbox winner selection rule, coupled with the pruning strategy, is useful for undertaking pattern classification problems.

Introduction

Artificial neural networks (ANNs), which are models of biological neural systems (Graupe, 1997, Li and Ma, 2010), have been widely used in many fields, including healthcare (Lisboa, 2002), financial economics (Li & Ma, 2010), security (Obaidat & Macchairolo, 1994), power (Whei-Min, Chih-Ming, & Chiung-Hsing, 2011), robot motion control (Xia, Gang, & Jun, 2005), fault detection (Cho et al., 2010, Chow and Yee, 1991), the realization of multi-layer cellular neural networks (Ban & Chang, 2015), human action classification (Yu & Lee, 2015), and airline applications (Turkmen & Korkmaz, 2010). Pattern classification is one of the most active ANN application areas (Zhang, 2000). As an example, ANNs have been successfully applied to a variety of real-world pattern classification tasks in industry, business, and science (Zhang, 2000), medicine (Isola, Carvalho, & Tripathy, 2012), as well as industrial fault detection and diagnosis (Quteishat, Lim, Tweedale, & Jain, 2009). In addition, ANNs are useful for handling noisy data collected from real environments; their learning properties are robust against noise and are useful for recognizing different types of input patterns. At the same time, one of the main problems of many learning algorithms is catastrophic forgetting, which happens when they attempt to learn quickly in response to a changing world (Grossberg, 2013). Catastrophic forgetting, which is also known as the stability–plasticity dilemma, is concerned with the inability of a learning system to preserve what it has previously learned when new information is absorbed into its knowledge base. In other words, the learning system forgets previously learned information in the process of learning new information (Grossberg, 2013).

To overcome the stability–plasticity dilemma, a number of ANN models have been proposed, which include the adaptive resonance theory (ART) networks (Grossberg, 1976a, Grossberg, 1976b) and fuzzy min–max (FMM) networks (Simpson, 1992, Simpson, 1993). Among different ANN models, the FMM network and its variants have been the focus of many investigations (Davtalab et al., 2014, Gabrys and Bargiela, 2000, Nandedkar and Biswas, 2007a, Nandedkar and Biswas, 2007b, Quteishat and Lim, 2008, Quteishat et al., 2010, Simpson, 1992, Simpson, 1993, Zhang et al., 2011). The design of FMM variants is largely based on two original FMM networks introduced by Simpson, i.e., the supervised classification FMM network (Simpson, 1992) and later the unsupervised clustering FMM network (Simpson, 1993).

A number of neural–fuzzy models similar to FMM have been suggested in the literature, e.g. Abe and Lan (1995), Leite, Costa, and Gomide (2013), and Peters (2011). In Abe and Lan (1995), a method to extract fuzzy rules (hyperboxes) directly from numerical data for pattern classification was proposed. Specifically, a fuzzy classification model comparable with FMM for handling large-scale classification problems, but with lower learning complexity, was formulated. In this model, overlapping between different classes is resolved by introducing two types of hyperboxes: activation and inhibition. The activation hyperboxes define the existence regions for classes, while the inhibition hyperboxes block the existence of data within the activation hyperboxes (Abe & Lan, 1995). While the concept of creating hyperboxes is similar to that of FMM, the number of hyperboxes in Abe and Lan (1995) increases as the hyperbox size (θ) increases, in contrast to FMM. In Peters (2011), the concept of granular box regression was proposed as a simple method that links independent and dependent variables by boxes (hyper-dimensional interval numbers). The idea of granular box regression is to establish relationships between independent and dependent variables, and then to extract fuzzy rules from numerical data using a predefined number of boxes (Peters, 2011). These boxes are similar to the hyperboxes generated by FMM. However, granular box regression remains transparent and does not behave as a black box, as FMM does. Later, a granular neural network for evolving fuzzy system modeling from fuzzy data streams was introduced by Leite et al. (2013).

The original FMM network uses hyperbox fuzzy sets to create and store knowledge (as hidden nodes) in its network structure. A number of hyperboxes are formed in the FMM structure. Each hyperbox occupies a region defined by its minimum (min) and maximum (max) points in the n-dimensional pattern space. The fuzzy notion in FMM arises from the combination of the hyperbox min–max points with a fuzzy membership function. The fuzzy membership function determines the degree to which an input pattern belongs to a particular class (a schematic sketch of this computation is given after the list below). FMM has a number of useful properties for handling pattern classification problems (Simpson, 1992), which include online learning, nonlinear separability, non-overlapping classes, short training time, as well as soft and hard decisions. All these salient properties make FMM a unique pattern classifier. Because of the advantages of FMM, a number of FMM variants have been introduced in the literature (Davtalab et al., 2014, Gabrys and Bargiela, 2000, Mohammed and Lim, 2015, Nandedkar and Biswas, 2007a, Nandedkar and Biswas, 2007b, Quteishat and Lim, 2008, Quteishat et al., 2010, Zhang et al., 2011). While different FMM variants have been proposed, they are built primarily on the original FMM network, and are inherently affected by some, if not all, limitations of the original FMM learning algorithm. As a result, we proposed the Enhanced FMM (EFMM) model (Mohammed & Lim, 2015) to solve the following limitations associated with FMM and its variants:

  • i.

    The possibility of the hyperbox expansion procedure to increase the overlapping regions between different classes;

  • ii.

    The existing hyperbox overlap test rule is insufficient to detect all overlapping regions of hyperboxes;

  • iii.

    The hyperbox contraction procedure is adversely affected owing to the existence of undetected overlapping regions after performing the hyperbox overlap test.
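To make the hyperbox mechanics above concrete, the following is a minimal sketch of Simpson's (1992) FMM membership function for a hyperbox with min point v and max point w; the NumPy implementation, the variable names, and the default sensitivity value γ=4 are our own illustrative choices rather than the paper's code.

```python
import numpy as np

def fmm_membership(x, v, w, gamma=4.0):
    """Degree to which pattern x (in the unit hypercube) belongs to the
    hyperbox with min point v and max point w (Simpson, 1992)."""
    # Each dimension penalizes the amount by which x falls below v or
    # above w; a point inside the box receives full membership (1.0).
    below = np.maximum(0, 1 - np.maximum(0, gamma * np.minimum(1, v - x)))
    above = np.maximum(0, 1 - np.maximum(0, gamma * np.minimum(1, x - w)))
    return float(np.sum(below + above) / (2 * x.size))

# A pattern inside the box has membership 1; it decays with distance.
v, w = np.array([0.3, 0.5]), np.array([0.5, 0.7])
assert fmm_membership(np.array([0.4, 0.6]), v, w) == 1.0
```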

Table 1 shows a summary of FMM and its variants that are susceptible to the above-mentioned limitations in the hyperbox expansion, hyperbox overlap test, and hyperbox contraction procedures, as well as whether the respective models are sensitive to noise. Even though EFMM has shown its effectiveness in addressing the first three limitations in Table 1, issues related to network complexity (in terms of the number of hyperboxes created) and noise tolerance remain unsolved. In EFMM, learning with large data sets increases the network complexity, while learning with noisy data samples results in spurious knowledge stored as hyperboxes in the network structure. These limitations affect the EFMM performance. Therefore, we investigate techniques and strategies to solve the limitations of EFMM and improve its robustness for tackling pattern classification problems. Accordingly, we propose an extended EFMM network, known as EFMM-II, in this paper, with the following contributions:

  • i.

    A new hyperbox selection rule is formulated to reduce the network complexity by avoiding the creation of too many small hyperboxes within the vicinity of the winning hyperbox (an illustrative sketch follows this list). It also helps reduce misclassification errors by minimizing the overlapping regions of hyperboxes from different classes during the hyperbox expansion procedure.

  • ii.

    A new pruning strategy is devised to further reduce the network complexity arising from the presence of noise in the training data samples.
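As a forward illustration of contribution (i), the sketch below shows one plausible form of k-nearest winner selection: instead of always expanding the single highest-membership hyperbox, the k best same-class candidates are examined and the first one that satisfies the expansion criterion is chosen. The Hyperbox container, the parameter names, and the ranking-by-membership choice are hypothetical; the exact rule is specified in Section 5. The sketch reuses fmm_membership from the earlier code.

```python
from dataclasses import dataclass

@dataclass
class Hyperbox:
    v: np.ndarray    # min point
    w: np.ndarray    # max point
    label: int       # class of the hyperbox

def select_winner(x, boxes, label, k=3, theta=0.3):
    """Among the k highest-membership hyperboxes of class `label`, return
    the first that can expand to cover x without exceeding the size bound
    theta; None signals that a new hyperbox should be created instead."""
    candidates = sorted((b for b in boxes if b.label == label),
                        key=lambda b: fmm_membership(x, b.v, b.w),
                        reverse=True)
    for box in candidates[:k]:
        # Simpson's expansion criterion: the expanded box must satisfy
        # sum_i (max(w_i, x_i) - min(v_i, x_i)) <= n * theta.
        span = np.maximum(box.w, x) - np.minimum(box.v, x)
        if span.sum() <= theta * x.size:
            return box
    return None
```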

This paper is organized as follows. In Section 2, some related techniques for handling noise and complexity issues in neural network classifiers are described. The EFMM neural network is explained in Section 3. An analysis of the EFMM learning algorithm is presented in Section 4. The proposed hyperbox selection and expansion rule, as well as the new pruning strategy for EFMM-II, are detailed in Section 5. The network structure complexity and algorithm complexity are addressed in Section 6. Performance evaluation using a series of simulation studies is presented in Section 7. Finally, concluding remarks and suggestions for further work are included in Section 8.

Noise in neural network classification

In this section, a number of methods for tackling noise in neural network classification models are reviewed. Noise is a common phenomenon in real-world data. In pattern classification, there are two general types of noise that can lead to classification errors (Sluban, Gamberger, & Lavraè, 2014), namely class noise (labeling errors) and attribute noise. Class noise arises when the class labels are incorrectly assigned to the input samples, while attribute noise happens when one or more input attribute values are erroneous.
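To illustrate the two noise types on data normalized to the unit hypercube (as FMM-family networks assume), the following hypothetical helpers inject each kind of noise; the rates, the Gaussian perturbation, and the function names are our own choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def add_class_noise(y, rate, n_classes):
    """Class noise: flip a fraction `rate` of labels to a different class."""
    y = y.copy()
    flip = rng.random(y.size) < rate
    y[flip] = (y[flip] + rng.integers(1, n_classes, flip.sum())) % n_classes
    return y

def add_attribute_noise(X, rate, sigma=0.1):
    """Attribute noise: perturb a fraction `rate` of attribute values."""
    X = X.copy()
    mask = rng.random(X.shape) < rate
    X[mask] += rng.normal(0.0, sigma, mask.sum())
    return np.clip(X, 0.0, 1.0)   # keep patterns inside the unit hypercube
```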

EFMM neural network

EFMM is one of the FMM variants capable of on-line learning. Its learning algorithm comprises a three-step process, viz., hyperbox expansion, hyperbox overlap test, and hyperbox contraction (Mohammed & Lim, 2015). Each EFMM hyperbox is represented by a set of minimum and maximum points in an n-dimensional space within a unit cube (I^n). When a data sample is contained in a hyperbox, the data sample has full class membership of the hyperbox. The hyperbox size is controlled by a user-defined expansion coefficient, θ.
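Schematically, one on-line learning step of this three-step process can be written as follows. The overlap test and contraction are reduced to a single simplified case here, whereas EFMM's actual rules distinguish several overlap cases (Mohammed & Lim, 2015); the sketch reuses Hyperbox and select_winner from the earlier sketches.

```python
def overlaps(a, b):
    """True if two hyperboxes intersect in every dimension."""
    return bool(np.all((a.v < b.w) & (b.v < a.w)))

def contract(a, b):
    """Simplified contraction: split the boxes along the single dimension
    with the least overlap (the full procedure handles more cases)."""
    ov = np.minimum(a.w, b.w) - np.maximum(a.v, b.v)
    d = int(np.argmin(ov))
    mid = (max(a.v[d], b.v[d]) + min(a.w[d], b.w[d])) / 2.0
    if a.v[d] <= b.v[d]:
        a.w[d] = b.v[d] = mid
    else:
        b.w[d] = a.v[d] = mid

def train_step(x, label, boxes, theta=0.3):
    """One on-line step: expansion, overlap test, and contraction."""
    winner = select_winner(x, boxes, label, theta=theta)
    if winner is None:                        # nothing expandable: new box
        boxes.append(Hyperbox(x.copy(), x.copy(), label))
        return
    winner.v = np.minimum(winner.v, x)        # 1. expansion
    winner.w = np.maximum(winner.w, x)
    for other in boxes:                       # 2. overlap test
        if other.label != label and overlaps(winner, other):
            contract(winner, other)           # 3. contraction
```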

Analysis of the EFMM learning algorithm

As explained previously, EFMM has two main limitations in its learning algorithm, as inherited from FMM. Both limitations compromise the EFMM performance. The details are as follows.

The proposed EFMM-II network

To overcome the network complexity and noise problems of EFMM, two heuristic rules are proposed, viz., (i) a hyperbox selection rule, and (ii) a pruning strategy. The details are as follows.
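As an illustration of the pruning idea, the sketch below scores each hyperbox by its classification accuracy on the validation patterns it wins and removes those scoring below a threshold δ (set to 0.6 in the experiments reported later). The accuracy-based confidence measure and the keep-if-never-winning policy are stand-ins; the measure actually used by EFMM-II is defined in Section 5. The sketch reuses fmm_membership from the earlier code.

```python
def prune(boxes, X_val, y_val, delta=0.6):
    """Remove hyperboxes whose win-rate accuracy on validation data falls
    below delta (boxes that never win are retained in this sketch)."""
    correct = {id(b): 0 for b in boxes}
    total = {id(b): 0 for b in boxes}
    for x, y in zip(X_val, y_val):
        winner = max(boxes, key=lambda b: fmm_membership(x, b.v, b.w))
        total[id(winner)] += 1
        correct[id(winner)] += int(winner.label == y)
    return [b for b in boxes
            if total[id(b)] == 0 or correct[id(b)] / total[id(b)] >= delta]
```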

Structure complexity and algorithm complexity

In FMM and its variants, the structure complexity refers to the number of hyperboxes created during the training phase. Many papers on FMM and its variants use the number of hyperboxes for comparing the network structure complexity, e.g. Davtalab et al. (2014), Gabrys and Bargiela (2000), Mohammed and Lim (2015), Quteishat and Lim (2008), Quteishat et al. (2010), and Zhang et al. (2011). Since the proposed EFMM-II model incorporates a pruning strategy, its network structure complexity is less than that of EFMM.

Performance evaluation

Four case studies with ten data sets were conducted to evaluate the effectiveness of EFMM-II comprehensively. Nine of the data sets were obtained from the University of California, Irvine (UCI) machine learning repository (Bache & Lichman, 2013), while the remaining one was the noisy 4-circle-in-the-square problem. In all cases, the pruning threshold was set to δ=0.6. In the first case study, the effects of changing the expansion coefficient (θ) were investigated using three benchmark data sets.

Conclusions

In this paper, a new FMM variant known as EFMM-II, which is an extension of EFMM, has been proposed. The main contributions of EFMM-II are two-fold, i.e. solving the network complexity and noise problems in EFMM as well as in other relevant FMM variants. Specifically, the new k-nearest hyperbox selection rule has been employed to reduce the network complexity. As such, a parsimonious network structure that avoids creating too many small hyperboxes within the vicinity of the winning hyperbox is obtained.

Acknowledgments

The authors gratefully acknowledge the financial support of the FRGS grant (RDU160104) and the RDU grants (RDU160366 and RDU150357) for this work.

References (43)

  • S. Abe et al. (1995). A method for fuzzy rules extraction directly from numerical data and its application to pattern classification. IEEE Transactions on Fuzzy Systems.
  • K. Bache, & M. Lichman (2013). UCI Machine Learning Repository. School Inf. Comput. Sci., Univ. California, Irvine,...
  • M. Bianchini et al. (2014). On the complexity of neural network classifiers: A comparison between shallow and deep architectures. IEEE Transactions on Neural Networks and Learning Systems.
  • G. Carpenter et al. (1995). Rule extraction: From neural architecture to symbolic representation. Connection Science.
  • H.C. Cho et al. (2010). Fault detection and isolation of induction motors using recurrent neural networks and dynamic Bayesian modeling. IEEE Transactions on Control Systems Technology.
  • M.-Y. Chow et al. (1991). Methodology for on-line incipient fault detection in single-phase squirrel-cage induction motors using artificial neural networks. IEEE Transactions on Energy Conversion.
  • R. Davtalab et al. (2014). Multi-level fuzzy min-max neural network classifier. IEEE Transactions on Neural Networks and Learning Systems.
  • B. Efron (1979). Bootstrap methods: Another look at the jackknife. The Annals of Statistics.
  • B. Gabrys et al. (2000). General fuzzy min-max neural network for clustering and classification. IEEE Transactions on Neural Networks.
  • D. Graupe (1997). Principles of artificial neural networks, Vol. 3.
  • S. Grossberg (1976). Adaptive pattern classification and universal recoding: I. Parallel development and coding of neural feature detectors. Biological Cybernetics.