
Neural Networks

Volume 20, Issue 1, January 2007, Pages 109-128

Pipelining of Fuzzy ARTMAP without matchtracking: Correctness, performance bound, and Beowulf evaluation

https://doi.org/10.1016/j.neunet.2006.10.003

Abstract

Fuzzy ARTMAP neural networks have been proven to be good classifiers on a variety of classification problems. However, the time that Fuzzy ARTMAP takes to converge to a solution increases rapidly as the number of patterns used for training increases. In this paper we examine the time Fuzzy ARTMAP takes to converge to a solution, and we propose a coarse-grain parallelization technique, based on a pipeline approach, to speed up the training process. In particular, we have parallelized Fuzzy ARTMAP without the match-tracking mechanism. We provide a series of theorems and associated proofs that establish the characteristics of the parallel implementation of Fuzzy ARTMAP without match-tracking. Results run on a Beowulf cluster with three large databases show linear speedup as a function of the number of processors used in the pipeline. The databases used for our experiments are the Forest CoverType database from the UCI Machine Learning Repository and two artificial databases, where the data generated were 16-dimensional Gaussian distributed data belonging to two distinct classes, with different amounts of overlap (5% and 15%).

Introduction

Neural networks have been used extensively and successfully to tackle a wide variety of problems. As computing capacity and electronic databases grow, there is an increasing need to process considerably larger databases. In this context, the algorithms of choice tend to be ad hoc algorithms (Agrawal & Srikant, 1994) or tree-based algorithms such as CART (King, Feng, & Shutherland, 1995) and C4.5 (Quinlan, 1993). Variations of these tree-learning algorithms, such as SPRINT (Shafer, Agrawal, & Mehta, 1996) and SLIQ (Mehta, Agrawal, & Rissanen, 1996), have been successfully adapted to handle very large data sets.

Neural network algorithms, on the other hand, can have a prohibitively slow convergence to a solution, especially when they are trained on large databases. Even one of the fastest (in terms of training speed) neural network algorithms, the Fuzzy ARTMAP algorithm (Carpenter et al., 1992, Carpenter et al., 1991) and its faster variations (Kasuba, 1993, Taghi et al., 2003), tends to converge slowly to a solution as the size of the network increases.

One obvious way to address the problem of slow convergence to a solution is through parallelization. Extensive research has been done on the parallelization of feed-forward multi-layer perceptrons (Mangasarian and Solodov, 1994, Torresen et al., 1995, Torresen and Tomita, 1998). This is probably due to the popularity of this neural network architecture, and also because the backpropagation algorithm (Rumelhart, Hinton, & Williams, 1986), used to train these types of networks, can be characterized mathematically by matrix and vector multiplications, operations that have been parallelized with great success.

Regarding the parallelization of ART neural networks, the work by Manolakos (Manolakos, 1998) implements the ART1 neural network (Carpenter et al., 1991) on a ring of processors. To accomplish this, Manolakos divides the communication into two bidirectional rings, one for the F1 layer of ART1 and another for the F2 layer of ART1. Learning examples are pipelined through the ring to optimize network utilization. Experimental results of Manolakos’ work indicate close to linear speed-up as a function of the number of processors. This approach is efficient for ring networks, and it remains an open question whether it can be extended to Fuzzy ARTMAP. Another parallelization approach that has been used with ART and other types of neural networks is the systems-integration approach, where the neural network is implemented not on a network of computers but on parallel hardware. Zhang (Zhang, 1998) shows how a fuzzy competitive neural network similar to ARTMAP can be implemented using a systolic array. Asanović (Asanović et al., 1998) uses a special-purpose parallel vector processor, SPERT-II, to implement back-propagation and Kohonen neural networks. In Malkani and Vassiliadis (1995), a parallel implementation of the Fuzzy ARTMAP algorithm, similar to the one investigated here, is presented. However, in that paper a hypercube topology is utilized for transferring data to all of the nodes involved in the computations. While it is straightforward to map the hypercube onto the more flexible switched network typically found in a Beowulf cluster, this would likely come with a performance penalty. In this approach, each one of the processors maintains a subset of the architecture’s templates and finds the template with the maximum match in its local collection. Finally, in the d-dimensional hypercube, all the processors cooperate to find the global maximum through d different synchronization operations. This can eventually limit the scalability of the approach, since the value d grows with the size of the hypercube while the network bandwidth remains constant.
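The d-round hypercube maximum reduction described above can be simulated in a few lines (our own sketch with illustrative names; a real implementation would use message passing between processors rather than shared lists):

```python
# Sketch: simulating the d-round hypercube maximum reduction.
# Each of p = 2^d simulated processors holds a local best-match value;
# in round k every processor exchanges values with the partner whose
# rank differs in bit k, so after d rounds all ranks hold the global max.

def hypercube_allreduce_max(local_values):
    p = len(local_values)
    d = p.bit_length() - 1
    assert p == 1 << d, "requires a power-of-two number of processors"
    vals = list(local_values)
    for k in range(d):                     # d synchronization rounds
        new_vals = list(vals)
        for rank in range(p):
            partner = rank ^ (1 << k)      # neighbor along dimension k
            new_vals[rank] = max(vals[rank], vals[partner])
        vals = new_vals
    return vals

# After d = 2 rounds, every one of the 4 ranks agrees on the maximum.
print(hypercube_allreduce_max([0.3, 0.9, 0.1, 0.7]))  # → [0.9, 0.9, 0.9, 0.9]
```

Note that each processor performs exactly d exchange steps regardless of how many templates it holds, which is the source of the scalability limit mentioned above.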

Mining of large databases is an issue that has been addressed by many researchers. Mehta (Mehta et al., 1996) developed SLIQ, a decision-tree based algorithm that combines techniques of tree-pruning and sorting to efficiently manage large datasets. Furthermore, Shafer (Shafer et al., 1996) proposed SPRINT, another decision-tree based algorithm, which removes the memory restrictions imposed by SLIQ and is designed to be amenable to parallelization. The Fuzzy ARTMAP neural network has many desirable characteristics, such as the ability to solve any classification problem, the capability to learn from data in an on-line mode, the advantage of providing interpretations for the answers that it produces, the capacity to expand its size as the problem requires, and the ability to recognize novel inputs, among others. Due to all these virtues, we investigate Fuzzy ARTMAP’s parallelization in an effort to improve its convergence speed to a solution when it is trained with large datasets.

There are many variants within the Fuzzy ARTMAP family of neural networks. Kasuba (Kasuba, 1993), with only classification problems in mind, develops a simplified Fuzzy ARTMAP structure (simplified Fuzzy ARTMAP), while Taghi et al., in Taghi et al. (2003), describe variants of simplified Fuzzy ARTMAP called Fast Simplified Fuzzy ARTMAP variants. These Fuzzy ARTMAP variants are faster than the original Fuzzy ARTMAP algorithm because they eliminate all the computations performed in the ARTb module of Fuzzy ARTMAP, and because they simplify the computations performed in the ARTab module of Fuzzy ARTMAP; the results produced by these simplified Fuzzy ARTMAP variants are the same as the results produced by the original Fuzzy ARTMAP when the problem at hand is a classification problem.

One of the Fuzzy ARTMAP fast algorithmic variants presented in Taghi et al. (2003) is called SFAM2.0, and it is this algorithmic Fuzzy ARTMAP variant (equivalent to Fuzzy ARTMAP for classification problems) that is the focus of our paper. Furthermore, in this paper we concentrate only on the no-match-tracking version of SFAM2.0. No-match-tracking is a concept introduced by Anagnostopoulos in the framework of the ART networks (Anagnostopoulos & Georgiopoulos, 2003). No-match-tracking is a specific ART network behavior where, whenever an input pattern is presented to the ART network and a category is chosen that maximizes the bottom-up input and passes the vigilance but is mapped to the incorrect output, this category is deactivated and a new (uncommitted) category is activated next to encode the input pattern. As a reminder, the typical ART network behavior in that case is to engage the match-tracking mechanism, which deactivates the chosen category, increases the vigilance threshold, and searches for another appropriate category that may or may not be an uncommitted category. Anagnostopoulos has shown through experimentation in Anagnostopoulos and Georgiopoulos (2003) that no-match-tracking Fuzzy ARTMAP increases the number of categories created in the category representation layer compared to Fuzzy ARTMAP, but it does so while providing improved generalization performance. No-match-tracking in Fuzzy ARTMAP should not be confused with on-line operation in Fuzzy ARTMAP. On-line Fuzzy ARTMAP operation implies that an input–output pair is presented only once in Fuzzy ARTMAP’s training phase, and it can be used with either a match-tracking or a no-match-tracking Fuzzy ARTMAP. The reason that we focus on the no-match-tracking Fuzzy ARTMAP is that it gives us the opportunity to first parallelize the competitive aspect of Fuzzy ARTMAP, while ignoring the complications of the feedback mechanism that match-tracking introduces. Finally, we focus on the on-line version of this network, since a parallelization of the on-line version extends in a straightforward fashion to the off-line version of the network.
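The no-match-tracking behavior just described can be summarized in a short sketch (a simplified illustration with hypothetical function names, not the paper's SFAM2.0 pseudocode; the choice and match functions are the standard Fuzzy ARTMAP ones):

```python
# Sketch of no-match-tracking training on a single input pattern:
# if the winning category passes vigilance but predicts the wrong label,
# it is deactivated and an uncommitted category encodes the input,
# with no vigilance raising and no renewed search.

def fuzzy_min(a, b):
    return [min(x, y) for x, y in zip(a, b)]

def norm(v):                  # city-block norm used by Fuzzy ARTMAP
    return sum(v)

def train_one(pattern, label, templates, labels, rho=0.5, beta=0.01):
    # assumes a nonzero input norm
    active = set(range(len(templates)))
    while active:
        # winner: category maximizing the bottom-up (choice) function
        j = max(active, key=lambda i: norm(fuzzy_min(pattern, templates[i]))
                                      / (beta + norm(templates[i])))
        match = norm(fuzzy_min(pattern, templates[j])) / norm(pattern)
        if match < rho:
            active.discard(j)          # fails vigilance: try next category
        elif labels[j] == label:
            templates[j] = fuzzy_min(pattern, templates[j])  # learn
            return j
        else:
            break  # wrong label: no match-tracking, commit a new category
    templates.append(list(pattern))    # uncommitted category encodes input
    labels.append(label)
    return len(templates) - 1
```

A pattern whose winner predicts the wrong class therefore always produces a new category in one pass, which is exactly what makes the pipeline of Section 6 a single-pass procedure.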

For simplicity, we refer to this Fuzzy ARTMAP variant (on-line, no-match-tracking SFAM2.0) simply as Fuzzy ARTMAP, or FAM. If we demonstrate the effectiveness of our parallelization strategies for FAM, extension to other ART structures can be accomplished without much effort. This is because the other ART structures share many similarities with FAM, and as a result the advantages of the proposed parallelization approach can be readily extended to other ART variants (for instance, Gaussian ARTMAP (Williamson, 1996) and Ellipsoidal ARTMAP (Anagnostopoulos & Georgiopoulos, 2001), among others).

The remainder of this paper is organized as follows: Section 2 presents the Fuzzy ARTMAP neural network architecture and a few Fuzzy ARTMAP variants. Section 3 continues with the pseudo-code of the off-line, match-tracking Fuzzy ARTMAP, the on-line, match-tracking Fuzzy ARTMAP, and the on-line, no-match-tracking Fuzzy ARTMAP (referred to simply as FAM). Section 4 focuses on the computational complexity of the on-line, match-tracking Fuzzy ARTMAP, and serves as a necessary motivation for the parallelization approach introduced in this paper. Section 5 presents a discussion of the Beowulf cluster as our platform of choice. Section 6 continues with the pseudocode of the parallel Fuzzy ARTMAP, referred to as Pfam, and associated discussion to explain the important aspects of this implementation. Section 7 focuses on theoretical results related to the proposed parallelization approach. In particular, we prove that Pfam is equivalent to FAM, and that the processors in the parallel implementation will be reasonably balanced, by considering a worst-case scenario. Section 8 proceeds with experiments and results comparing the performance of Pfam and FAM on three databases, one of them real and two artificial. The article concludes with Section 9, where a summary of our experiences from the conducted work and future research are delineated.


The Fuzzy ARTMAP neural network architecture/Fuzzy ARTMAP variations

The Fuzzy ARTMAP neural network and its associated architecture were introduced by Carpenter and Grossberg in their seminal paper (Carpenter et al., 1992). Since its introduction, a number of Fuzzy ARTMAP variations and associated successful applications of this ART family of neural networks have appeared in the literature (for instance, ARTEMAP (Carpenter & Ross, 1995), ARTMAP-IC (Carpenter & Markuzon, 1998), Ellipsoid-ART/ARTMAP (Anagnostopoulos & Georgiopoulos, 2001), Fuzzy Min–Max (Simpson,

The Fuzzy ARTMAP variants’ pseudo-code

The off-line, match-tracking Fuzzy ARTMAP algorithm is shown in Fig. 2. The on-line, match-tracking Fuzzy ARTMAP algorithm is shown in Fig. 3. Notice that in off-line, match-tracking Fuzzy ARTMAP training, the learning process (lines 4 through 30 of the algorithm) is performed until no more network weight changes are made or until the number of iterations reaches a maximum number (designated as epochs). In on-line, match-tracking Fuzzy ARTMAP training, the learning process (lines 3–24)

Complexity analysis of the on-line, match tracking Fuzzy ARTMAP

We concentrate on analyzing the time complexity of the on-line Fuzzy ARTMAP variants because this is the focus of the paper. Our approach requires making a few assumptions about the size of the networks created and the match-tracking cycles. This complexity analysis will motivate the pipelined implementation of Fuzzy ARTMAP.
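The cost structure this analysis points to can be illustrated with a toy operation count (our own sketch with an assumed linear template-growth rate, not the paper's derivation): if pattern t is compared against every template and the network has grown to roughly c·t templates, the total number of template evaluations over n patterns is on the order of c·n²/2, i.e. quadratic in the training-set size.

```python
# Toy count of template evaluations in serial on-line training,
# assuming the number of templates grows roughly linearly with the
# number of patterns seen (growth_rate is an illustrative parameter).

def serial_template_evaluations(n_patterns, growth_rate=0.1):
    evaluations = 0
    n_templates = 0
    for t in range(n_patterns):
        evaluations += n_templates           # pattern t tested vs all templates
        if t * growth_rate > n_templates:    # network grows as training proceeds
            n_templates += 1
    return evaluations

# Doubling the training set roughly quadruples the work:
print(serial_template_evaluations(1000), serial_template_evaluations(2000))
```

This quadratic growth in serial work is what motivates distributing the template comparisons across a pipeline of processors.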

We can see from the pseudocode (Figs. 2 and 3) that the on-line, match-tracking Fuzzy ARTMAP algorithm tests every input pattern I in the training set against each template w_j^a at

The Beowulf parallel platform

The Beowulf cluster of workstations is a network of computers where processes exchange information through the network’s communications hardware. In our case, it consisted of 96 AMD nodes, each with dual AthlonMP 1500+ processors and 512 MB of RAM. The nodes are connected through a Fast Ethernet network.

In general, the Beowulf cluster configuration is a parallel platform that has a high latency. This implies that, to achieve optimum performance, communication packets must be of large size and of

Beowulf Fuzzy ARTMAP implementation

The parallel implementation of Fuzzy ARTMAP (the on-line, no-match-tracking Fuzzy ARTMAP algorithm) is discussed here. We call this implementation Parallel Fuzzy ARTMAP (Pfam). A depiction of the pipeline is shown in Fig. 8. The elimination of match-tracking makes the learning of a pattern a single pass over the pipeline, so different patterns can be processed at the different pipeline stages to achieve optimum parallelization. For the understanding of Pfam we need the following definitions:

  • n
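The one-pass pipeline idea can be sketched as follows (our own illustration with assumed names, not the authors' Pfam code; in the real implementation the partitions live on separate processors and the running best travels alongside the pattern through the pipeline):

```python
# Sketch: templates partitioned across pipeline stages. A pattern visits
# each stage in turn, carrying the best (activation, stage, index) found
# so far; after the last stage the global winner is known, and other
# patterns can occupy the earlier stages concurrently.

def fuzzy_min_norm(a, b):
    return sum(min(x, y) for x, y in zip(a, b))

def pipeline_winner(pattern, partitions, beta=0.01):
    """Pass `pattern` through each stage's template partition,
    carrying (best_activation, stage_id, local_index) forward."""
    best = (-1.0, None, None)
    for stage_id, templates in enumerate(partitions):   # pipeline stages
        for i, w in enumerate(templates):
            activation = fuzzy_min_norm(pattern, w) / (beta + sum(w))
            if activation > best[0]:
                best = (activation, stage_id, i)
    return best

# Example: 4 templates split over 2 stages; the winner is the template
# closest (in the fuzzy-min sense) to the input.
parts = [[[1, 0], [0, 1]], [[1, 1], [0.5, 0.5]]]
print(pipeline_winner([1, 1], parts))
```

Because no stage ever needs to revisit an earlier one (no match-tracking feedback), the winner emerges after exactly one traversal of the pipeline.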

Properties of the Pfam algorithm

We present and prove a series of fourteen (14) theorems. These theorems fall into two groups: the theorems associated with the correctness of Pfam, and the theorems associated with the performance of Pfam. For ease of reference, Table 1 lists the theorems and their names dealing with the correctness of the algorithm, while Table 2 lists the theorems dealing with the performance of the algorithm.

The major purpose of these theorems is to prove that Pfam (a) is

Experiments

Experiments were conducted on three databases: one real-world database and two artificially generated databases (Gaussian distributed data). Training set sizes of 1000×2^i, i ∈ {5, 6, …, 9} (that is, 32,000 to 512,000 patterns) were used for the training of Pfam and FAM. The test set size was fixed at 20,000 patterns. The number of processors in the pipeline varied from p=1 to p=32. Pipeline sizes were also increased in powers of 2. The packet sizes used were 64 and 128 for the CoverType and the
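Artificial data of the kind described above could be generated as follows (our own illustration; the paper does not give its generator, and the mean separation used here is an assumed stand-in for whatever separation yields 5% or 15% overlap):

```python
# Sketch: 16-dimensional Gaussian vectors for two classes, where the
# distance between the class means controls the amount of overlap.

import random

def make_gaussian_dataset(n_per_class, dim=16, separation=2.0, seed=42):
    rng = random.Random(seed)
    data = []
    for label, mean in ((0, 0.0), (1, separation)):
        for _ in range(n_per_class):
            vec = [rng.gauss(mean, 1.0) for _ in range(dim)]
            data.append((vec, label))
    return data

train = make_gaussian_dataset(1000)
print(len(train), len(train[0][0]))  # → 2000 16
```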

Summary — conclusions

We have produced a pipelined implementation of Fuzzy ARTMAP. This implementation can be extended to other ART neural network architectures that have a competitive structure similar to that of Fuzzy ARTMAP. It can also be extended to other neural networks that are designated as “competitive” neural networks, such as PNN and RBFs, as well as other “competitive” classifiers. We have introduced and proven a number of theorems pertaining to our pipeline implementation. The major purpose of these theorems was

Acknowledgments

The authors would like to thank the Computer Research Center of the Technological Institute of Costa Rica, the Institute of Simulation and Training (IST) and the Link Foundation Fellowship program for partially funding this project. This work was supported in part by a National Science Foundation (NSF) grant CRCD 0203446, and the National Science Foundation grant DUE 05254209. Georgios C. Anagnostopoulos and Michael Georgiopoulos also acknowledge the partial support from the NSF grant CCLI

References (29)

  • G.A. Carpenter et al.
  • G.A. Carpenter et al., ART–EMAP: A neural network architecture for object recognition by evidence accumulation, IEEE Transactions on Neural Networks (1995)
  • T.P. Caudell et al.
  • T. Kasuba, Simplified Fuzzy ARTMAP, AI Expert (1993)