Elsevier

Pattern Recognition

Volume 42, Issue 11, November 2009, Pages 2764-2786

A swarm-inspired projection algorithm

https://doi.org/10.1016/j.patcog.2009.03.020

Abstract

This paper proposes a new data projection algorithm inspired by the foraging behavior of doves, which we name the swarm-inspired projection (SIP) algorithm. The algorithm allows us to visually estimate the number of clusters in a data set. Based on the projection result, we may then partition the data set into the corresponding number of clusters. The SIP algorithm regards each data pattern in a data set as a crumb that is sequentially tossed to a flock of doves on the ground. The doves adjust their physical positions to compete for the crumbs. Gradually, the flock divides into several groups according to the distribution of the crumbs, and the groups that form naturally correspond to the underlying structures in the data set. By viewing the scatter plot of the final positions of the doves, we can estimate the number of clusters in the data set. Several data sets were used to demonstrate the effectiveness of the proposed SIP algorithm.

Introduction

Recently, many properties of the collective behaviors of social insects (and other animals) have attracted a great deal of attention from researchers [1], [2]. Social insects and animals provide a powerful concept for creating decentralized systems of simple, interacting, and often mobile agents (e.g., ants, bees, birds). The rich set of mechanisms found in their collective behaviors may serve as metaphors for designing so-called swarm-intelligence-based systems. Swarm intelligence, a form of artificial intelligence, is the emergent collective intelligence of groups of simple agents. Different swarm-intelligence-based systems are inspired by different subsets of the available metaphors. Ant colony systems (ACS) [3], [4] and particle swarm optimization (PSO) [5], [6] are two well-known kinds of swarm intelligence. Applications of these systems range from optimization and communications networks to robotics [7], [8], [9], [10], [11].

Cluster analysis is one of the basic tools for exploring the underlying structure of a given data set and is applied in a wide variety of engineering and scientific disciplines such as medicine, psychology, biology, sociology, pattern recognition, and image processing. The primary objective of cluster analysis is to partition a given data set into so-called homogeneous clusters such that patterns within a cluster are more similar to each other than to patterns belonging to different clusters. Two major difficulties are encountered in clustering data: (1) the geometric shapes of clusters vary widely and (2) the number of clusters is not always known a priori. Different distance measures lead to different types of clusters (e.g., compact hyper-spheres, compact hyper-ellipsoids, lines, shells, etc.). Recently, several clustering algorithms with different distance measures have been developed for clustering data sets with different geometric shapes [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22]. These algorithms are used to detect compact clusters [12], [13], [14], [15], straight lines [14], [15], [16], shells [17], [18], [19], contours with polygonal boundaries [19], [20], or clusters of different geometrical structures [21], [22].
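To make the remark about distance measures concrete, the following minimal Python sketch compares the Euclidean distance, which favors hyper-spherical clusters, with a Mahalanobis-type distance that uses a diagonal covariance and therefore favors axis-aligned hyper-ellipsoids. The function names, the cluster center, and the variances are illustrative assumptions and are not taken from any of the cited algorithms.

```python
import math

def euclidean(x, c):
    """Euclidean distance from pattern x to cluster center c."""
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))

def mahalanobis_diag(x, c, variances):
    """Mahalanobis distance assuming a diagonal covariance matrix.

    `variances` holds the per-dimension variances of the cluster
    (an illustrative simplification of a full covariance matrix).
    """
    return math.sqrt(
        sum((xi - ci) ** 2 / v for xi, ci, v in zip(x, c, variances))
    )

# Hypothetical cluster elongated along the first axis.
center = (0.0, 0.0)
variances = (4.0, 0.25)

a = (2.0, 0.0)  # lies along the elongated direction
b = (0.0, 1.0)  # lies along the narrow direction

print(euclidean(a, center), euclidean(b, center))
print(mahalanobis_diag(a, center, variances),
      mahalanobis_diag(b, center, variances))
```

Under the Euclidean distance, pattern `b` is closer to the center than `a`; under the Mahalanobis-type distance the ordering reverses, which is why the choice of distance measure changes the shape of the clusters an algorithm can detect.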

In fact, if cluster analysis is to make a significant contribution to engineering applications, much more attention must be paid to determining the optimal number of clusters. Basically, there are four different approaches to determining the number of clusters. The first approach is to use a global validity measure to validate clustering results for a range of cluster numbers. Determining the optimal cluster number in this way is generally very expensive, since clustering has to be carried out for every candidate cluster number. The second approach, progressive clustering, was proposed to reduce this cost [23], [24], [25], [26]. Since the optimal partition cannot be found in a single run, one filters out the presumably good clusters, eliminates spurious clusters, and merges compatible clusters into larger ones. This process continues until the final partition contains good clusters only. The third approach is to perform competitive clustering [27], [28]: through the mechanism of competing for data points, only good clusters survive. The fourth approach is to use the family of projection algorithms. Projection algorithms project high-dimensional data onto a low-dimensional space to facilitate visual inspection, allowing us to view high-dimensional data as a two-dimensional scatter plot. This can provide better insight into the data, since clustering tendencies may become apparent from the projection. Sammon's nonlinear mapping is one of the most popular projection algorithms [29]. Most conventional projection algorithms use gradient descent to optimize some kind of objective function; therefore, they are susceptible to premature convergence to a local optimum and are usually computationally expensive.
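As an illustration of the kind of objective function such projection algorithms optimize, the sketch below computes the stress used in Sammon's mapping, which compares the pairwise distances in the original space with those in the projected space. It is a minimal pure-Python sketch that only evaluates the stress; the gradient-descent optimization itself is omitted, and the function names are illustrative assumptions.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Euclidean distances for all unordered pairs of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]

def sammon_stress(high, low):
    """Sammon stress between original points and their projection.

    E = (1 / sum d*_ij) * sum (d*_ij - d_ij)^2 / d*_ij,
    where d*_ij are distances in the original space and d_ij are
    distances in the projected space. Assumes the original points
    are pairwise distinct, so that every d*_ij is positive.
    """
    d_star = pairwise_distances(high)
    d_proj = pairwise_distances(low)
    scale = sum(d_star)
    return sum((ds - dp) ** 2 / ds for ds, dp in zip(d_star, d_proj)) / scale

# Three points that lie in a plane of 3-D space.
high = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

# A projection that preserves all pairwise distances.
faithful = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

# A projection that collapses the third point onto the first.
collapsed = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0)]

print(sammon_stress(high, faithful))
print(sammon_stress(high, collapsed))
```

A projection that preserves every pairwise distance has zero stress, while a distorted projection has positive stress; a gradient-based optimizer moves the projected points to reduce this value and may stop at a local optimum.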

Recently, many different methods have been proposed that use the topology-preserving property of the self-organizing feature map (SOM) algorithm to project high-dimensional data onto a low-dimensional space to facilitate visual inspection of the data [30], [31], [32], [33], [34]. Several swarm-intelligence-based data clustering algorithms have also been proposed in recent years [35], [36], [37], [38], [39], [40], [41], [42], [43], [44], [45], [46], [47], [48], [49]. For example, some researchers adopted the PSO algorithm rather than a gradient-based method to optimize some kind of cluster validity function (e.g., the quantization error, the intra-cluster distance, or the inter-cluster distance) to cluster data [35], [36], [37], [38], while Edwards et al. proposed using the PSO algorithm to map high-dimensional data points onto a lower-dimensional space so as to identify clusters inherent in the data set [39]. The advantage of these approaches [35], [36], [37], [38], [39] is that they may avoid the local minima problem incurred by gradient-based optimization tools; however, they still need the number of clusters to be specified before the optimization procedure can proceed. In addition to those PSO-based approaches [35], [36], [37], [38], [39], a new PSO-based dynamic clustering approach (DCPSO) was proposed to automatically determine the optimal number of clusters and simultaneously cluster the data set [40]. Many different approaches based on the collective behaviors of ants have also been proposed for data clustering and data visualization [41], [42], [43], [44], [45]. The general idea of these studies is that isolated items are picked up by artificial ants and then dropped at some other location where more items of that type are present. This is a kind of positive feedback that leads to the formation of large clusters. In addition, a novel data visualization method based on the schooling behavior of fish was proposed to allow the user to see complex correlations between data items through the amount of time each fish spends near others [46], [47], [48], [49].
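The pick-up and drop behavior of the ant-based approaches can be sketched with a commonly cited pair of probability functions, usually attributed to the basic model of Deneubourg et al. The formulas, the function names, and the threshold constants `k1` and `k2` below are assumptions chosen for illustration; they are not taken from this paper or from any specific algorithm in [41], [42], [43], [44], [45].

```python
def pick_probability(f, k1=0.1):
    """Probability that an unladen ant picks up an item.

    f is the perceived fraction of similar items in the ant's
    neighborhood (0 <= f <= 1); k1 is an illustrative threshold.
    Isolated items (small f) are picked up with high probability.
    """
    return (k1 / (k1 + f)) ** 2

def drop_probability(f, k2=0.3):
    """Probability that a laden ant drops the item it carries.

    Items are dropped with high probability where many similar
    items are already present (large f); k2 is illustrative.
    """
    return (f / (k2 + f)) ** 2

for f in (0.0, 0.1, 0.5, 0.9):
    print(f, pick_probability(f), drop_probability(f))
```

Because the pick-up probability decreases and the drop probability increases with the local density of similar items, small heaps tend to lose items to larger ones, which is the positive feedback that produces large clusters.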

In this paper, a new data projection algorithm inspired by the collective behaviors of doves is proposed. We name the new algorithm the swarm-inspired projection (SIP) algorithm. The paper is organized as follows. Section 2 gives the motivation for the new data projection algorithm and then describes the method in detail. Section 3 explores the similarities and differences between the proposed SIP algorithm and some existing swarm-based algorithms. Section 4 presents a number of computer simulations. Section 5 discusses the SIP algorithm, and Section 6 presents the conclusions of this study.

Section snippets

The swarm-inspired projection algorithm

In fact, the study of bird flocking and fish schooling was already a research topic in social psychology in the 1930s [2]. A very influential simulation of bird flocking was proposed by Reynolds [50]. Reynolds assumed that the behavior of flocking birds is driven by the following three local forces: collision avoidance, velocity matching, and flock centering [2], [50]. In addition to Reynolds, Heppner and Grenander proposed a similar (but with some differences) idea about bird flocking at about the

Relations to other algorithms

In this section, we explore the similarities and differences between the proposed SIP algorithm and some existing swarm-based algorithms. As mentioned in the introduction section, several swarm-intelligence-based data clustering algorithms have been proposed in recent years [35], [36], [37], [38], [39], [40], [41], [42], [43], [44], [45], [46], [47], [48], [49]. Each approach has its own applicability, advantages, and limitations.

The basic idea of the PSO-based approaches [35], [36], [37], [38]

Simulation results

To test the performance of the SIP algorithm, five data sets were used, consisting of two artificial data sets and three real data sets. The projection capability of the SIP algorithm was compared with that of Sammon's projection algorithm, the LF algorithm, and the DSOM algorithm.

For the SIP algorithm, the maximum number of epochs for every data set except the iris data set and the 20-dimensional data set was set to 5. The maximum numbers of epochs for the iris data set and the 20-dimensional data set

The factors affecting the SIP algorithm

In this section, we discuss how the number of doves and the values of the parameters affect the SIP algorithm. Due to limited space, we present only the results for the iris data set; similar observations can be drawn from the simulations on the remaining data sets. First, we reran the SIP algorithm with three other population sizes to investigate how the swarm size affects the SIP algorithm. Note that the values of the parameters used in the SIP algorithm were set to be the

Conclusion

In this paper, a new kind of swarm-inspired projection algorithm is proposed. The development of this new algorithm was motivated by the foraging behavior of doves. The focus of the paper is not to precisely model the doves’ foraging behavior, but to show that some basic foraging principles may serve as metaphors for data clustering.

The SIP algorithm can project high-dimensional data points into a 2-dimensional data space. The scatter plots of position vectors p̲j facilitate visual inspection

Acknowledgement

This work was partly supported by the National Science Council, Taiwan, ROC, under NSC 97-2631-S-008-003 and NSC 97-2631-H-008-001.


References (55)

  • R. Eberhart, J. Kennedy, A new optimizer using particle swarm theory, in: Proceedings of the Sixth International...
  • E. Bonabeau et al., Inspiration for optimization from social insect behaviour, Nature (2000)
  • J.S. Bay, Design of the army-ant cooperative lifting robot, IEEE Robotics and Automation Magazine (1995)
  • O.E. Holland et al., Stigmergy, self-organization, and sorting in collective robotics, Artificial Life (1999)
  • E. Bonabeau, F. Henaux, S. Guerin, D. Snyers, P. Kuntz, G. Theraulaz, Routing in telecommunications networks with smart...
  • J. Bezdek, Fuzzy Mathematics in Pattern Classification (1973)
  • A.K. Jain et al., Algorithms for Clustering Data (1988)
  • D. Gustafson, W. Kessel, Fuzzy clustering with a fuzzy covariance matrix, in: Proceedings of IEEE Conference on...
  • I. Gath et al., Unsupervised optimal fuzzy clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence (1989)
  • R. Dave, Use of the adaptive fuzzy clustering algorithm to detect lines in digital images, Intelligent Robots Computer Vision VIII (1989)
  • R. Dave, Fuzzy shell-clustering and application to circle detection in digital images, International Journal of General Systems (1990)
  • Y. Man et al., Detection and separation of ring-shaped clusters using fuzzy clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence (1994)
  • R. Dave et al., Adaptive fuzzy c-shells clustering and detection of ellipses, IEEE Transactions on Neural Networks (1992)
  • F. Höppner, Fuzzy shell clustering algorithms in image processing: fuzzy c-rectangular and 2-rectangular shells, IEEE Transactions on Fuzzy Systems (1997)
  • M.C. Su et al., A modified version of the k-means algorithm with a distance based on cluster symmetry, IEEE Transactions on Pattern Analysis and Machine Intelligence (2001)
  • R. Dave, K. Patel, Progressive fuzzy clustering algorithms for characteristic shape recognition, in: Proceedings of...
  • R. Krishnapuram et al., Fuzzy and possibilistic shell clustering algorithms and their application to boundary detection and surface approximation—Part I and II, IEEE Transactions on Fuzzy Systems (1995)

About the Author—MU-CHUN SU received the B.S. degree in electronics engineering from National Chiao Tung University, Taiwan, in 1986, and the M.S. and Ph.D. degrees in electrical engineering from the University of Maryland, College Park, in 1990 and 1993, respectively. He was the IEEE Franklin V. Taylor Award recipient for the most outstanding paper co-authored with Dr. N. DeClaris and presented to the 1991 IEEE SMC Conference. He has authored more than 100 journal and refereed conference papers. He is currently a professor of computer science and information engineering at National Central University, Taiwan. He is a senior member of the IEEE Computational Intelligence Society and Systems, Man, and Cybernetics Society. His current research interests include neural networks, fuzzy systems, swarm intelligence, assistive technologies, affective computing, human–computer interfaces, robotics, pattern recognition, biomedical signal processing, and image processing.

About the Author—SHI-YONG SU received the M.S. degree in computer science and information engineering from National Central University, Taiwan, in 2007. Her research interests include neural networks, pattern recognition, and human–computer interfaces.

About the Author—YU-XIANG ZHAO received the B.S. and M.S. degrees in electrical engineering from Tamkang University, Taiwan, in 2000 and 2002, respectively, and the Ph.D. degree in computer science and information engineering from National Central University, Taiwan, in 2007. He is currently an assistant professor of computer science and information engineering at Ta Hwa Institute of Technology, Taiwan. His research interests include neural networks, pattern recognition, swarm intelligence, and image processing.
