Elsevier

Applied Soft Computing

Volume 13, Issue 4, April 2013, Pages 1853-1868

An enriched game-theoretic framework for multi-objective clustering

https://doi.org/10.1016/j.asoc.2012.12.001

Abstract

Multi-objective clustering is a useful technique for contemporary problems ranging from decision making to machine learning and pattern recognition. It aims to place similar objects into the same groups according to several conflicting objectives, which makes game theory a natural tool for reaching a resolution. On this basis, this paper proposes Enriched Game Theory K-means (EGTKMeans), a novel multi-objective clustering technique based on game theory. EGTKMeans is designed to optimize two intrinsically conflicting objectives, namely compaction and equi-partitioning. The key contributions of the proposed approach are threefold. First, it formulates a novel payoff definition that considers both objectives with equal priority; this payoff function brings a desirable fairness to the final clustering results. Second, EGTKMeans exploits mixed strategies as well as pure ones, relying on the existence of a mixed Nash Equilibrium in every finite game. Third, EGTKMeans approaches the optimal solution by optimizing both objectives simultaneously. The experimental results suggest that the proposed approach significantly outperforms rival methods on real-world and synthetic data sets with reasonable time complexity.

Highlights

► Multi-objective clustering methods partition similar objects into the same groups based on some conflicting objectives.
► Game theory is a proper mathematical tool to support clustering.
► We present Enriched Game Theory K-means, called EGTKMeans, as a novel multi-objective clustering technique based on the notion of game theory.
► EGTKMeans optimizes two intrinsically conflicting objectives, compaction and equi-partitioning, simultaneously.

Introduction

We live in a world with large amounts of data that need to be analyzed or managed. One of the crucial tasks in dealing with these data is to classify or categorize them into a set of groups or clusters [1]. A clustering method categorizes data into subgroups in such a manner that there is (1) high intra-cluster similarity and (2) low inter-cluster similarity [2].

Data clustering is now a well-established field that is growing rapidly in many domains, such as pattern analysis and grouping, and decision making [3]. Over the years, many clustering methods have been proposed to satisfy the optimization requirements of these applications. In many such problems, however, more than one objective needs to be optimized in the context of the application. Clustering can therefore be considered a multi-objective optimization problem rather than a single-objective one. Multi-objective clustering methods attempt to identify clusters in such a manner that several objectives are optimized during the procedure [3]. In recent years, several new practical applications have emerged that need object clustering at various levels with multiple criteria, which may be conflicting in nature. The practical disciplines are as diverse as urban search and rescue (USAR), ad hoc networks, sensor networks, facility location and multi-core architecture.

To illustrate, consider a wildfire detection system, which is one of the applications of wireless sensor networks. Wildfires, also known as forest fires, are uncontrolled fires that occur in wild areas and cause significant damage to natural and human resources [4]. A system that detects fires can therefore prevent intolerable damage to public safety and natural resources. Such a system can be provided by a wireless sensor network (WSN), in which the nodes sense various phenomena, including temperature, relative humidity and smoke, all of which are helpful for fire detection [4]. Large-scale wireless sensor networks can be easily deployed using airplanes, at a cost that is low compared with the destruction and property loss caused by forest fires [5]. Although we do not possess any information regarding the exact location of the sensors, they communicate with each other over the wireless network and exchange information, and the nodes transfer data to the base station. The key point is that a WSN has several limitations, such as processing capability, wireless bandwidth, battery power and storage space [5]. To resolve some of these issues, the nodes must be partitioned into several groups.

To make this concrete, consider the two-dimensional GTD data set [6], which can be mapped to the location coordinates of 59 sensors. Here, we need a clustering algorithm that groups the nodes by location in order to reduce the high communication overhead. Since K-means [7] is one of the most popular clustering algorithms and determines compact clusters, it is an obvious choice for this problem. Fig. 1a displays the final groups determined by K-means. As shown there, five clusters are obtained, whose centers can serve as the positions of the headers that transfer data out of the cluster. Although the clusters are compact, they are not balanced. As a result, the nodes in the clusters suffer from non-uniform power distribution. Clusters with more nodes have much more communication among the nodes and with the header, which causes tremendous power consumption. Similarly, in clusters with a smaller number of nodes, there are not enough nodes to replace a header that has run out of power. Hence, another objective should be optimized in this clustering technique so that the system does not lose groups to rapid power depletion.

This calls for a new clustering method that optimizes two important objectives: (1) compaction, to minimize the power dissipated by communication inside a team, and (2) equi-partitioning, to form teams with uniform power distribution [8]. These objectives compete with each other and need to be optimized simultaneously. A single-objective clustering method such as K-means focuses on compaction and may identify clusters that are not equi-partitioned, so a technique for the simultaneous optimization of conflicting objectives is needed. As can be seen from Fig. 1b, a multi-objective method that addresses both objectives suggests much better clusters: the final clusters are compact and contain almost equal numbers of data points. The clustering algorithm must meet one further requirement, however. Since no information is available about the locations of the sensors, and centralized control is impractical for large-scale sensor networks, the clustering algorithm should be completely distributed [9].
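The tension between the two objectives can be sketched on a toy example. The code below is only an illustration under stated assumptions: compaction is measured by the usual sum of squared errors (SSE), while `size_imbalance` is a hypothetical balance measure (squared deviation of cluster sizes from n/k) chosen for this sketch and not necessarily the measure used in the paper. The six points and the two labelings are made up.

```python
from collections import Counter

def sse(points, labels, k):
    """Compaction: sum of squared distances from each point to its cluster mean."""
    dim = len(points[0])
    total = 0.0
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        if not members:
            continue
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum(sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in members)
    return total

def size_imbalance(labels, k):
    """Hypothetical balance measure (an assumption of this sketch):
    squared deviation of each cluster size from the ideal size n/k."""
    counts = Counter(labels)
    ideal = len(labels) / k
    return sum((counts.get(c, 0) - ideal) ** 2 for c in range(k))

# Toy data: four nearby points and two distant ones (illustrative only).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
compact = [0, 0, 0, 0, 1, 1]    # follows the natural groups, sizes 4 and 2
balanced = [0, 0, 0, 1, 1, 1]   # equal sizes, but one point is pulled far away

for name, labels in (("compact", compact), ("balanced", balanced)):
    print(name, round(sse(points, labels, 2), 3), size_imbalance(labels, 2))
```

On this data the compact labeling has the lower SSE but the higher imbalance, and the balanced labeling shows the reverse, which is the kind of conflict a multi-objective method has to resolve.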

In addition to sensor networks, other fields need a multi-objective clustering algorithm based on compaction and equi-partitioning. Ad hoc networks and other sensor networks must be partitioned because of high communication overhead and power consumption constraints. Moreover, a type of facility location problem, the so-called Load Balanced Facility Location problem, needs optimization over these two primary objectives in order to minimize customers' access distance and form groups with uniform numbers of customers. Gupta and Ranganathan [19] recently developed a novel approach to the latter problem. Their algorithm comprises three components: (1) an initial step that includes an iterative hill-climbing-based partitioning, (2) a multistep normal form game formulation that identifies the initial clusters as players and resources on the basis of certain properties, and (3) a Nash Equilibrium method to evaluate optimal clusters. This method, called GTKMeans, achieves significant results. However, GTKMeans does not achieve full fairness in optimizing both objectives. Moreover, it is developed for pure-strategy games only. We model a game mechanism that optimizes the two naturally conflicting objectives simultaneously. We also present a novel payoff function with superior performance on both objectives, including high performance and fairness within clusters.
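The difference between pure and mixed strategies can be illustrated on generic two-player normal-form games. The sketch below is not the clustering game of GTKMeans or EGTKMeans; the function names and the payoff matrices (matching pennies and a coordination game) are assumptions chosen for illustration. It checks every cell for a pure Nash Equilibrium and, for 2x2 games, solves the indifference conditions for a fully mixed one.

```python
def pure_nash_equilibria(payoff_a, payoff_b):
    """Return all (row, col) cells where neither player gains by deviating alone."""
    rows, cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            row_best = all(payoff_a[i][j] >= payoff_a[r][j] for r in range(rows))
            col_best = all(payoff_b[i][j] >= payoff_b[i][c] for c in range(cols))
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria

def mixed_equilibrium_2x2(payoff_a, payoff_b):
    """Fully mixed equilibrium of a 2x2 game from the indifference conditions.

    Returns (p, q), where p is the probability that the row player plays row 0
    and q is the probability that the column player plays column 0, or None if
    no strictly interior solution exists.
    """
    # p makes the column player indifferent between the two columns.
    denom_p = payoff_b[0][0] - payoff_b[0][1] - payoff_b[1][0] + payoff_b[1][1]
    # q makes the row player indifferent between the two rows.
    denom_q = payoff_a[0][0] - payoff_a[0][1] - payoff_a[1][0] + payoff_a[1][1]
    if denom_p == 0 or denom_q == 0:
        return None
    p = (payoff_b[1][1] - payoff_b[1][0]) / denom_p
    q = (payoff_a[1][1] - payoff_a[0][1]) / denom_q
    if 0 < p < 1 and 0 < q < 1:
        return p, q
    return None

# Matching pennies: no pure equilibrium, but a mixed one exists.
pennies_a = [[1, -1], [-1, 1]]
pennies_b = [[-1, 1], [1, -1]]
print(pure_nash_equilibria(pennies_a, pennies_b))
print(mixed_equilibrium_2x2(pennies_a, pennies_b))

# A coordination game with two pure equilibria.
coord_a = [[2, 0], [0, 1]]
coord_b = [[1, 0], [0, 2]]
print(pure_nash_equilibria(coord_a, coord_b))
```

Matching pennies has no pure equilibrium, so a method restricted to pure strategies would find nothing there, while the mixed equilibrium still exists. This is the general motivation for allowing mixed strategies.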

In the next section, we briefly review existing clustering techniques and various application domains of game theory. Section 3 covers game-theoretic clustering, the approaches proposed so far, and the most important applications of the field. In Sections 4 and 5, we describe in detail our proposed algorithm, a microeconomic clustering methodology for the simultaneous optimization of compaction and equi-partitioning. Section 6 presents the experimental results for the performance of the algorithm on various real and artificial data sets. The proposed approach is also analyzed on the basis of two fairness metrics as well as time and game complexity. Finally, Section 7 concludes the paper.

Section snippets

Clustering

Data clustering is a challenging task, whose difficulty stems from the lack of a unique and exact definition of a cluster. The clustering problem (Ω, P) is formally defined as the optimization problem shown in Eq. (1), where Ω is the set of feasible clusterings, C is a hard partitioning of a given set of input patterns X = {x1, x2, …, xn}, and P: Ω → ℝ is the criterion function, which is based on the similarity or dissimilarity between the data objects of X [2], [10].

$$P(C^{*}) = \min_{C \in \Omega} P(C) \tag{1}$$
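Eq. (1) can be made concrete by exhaustive search on a tiny instance. In the sketch below, which is an illustration and not the authors' method, the feasible set Ω is assumed to be all hard partitions of the data into exactly k non-empty clusters, and the criterion P is assumed to be the within-cluster sum of squared errors on one-dimensional data. The five data values are made up.

```python
from itertools import product

def criterion(points, labels, k):
    """P(C): within-cluster sum of squared distances to the cluster mean (1-D data)."""
    total = 0.0
    for c in range(k):
        members = [x for x, l in zip(points, labels) if l == c]
        if members:
            mean = sum(members) / len(members)
            total += sum((x - mean) ** 2 for x in members)
    return total

def best_partition(points, k):
    """Enumerate every labeling with exactly k non-empty clusters and
    return the one that minimizes the criterion, together with its value."""
    best_labels, best_value = None, float("inf")
    for labels in product(range(k), repeat=len(points)):
        if len(set(labels)) < k:
            continue  # skip labelings that leave a cluster empty
        value = criterion(points, labels, k)
        if value < best_value:
            best_labels, best_value = labels, value
    return best_labels, best_value

points = [1.0, 2.0, 3.0, 10.0, 11.0]
labels, value = best_partition(points, 2)
print(labels, value)
```

The search visits k^n labelings, so it is feasible only for very small inputs. That cost is why practical methods such as K-means use iterative heuristics instead.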

Game theory

Game theory is a

Related works

So far, numerous clustering algorithms have been reported in the literature. Several surveys have provided comprehensive reviews of clustering techniques in assorted domains [1]. Although the categorization of clustering methods is neither straightforward nor canonical, they are broadly divided into hierarchical and partitioning methods [3], [21]. Each technique tries to optimize a certain objective function based on a particular objective. Traditionally, the important clustering

The proposed algorithm

This section outlines the Enriched Game-Theoretic clustering algorithm using K-means, EGTKMeans for short. The presented approach is an extension of GTKMeans [19]. EGTKMeans differs from GTKMeans in that it introduces a payoff function designed to deal with existing challenges in multi-objective clustering based on game theory. In the presented approach, both objectives receive equal priority in the optimization. Therefore we achieve a much better performance over

Ensemble-based game theoretic clustering

The proposed algorithm consists of multiple game iterations, and in each iteration a multi-step game is performed. As explained in the previous section, the strategy set of each player grows with the number of players in each local game, and the number of games depends heavily on the number of players and resources. In fact, it increases with the number of clusters as well as the number of data objects. Therefore, like GTKMeans, the proposed methodology is ideally suited for

The experimental results

In this section, to explore the ability of EGTKMeans, in comparison with rival approaches, to optimize the objectives of concern in this study, namely L and SSE, a series of experiments was conducted on real-world and artificial data sets. We then report the results in comparison with K-means and GTKMeans to obtain a thorough understanding of what happens during the algorithm. The performance of the proposed algorithm has also been evaluated in terms of both game complexity and time

Conclusion

In this paper, we present a new payoff function for a multi-objective clustering algorithm based on a game-theoretic framework. The method optimizes two important metrics, compaction and equi-partitioning, via a combined algorithm that comprises a single iteration of K-means and a multiplayer normal form game with Nash Equilibrium. We adopt a non-cooperative game in the case of a conflicted resource, so that the clusters, as players, define a set of strategies for

Acknowledgment

The authors would like to acknowledge the support from Iranian Telecommunication Research Center (ITRC) under Grant No. T/500/13266.

References (44)

  • R. Xu et al.

    Survey of clustering algorithms

    IEEE Transactions on Neural Networks

    (2005)
  • A.K. Jain et al.

    Data clustering: a review

    ACM Computing Surveys (CSUR)

    (1999)
  • M.H.C. Law et al.

    Multiobjective data clustering

  • M. Hefeeda et al.

    Forest fire modeling and early detection using wireless sensor networks

  • I.F. Akyildiz et al.

    Wireless sensor networks: a survey

    Computer Networks

    (2002)
  • H. Spath

    Cluster Analysis Algorithms for Data Reduction and Classification of Objects

    (1980)
  • J. MacQueen

    Some methods for classification and analysis of multivariate observations

  • A. Zarnani et al.

    Spatial data mining for optimized selection of facility locations in field-based services

  • M. Liu et al.

    An energy-aware routing protocol in wireless sensor networks

    International Journal of Sensors

    (2009)
  • J. Handl et al.

    An evolutionary approach to multiobjective clustering

    IEEE Transactions on Evolutionary Computation

    (2007)
  • M.J. Osborne et al.

    A Course in Game Theory

    (1998)
  • N. Nisan et al.

    Algorithmic Game Theory

    (2007)
  • Y. Shoham et al.

    Multiagent Systems, Algorithmic, Game-Theoretic and Logical Foundations

    (2009)
  • S. Rota Bulò et al.

    A Game-Theoretic Approach to Hypergraph Clustering

    (2009)
  • J. Nash

    Equilibrium points in n-person games

    Proceedings of the National Academy of Sciences of the United States of America

    (1950)
  • J. Widger et al.

    Parallel computation of Nash equilibria in n-player games

  • R. McKelvey et al.

    Gambit: Software Tools for Game Theory. The Gambit Project

  • B. Blum, D. Koller, C. Shelton. Game Theory: Game-Tracer....
  • U. Gupta et al.

    A game theoretic approach for simultaneous compaction and equipartitioning of spatial data sets

    IEEE Transactions on Knowledge and Data Engineering

    (2010)
  • A. Vetta

    Nash equilibria in competitive societies, with applications to facility location, traffic routing and auctions

  • P. Berkhin, Survey of clustering data mining techniques, Technical report, Accrue Software 10, 2002, pp....
  • H. Edelsbrunner et al.

    Efficient algorithms for agglomerative hierarchical clustering methods

    Journal of Classification

    (1984)
  • Cited by (11)

    • Clustering ensembles: A hedonic game theoretical approach

      2018, Pattern Recognition
      Citation Excerpt :

      By achieving the Nash equilibrium [9] of the game, the final data partition is obtained. Badami et al. [12] extended the previous work by also considering mixed strategies to be available to the players, which might lead to better equilibria and thus clustering solutions. On the other hand, Pelillo et al. [13] formulated the clustering problem in terms of a non-cooperative clustering game and showed that a natural interpretation of a cluster turns out to be equivalent to an evolutionary game-theoretic equilibrium concept.

    • A multi-act sequential game-based multi-objective clustering approach for categorical data

      2017, Neurocomputing
      Citation Excerpt :

      However, pure Nash equilibrium does not always exist which limits the performance of this algorithm. Based on this method, Badami et al. [25] have proposed a novel formulation of the payoffs function which models both compaction and equipartitioning objectives in equal priority. Their approach can be applied to mixed strategies as well as to pure ones.

    • Automatic multi-objective clustering based on game theory

      2017, Expert Systems with Applications
      Citation Excerpt :

      However, pure Nash equilibrium does not always exist which limits the performance of this algorithm. Based on this method, Badami, Hamzeh, and Hashemi (2013) have proposed a novel formulation of the payoffs function which models both compaction and equipartitioning objectives in an equal priority. Their approach can be applied to mixed strategies as well as to pure ones.
