ABSTRACT
The size and complexity of data repositories collaboratively created by Web users generate a need for new processing approaches. In this paper, we study the problem of detecting fine-grained communities of users in social networks, which can be defined as clustering with a large number of clusters. The practical size of social networks makes traditional evolutionary-based clustering approaches, which represent the entire clustering solution as one individual, hard to apply. We propose an Agglomerative Clustering Genetic Algorithm (ACGA): a population of clusters evolves from an initial state, in which each cluster represents one user, to a high-quality clustering solution. Each step of the evolutionary process is performed locally, engaging only a small part of the social network limited to two clusters and their direct neighborhood. This makes the algorithm practically useful independently of the size of the network. Evaluation on two social network models indicates that ACGA is potentially able to detect communities with accuracy comparable to or better than that of two typical centralized clustering algorithms, even though ACGA works under much stricter conditions.
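The local, agglomerative idea described above can be illustrated with a minimal sketch. This is not the authors' ACGA: the abstract does not specify the genetic operators or the fitness function, so the sketch assumes a simple stochastic greedy rule in which a randomly chosen cluster is merged with a randomly chosen neighboring cluster whenever the merge increases Newman-Girvan modularity. The function name `local_agglomerative_clustering` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def local_agglomerative_clustering(edges, steps=10000, seed=0):
    """Stochastic greedy agglomeration (illustrative stand-in, not ACGA).

    Starts with one cluster per node and repeatedly tries to merge a random
    cluster with one of its neighboring clusters. Each step reads only the
    two clusters involved plus the edges leaving the first one.
    """
    rng = random.Random(seed)

    # Undirected adjacency sets; self-loops and duplicate edges are ignored.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # number of edges

    cluster_of = {u: u for u in adj}          # node -> cluster id
    members = {u: {u} for u in adj}           # cluster id -> set of nodes
    degree = {u: len(adj[u]) for u in adj}    # cluster id -> total degree

    for _ in range(steps):
        if len(members) < 2:
            break
        # Sorting keeps runs reproducible; it assumes orderable node labels.
        a = rng.choice(sorted(members))

        # Count edges from cluster a to each neighboring cluster.
        links = defaultdict(int)
        for u in members[a]:
            for v in adj[u]:
                c = cluster_of[v]
                if c != a:
                    links[c] += 1
        if not links:
            continue  # cluster a has no neighboring cluster to merge with
        b = rng.choice(sorted(links))

        # Modularity change for merging a and b (assumed acceptance rule):
        # dQ = e_ab / m - d_a * d_b / (2 * m^2)
        gain = links[b] / m - degree[a] * degree[b] / (2 * m * m)
        if gain > 0:
            for u in members[b]:
                cluster_of[u] = a
            members[a] |= members.pop(b)
            degree[a] += degree.pop(b)

    return list(members.values())
```

Because the acceptance test needs only the edge count between the two clusters, their total degrees, and the global edge count, the cost of one step depends on the size of the chosen cluster's neighborhood rather than on the size of the whole network. A genetic variant would replace the random choice and the greedy acceptance with selection and recombination over the population of clusters.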
Index Terms
- Agglomerative genetic algorithm for clustering in social networks