Abstract
This paper presents a high-performance method for reducing the time complexity of particle swarm optimization (PSO) and its variants in solving the partitional clustering problem. The proposed method adds two operators to PSO-based algorithms. The pattern reduction operator aims to reduce the computation time by compressing, at each iteration, patterns that are unlikely to change the clusters to which they belong thereafter. The multistart operator aims to improve the quality of the clustering result by enforcing the diversity of the population, which prevents the proposed method from getting stuck in local optima. To evaluate the performance of the proposed method, we compare it with several state-of-the-art PSO-based methods in solving data clustering, image clustering, and codebook generation problems. Our simulation results indicate that the proposed method not only significantly reduces the computation time of PSO-based algorithms but also provides a clustering result that matches or outperforms the result that PSO-based algorithms provide by themselves.
Notes
That is, \(X=\cup _{i=1}^k\pi _i\) and \(\forall i\ne j, \pi _i\cap \pi _j=\emptyset \).
In other words, the modified acceleration coefficients \(a_1\) and \(a_2\) begin with the value \(\dot{a}_{\cdot }\), increase or decrease linearly as the number of iterations grows, at a rate proportional to the difference between \(\ddot{a}_{\cdot }\) and \(\dot{a}_{\cdot }\), and end with the value \(\ddot{a}_{\cdot }\).
The approach is fast because no sorting is required.
Since no confusion is possible, throughout the rest of the paper we use MPREPSO and \(\text {MPR}_2\) interchangeably to denote the proposed algorithm that uses both detection methods and, from time to time, the multistart operator.
These datasets are available for download at http://archive.ics.uci.edu/ml/datasets.html.
These datasets are available for download at http://www.inf.uni-konstanz.de/cgip/lehre/dip_w0910/demos.html.
The number of clusters is set equal to 8.
These datasets are available for download at http://photojournal.jpl.nasa.gov/catalog/PIA14873 and http://photojournal.jpl.nasa.gov/catalog/PIA14872.
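The linear schedule described in the note on the acceleration coefficients \(a_1\) and \(a_2\) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name, the convention that iterations run from 0 to a fixed maximum, and the numeric start and end values are all assumptions made for the example.

```python
def acceleration_coefficient(start, end, iteration, max_iterations):
    """Linearly interpolate a coefficient from `start` to `end`.

    `start` plays the role of the initial value and `end` the final value
    mentioned in the note; the parameter names are hypothetical.
    """
    if max_iterations <= 0:
        raise ValueError("max_iterations must be positive")
    # Fraction of the run completed, clamped to the interval [0, 1].
    fraction = min(max(iteration / max_iterations, 0.0), 1.0)
    return start + (end - start) * fraction


# Hypothetical settings: a_1 decreases from 2.5 to 0.5 while a_2
# increases from 0.5 to 2.5 over 100 iterations.
schedule = [
    (
        acceleration_coefficient(2.5, 0.5, t, 100),
        acceleration_coefficient(0.5, 2.5, t, 100),
    )
    for t in (0, 50, 100)
]
```

Each coefficient starts at its initial value, changes by a constant amount per iteration, and reaches its final value at the last iteration, so the per-iteration change is proportional to the difference between the two values.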
Acknowledgments
The authors would like to thank the editors and anonymous reviewers for their valuable comments and suggestions, which greatly improved the quality of the paper. The authors would also like to thank Mr. Jui-Le Chen for implementing standard PSO, which made the comparisons given in the paper more complete. This work was supported in part by the National Science Council of Taiwan, R.O.C., under Contracts NSC102-2221-E-041-006, NSC102-2221-E-110-054, and NSC102-2219-E-006-001.
Additional information
Communicated by W. Pedrycz.
Cite this article
Tsai, CW., Huang, KW., Yang, CS. et al. A fast particle swarm optimization for clustering. Soft Comput 19, 321–338 (2015). https://doi.org/10.1007/s00500-014-1255-3
DOI: https://doi.org/10.1007/s00500-014-1255-3