Abstract
Processes that simulate natural phenomena have been applied successfully to a number of problems for which no simple mathematical solution is known or practicable. Such meta-heuristic algorithms, which include genetic algorithms, particle swarm optimization and ant colony systems, have received increasing attention in recent years.
This paper extends ant colony systems and discusses a novel data clustering process using Constrained Ant Colony Optimization (CACO). The CACO algorithm extends the Ant Colony Optimization algorithm by accommodating a quadratic distance metric, the Sum of K Nearest Neighbor Distances (SKNND) metric, constrained addition of pheromone and a shrinking range strategy to improve data clustering. We show that the CACO algorithm can resolve the problems of clusters with arbitrary shapes, clusters with outliers and bridges between clusters.
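The abstract names the quadratic distance metric and the SKNND metric without defining them. As a rough illustration only, the sketch below assumes the quadratic metric has the form sqrt((x - y)^T W (x - y)) for a symmetric positive-definite weight matrix W, and that the SKNND of a point is the sum of the distances to its K nearest other points. The function names `quadratic_distance` and `sknnd`, the weight matrix and the sample data are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence

Point = Sequence[float]
Matrix = Sequence[Sequence[float]]


def quadratic_distance(x: Point, y: Point, w: Matrix) -> float:
    """Assumed form of the metric: sqrt((x - y)^T W (x - y))."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    q = sum(d[i] * w[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(q)


def sknnd(points: List[Point], k: int, w: Matrix) -> List[float]:
    """For each point, sum the distances to its k nearest other points."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(
            quadratic_distance(p, q, w)
            for j, q in enumerate(points)
            if j != i
        )
        scores.append(sum(dists[:k]))
    return scores


# Hypothetical data: three nearby points and one distant point.
identity = [[1.0, 0.0], [0.0, 1.0]]  # W = I reduces to Euclidean distance
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
scores = sknnd(data, k=2, w=identity)
```

Under these assumptions, the distant point receives by far the largest score, which suggests how such a measure could help separate outliers from dense regions. How CACO actually uses the value is described in the paper itself, not here.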
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Chu, SC., Roddick, J.F., Su, CJ., Pan, JS. (2004). Constrained Ant Colony Optimization for Data Clustering. In: Zhang, C., Guesgen, H.W., Yeap, WK. (eds) PRICAI 2004: Trends in Artificial Intelligence. PRICAI 2004. Lecture Notes in Computer Science, vol 3157. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28633-2_57
DOI: https://doi.org/10.1007/978-3-540-28633-2_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22817-2
Online ISBN: 978-3-540-28633-2
eBook Packages: Springer Book Archive