Data clustering using bacterial foraging optimization

Journal of Intelligent Information Systems

Abstract

Clustering divides data into meaningful or useful groups (clusters) without any prior knowledge. It is a key technique in data mining and has become an important issue in many fields. This article presents a new clustering algorithm based on an analysis of the Bacterial Foraging (BF) mechanism. The algorithm treats clustering as an optimization problem in which a group of bacteria forages and converges to positions that serve as the final cluster centers by minimizing a fitness function. The quality of this approach is evaluated on several well-known benchmark data sets. Compared with the popular k-means algorithm, an ACO-based algorithm, and a PSO-based clustering technique, experimental results show that the proposed algorithm is an effective clustering technique that can handle data sets with various cluster sizes, densities, and multiple dimensions.
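The idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the fitness function or the update rules, so everything below is an assumption made for illustration: the fitness is taken to be the sum of squared Euclidean distances from each point to its nearest center, each bacterium encodes one full set of `k` centers, and only a simplified tumble-and-swim (chemotaxis) step is shown. The reproduction and elimination-dispersal phases usually found in bacterial foraging optimization are omitted, and the names `sse_fitness`, `chemotaxis_step`, and `bf_cluster` are hypothetical.

```python
import random


def sse_fitness(centers, data):
    """Assumed fitness: sum of squared distances from each point to its nearest center."""
    total = 0.0
    for x in data:
        total += min(sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers)
    return total


def random_unit_direction(dim, rng):
    """Draw a random direction of unit length (the 'tumble')."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(a * a for a in v) ** 0.5 or 1.0  # guard against a zero vector
    return [a / norm for a in v]


def chemotaxis_step(centers, data, step, swim_len, rng):
    """Tumble once, then swim in that direction while the fitness keeps improving."""
    dim = len(centers[0])
    best, best_fit = centers, sse_fitness(centers, data)
    directions = [random_unit_direction(dim, rng) for _ in centers]
    for _ in range(swim_len):
        cand = [[c + step * d for c, d in zip(ctr, drn)]
                for ctr, drn in zip(best, directions)]
        cand_fit = sse_fitness(cand, data)
        if cand_fit < best_fit:
            best, best_fit = cand, cand_fit  # keep swimming
        else:
            break  # no improvement: stop and tumble again on the next call
    return best, best_fit


def bf_cluster(data, k, n_bacteria=10, n_steps=200, step=0.1, swim_len=4, seed=0):
    """Simplified foraging loop: each bacterium is a candidate set of k centers."""
    rng = random.Random(seed)
    # Initialize every bacterium with k centers drawn from the data points.
    population = [[list(rng.choice(data)) for _ in range(k)]
                  for _ in range(n_bacteria)]
    for _ in range(n_steps):
        population = [chemotaxis_step(b, data, step, swim_len, rng)[0]
                      for b in population]
    # Return the bacterium with the lowest (best) fitness.
    return min(population, key=lambda b: sse_fitness(b, data))
```

Calling `bf_cluster(points, k=2)` on a list of equal-length numeric lists returns two candidate centers. Because a move is accepted only when it lowers the fitness, each bacterium's fitness never increases from one step to the next in this sketch.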

Acknowledgements

We would like to thank the editor and all the reviewers for their great support of our work. Our study is also supported by the National Basic Research Program of China (973 Program) (2007CB311203), the National Natural Science Foundation of China (Grant No. 60805043, 60821001), the Beijing Natural Science Foundation (Grant No. 4092029), the Huo Ying-Dong Education Foundation of China (Grant No. 121062), and the Foundation for the Author of National Excellent Doctoral Dissertation of PR China (FANEDD) (Grant No. 200951).

Author information

Correspondence to Miao Wan.

About this article

Cite this article

Wan, M., Li, L., Xiao, J. et al. Data clustering using bacterial foraging optimization. J Intell Inf Syst 38, 321–341 (2012). https://doi.org/10.1007/s10844-011-0158-3

  • DOI: https://doi.org/10.1007/s10844-011-0158-3
