Abstract
Clustering is an essential data mining tool for analyzing big data. This article first surveys the clustering methods found in the literature. Building on that survey, it proposes BSO-CLARA, a new algorithm for clustering large data sets, based on bee behavior and k-medoids partitioning. The new method is discussed and compared with earlier techniques in terms of effectiveness, efficiency, scalability, and handling of noise and outliers. Experimental results show that BSO-CLARA is more effective and more efficient than the well-known partitioning algorithms PAM, CLARA and CLARANS, and also than CLAM, a more recent algorithm from the literature.
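For readers unfamiliar with the baselines named above, the following is a minimal sketch of sampling-based k-medoids in the spirit of CLARA: medoids are searched on small random samples and then scored against the full data set. It is not the BSO-CLARA algorithm, whose bee swarm search is not described in this abstract. The function names, the Euclidean distance, the greedy swap search and the default parameters are all illustrative assumptions.

```python
import math
import random

def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)

def swap_search(sample, k, rng):
    """Greedy PAM-like swap search on a sample (illustrative, not exact PAM)."""
    medoids = rng.sample(sample, k)
    best = total_cost(sample, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in sample:
                if candidate in medoids:
                    continue
                # Try replacing the i-th medoid with the candidate point.
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                cost = total_cost(sample, trial)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    return medoids

def clara_sketch(points, k, n_samples=5, sample_size=40, seed=0):
    """Keep the medoid set that scores best on the full data set."""
    rng = random.Random(seed)
    best_medoids, best_cost = None, math.inf
    for _ in range(n_samples):
        sample = rng.sample(points, min(sample_size, len(points)))
        medoids = swap_search(sample, k, rng)
        cost = total_cost(points, medoids)  # evaluated on all points
        if cost < best_cost:
            best_medoids, best_cost = medoids, cost
    return best_medoids, best_cost
```

Calling `clara_sketch(points, k=2)` on a list of coordinate tuples returns the chosen medoids and their total cost. Because the search only ever touches a sample, the cost per iteration stays bounded as the data set grows, which is the scalability argument behind this family of methods.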
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Aboubi, Y., Drias, H., Kamel, N. (2015). BSO-CLARA: Bees Swarm Optimization for Clustering LARge Applications. In: Prasath, R., Vuppala, A., Kathirvalavakumar, T. (eds) Mining Intelligence and Knowledge Exploration. MIKE 2015. Lecture Notes in Computer Science, vol 9468. Springer, Cham. https://doi.org/10.1007/978-3-319-26832-3_17
DOI: https://doi.org/10.1007/978-3-319-26832-3_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26831-6
Online ISBN: 978-3-319-26832-3
eBook Packages: Computer Science (R0)