DOI: 10.1145/1900008.1900027
research-article

Applying hybrid Kepso clustering to web pages

Published: 15 April 2010

ABSTRACT

Optimization methods are often combined with standard clustering algorithms to make the clustering process simpler and quicker. In this paper we propose a new hybrid clustering technique, K-Evolutionary Particle Swarm Optimization (KEPSO), which is based on the concept of Particle Swarm Optimization (PSO). The proposed algorithm uses the K-means algorithm as its first step and the Evolutionary Particle Swarm Optimization (EPSO) algorithm as its second step. Experiments were performed on clustering benchmark data, and the method was compared with the standard K-means and EPSO algorithms. The results show that the method produced compact clustering results and ran faster than the other clustering algorithms. The algorithm was then used to cluster web pages. The web pages were first cleaned of unnecessary data, and the resulting pages were then labeled to categorize them.
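The two-step idea in the abstract can be illustrated with a minimal sketch. It is not the authors' KEPSO implementation: the abstract does not give the EPSO update rules, so the second step below uses a plain textbook PSO in their place. The function names, the quantization-error fitness, and the parameter values (`w`, `c1`, `c2`, the jitter width) are all assumptions chosen for illustration.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantization_error(points, centroids):
    """Mean squared distance from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points) / len(points)

def kmeans(points, k, iters, rng):
    """Step 1: plain K-means, returning k centroids."""
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            groups[nearest].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster goes empty
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def pso_refine(points, seed_centroids, n_particles, iters, rng,
               w=0.7, c1=1.5, c2=1.5):
    """Step 2 (assumed stand-in for EPSO): standard PSO seeded by K-means.

    Each particle is a flat vector holding all k centroids. Particle 0 is
    the K-means result itself; the others are jittered copies of it.
    """
    k, dim = len(seed_centroids), len(seed_centroids[0])
    flat = [x for c in seed_centroids for x in c]

    def unflatten(v):
        return [v[i * dim:(i + 1) * dim] for i in range(k)]

    def fitness(v):
        return quantization_error(points, unflatten(v))

    swarm = [flat[:]] + [[x + rng.gauss(0, 0.5) for x in flat]
                         for _ in range(n_particles - 1)]
    vel = [[0.0] * len(flat) for _ in swarm]
    pbest = [p[:] for p in swarm]
    pbest_fit = [fitness(p) for p in swarm]
    g = min(range(len(swarm)), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(iters):
        for i, pos in enumerate(swarm):
            for d in range(len(pos)):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[d])
                             + c2 * r2 * (gbest[d] - pos[d]))
                pos[d] += vel[i][d]
            f = fitness(pos)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[:], f
    return unflatten(gbest), gbest_fit

if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic 2-D data around three made-up centers (illustration only).
    data = [[rng.gauss(cx, 0.4), rng.gauss(cy, 0.4)]
            for cx, cy in [(0, 0), (4, 4), (0, 4)] for _ in range(30)]
    seed = kmeans(data, k=3, iters=10, rng=rng)
    refined, err = pso_refine(data, seed, n_particles=10, iters=30, rng=rng)
    print("K-means error:", quantization_error(data, seed))
    print("Refined error:", err)
```

Because the K-means centroids are placed in the swarm as one of the initial particles, the refined error in this sketch can never be worse than the K-means error it started from. Whether the published KEPSO seeds its swarm this way is not stated in the abstract.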


Published in

ACM SE '10: Proceedings of the 48th Annual Southeast Regional Conference
April 2010
488 pages
ISBN: 9781450300643
DOI: 10.1145/1900008

Copyright © 2010 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

ACM SE '10 paper acceptance rate: 48 of 94 submissions, 51%
Overall acceptance rate: 134 of 240 submissions, 56%