Evaluation of Frequent Pattern Growth Based Fuzzy Particle Swarm Optimization Approach for Web Document Clustering

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10404)

Abstract

This paper studies and analyzes the soft and hard clustering efficiency of a novel frequent pattern growth based fuzzy particle swarm optimization approach for clustering web documents. The conventional approaches, K-means and fuzzy c-means (FCM), suffer from sensitivity to random initialization and from getting trapped in local minima. To overcome these drawbacks, bio-inspired mechanisms such as genetic algorithms, ant colony optimization and particle swarm optimization (PSO) are used to optimize K-means and FCM clustering. The major contributions of the novel method are threefold. First, it automatically finds effective cluster numbers, cluster centroids and swarms for the bio-inspired fuzzy particle swarm optimization. Second, it yields fuzzy overlapping clusters using the FCM objective function, overcoming the drawbacks of the existing methods. Third, the methodology discussed in this paper prunes irrelevant elements from the search space and thereby retains all relationships with the search query as semantic, conditionally relatable sets. The evaluation results show that the proposed approach performs better on the Adjusted Rand Index (ARI), Normalized Mutual Information (NMI) and Adjusted Concordance Index (ACI) than various distance-based similarity measures and FCMPSO.
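The abstract refers to the FCM objective function without stating it. As a rough illustration only, the sketch below computes the standard fuzzy c-means memberships and objective, J_m = sum_i sum_j u_ij^m * ||x_i - c_j||^2, for a fixed set of centroids. The function names, the toy data and the fuzzifier value m = 2 are assumptions made for this example; the abstract does not describe the authors' implementation, and a PSO-based method would presumably evaluate a quantity like this as the fitness of each candidate set of centroids.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fcm_memberships(points, centroids, m=2.0):
    """Standard FCM membership update (hypothetical helper, not from the paper).

    Returns one row per point; each row holds the degree of membership
    in every cluster and sums to 1.
    """
    exponent = 1.0 / (m - 1.0)
    memberships = []
    for p in points:
        d2 = [squared_distance(p, c) for c in centroids]
        if any(d == 0.0 for d in d2):
            # The point coincides with one or more centroids:
            # share full membership among those centroids.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        inv = [(1.0 / d) ** exponent for d in d2]
        total = sum(inv)
        memberships.append([v / total for v in inv])
    return memberships


def fcm_objective(points, centroids, memberships, m=2.0):
    """FCM objective: sum of u_ij^m times squared distance to centroid j."""
    return sum(
        (u ** m) * squared_distance(p, c)
        for p, row in zip(points, memberships)
        for u, c in zip(row, centroids)
    )


# Toy data (assumed for illustration): three 2-D points, two centroids.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centroids = [(0.0, 0.0), (4.0, 0.0)]
U = fcm_memberships(points, centroids)
J = fcm_objective(points, centroids, U)
```

In this toy case the middle point receives memberships of about 0.9 and 0.1 in the two clusters, which is the kind of overlapping (soft) assignment the abstract contrasts with hard clustering.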

Author information

Corresponding author

Correspondence to Raja Varma Pamba.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Pamba, R.V., Sherly, E., Mohan, K. (2017). Evaluation of Frequent Pattern Growth Based Fuzzy Particle Swarm Optimization Approach for Web Document Clustering. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2017. ICCSA 2017. Lecture Notes in Computer Science, vol 10404. Springer, Cham. https://doi.org/10.1007/978-3-319-62392-4_27

  • DOI: https://doi.org/10.1007/978-3-319-62392-4_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-62391-7

  • Online ISBN: 978-3-319-62392-4

  • eBook Packages: Computer Science, Computer Science (R0)
