Swarm-based clustering algorithm for efficient web blog and data classification

The Journal of Supercomputing

Abstract

Weblogs have become one of the most common ways for people to express themselves, which motivates weblog classification alongside general data classification. Data classification is the problem of partitioning a feature set into several feature subsets, which are then grouped into classes under a binary or multiclass scheme. Many problems in science and technology, industry and commercial business, and medicine and health care can be treated as classification problems. In recent years, many methods have been developed to build classification models based on statistical concepts and optimization methods. One major issue with statistical models is that they provide good accuracy only when their underlying assumptions hold. A classification decision based on accuracy alone justifies only the performance of that particular model. Before a model is applied to a particular application, a good understanding of the data being used is required. This article is motivated by the need for an effective learning algorithm that reduces the complexity of handling such data, minimizes output errors, and improves the efficiency of the model. In this work, a novel algorithm named ‘swarm-based cluster algorithm’ is proposed to perform feature selection and produce optimized feature-based clusters for effective data and weblog classification.
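The abstract does not spell out how the proposed algorithm works, so the following is only an illustrative sketch of swarm-based feature selection in general, not the authors' method. It uses a standard binary particle swarm optimization (PSO) in which each particle is a 0/1 mask over the features and the fitness is the accuracy of a simple nearest-centroid classifier minus a small penalty per selected feature. The toy dataset, the function names (`make_data`, `centroid_accuracy`, `fitness`, `binary_pso`), and all parameter values are assumptions made for this example.

```python
import math
import random

def make_data(rng, n=60):
    """Toy data (assumed): features 0-1 carry class signal, features 2-5 are noise."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        signal = [rng.gauss(2.0 * y, 0.5), rng.gauss(-2.0 * y, 0.5)]
        noise = [rng.gauss(0.0, 3.0) for _ in range(4)]
        rows.append(signal + noise)
        labels.append(y)
    return rows, labels

def centroid_accuracy(rows, labels, mask):
    """Training accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for c in set(labels):
        members = [r for r, y in zip(rows, labels) if y == c]
        centroids[c] = [sum(r[j] for r in members) / len(members) for j in cols]

    def sq_dist(r, c):
        return sum((r[j] - m) ** 2 for j, m in zip(cols, centroids[c]))

    correct = 0
    for r, y in zip(rows, labels):
        pred = min(centroids, key=lambda c: sq_dist(r, c))
        correct += pred == y
    return correct / len(rows)

def fitness(rows, labels, mask, penalty=0.01):
    """Accuracy minus a small cost per selected feature (favors compact subsets)."""
    return centroid_accuracy(rows, labels, mask) - penalty * sum(mask)

def binary_pso(rows, labels, n_particles=12, n_iter=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic binary PSO over feature masks; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    d = len(rows[0])
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(rows, labels, p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(d):
                # Velocity update: inertia + pull toward personal and global bests.
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-6.0, min(6.0, v))
                # Sigmoid maps velocity to the probability that bit j is set.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(rows, labels, pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

if __name__ == "__main__":
    rows, labels = make_data(random.Random(1))
    mask, fit = binary_pso(rows, labels)
    print("selected feature mask:", mask, "fitness:", round(fit, 3))
```

The per-feature penalty is one simple way to express the trade-off between accuracy and subset size; a real study would also evaluate fitness on held-out data rather than on the training set, which this sketch omits for brevity.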



Author information

Corresponding author

Correspondence to E. A. Neeba.

Cite this article

Neeba, E.A., Koteeswaran, S. & Malarvizhi, N. Swarm-based clustering algorithm for efficient web blog and data classification. J Supercomput 76, 3949–3962 (2020). https://doi.org/10.1007/s11227-017-2162-z

  • DOI: https://doi.org/10.1007/s11227-017-2162-z
