Chaotic gradient artificial bee colony for text clustering

  • Methodologies and Application
  • Soft Computing

Abstract

Text clustering is widely used to group digital documents. The selection of cluster centers plays an important role in clustering. In this paper, we use the artificial bee colony (ABC) algorithm to select appropriate cluster centers for clustering text documents. ABC is a population-based, nature-inspired algorithm that simulates the intelligent foraging behavior of real honey bees and has been shown to be effective in solving many search and optimization problems. However, a major drawback of the algorithm is that it explores the search space well at the cost of exploitation. We improve the search equation of ABC and embed two local search paradigms, chaotic local search and gradient search, in the basic ABC to improve its exploitation capability. The proposed algorithm is named chaotic gradient artificial bee colony. Its effectiveness is tested on three benchmark text datasets: Reuters-21578, Classic4, and WebKB. The results are compared with those of ABC, a recent ABC variant (gbest-guided ABC), a variant of the proposed methodology (chaotic artificial bee colony), memetic ABC, and the conventional clustering algorithm K-means. The empirical evaluation shows encouraging results in terms of solution quality and convergence speed.
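The idea of searching for cluster centers with ABC and then refining the best candidate with a chaotic local search can be illustrated with a small sketch. This is not the authors' algorithm: it uses the basic ABC search equation rather than the improved one, omits the gradient search entirely, and assumes a logistic map as the chaotic sequence and a sum of squared Euclidean distances as the cost. All function names and parameter values (`colony`, `cycles`, `limit`, `radius`) are hypothetical, and the toy 2-D vectors stand in for document term-weight vectors.

```python
import random

def clustering_cost(flat, docs, k):
    """Sum of squared distances from each document to its nearest center.
    `flat` holds k centers concatenated into one list (assumed encoding)."""
    dim = len(docs[0])
    centers = [flat[i * dim:(i + 1) * dim] for i in range(k)]
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(doc, c)) for c in centers)
        for doc in docs
    )

def chaotic_local_search(x, cost, steps=20, radius=0.1, z=0.7):
    """Perturb x with a logistic-map sequence and keep only improvements."""
    best, best_cost = list(x), cost(x)
    for _ in range(steps):
        cand = []
        for value in best:
            z = 4.0 * z * (1.0 - z)  # logistic map, stays in [0, 1]
            cand.append(value + radius * (2.0 * z - 1.0))
        cand_cost = cost(cand)
        if cand_cost < best_cost:
            best, best_cost = cand, cand_cost
    return best, best_cost

def abc_cluster_centers(docs, k, colony=10, cycles=40, limit=10, seed=0):
    """Basic ABC (employed, onlooker, scout phases) plus chaotic refinement."""
    rng = random.Random(seed)
    dim = len(docs[0])
    lo = [min(d[j] for d in docs) for j in range(dim)]
    hi = [max(d[j] for d in docs) for j in range(dim)]
    cost = lambda flat: clustering_cost(flat, docs, k)

    def random_source():
        # Index j % dim is the feature position inside each center.
        return [rng.uniform(lo[j % dim], hi[j % dim]) for j in range(k * dim)]

    sources = [random_source() for _ in range(colony)]
    costs = [cost(s) for s in sources]
    trials = [0] * colony
    first = min(range(colony), key=costs.__getitem__)
    best, best_cost = list(sources[first]), costs[first]

    def try_neighbor(i):
        # Basic ABC search equation: v_ij = x_ij + phi * (x_ij - x_kj).
        partner = rng.choice([p for p in range(colony) if p != i])
        j = rng.randrange(k * dim)
        cand = list(sources[i])
        cand[j] += rng.uniform(-1.0, 1.0) * (cand[j] - sources[partner][j])
        cand_cost = cost(cand)
        if cand_cost < costs[i]:  # greedy selection
            sources[i], costs[i], trials[i] = cand, cand_cost, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(colony):  # employed bees
            try_neighbor(i)
        weights = [1.0 / (1.0 + c) for c in costs]  # onlooker bees
        for i in rng.choices(range(colony), weights=weights, k=colony):
            try_neighbor(i)
        for i in range(colony):  # scout bees replace exhausted sources
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i], trials[i] = cost(sources[i]), 0
        i = min(range(colony), key=costs.__getitem__)
        if costs[i] < best_cost:
            best, best_cost = list(sources[i]), costs[i]
        # Exploitation step: refine the best solution found so far.
        best, best_cost = chaotic_local_search(best, cost)

    return [best[i * dim:(i + 1) * dim] for i in range(k)], best_cost

# Toy usage: four 2-D "documents" forming two obvious groups.
docs = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
centers, final_cost = abc_cluster_centers(docs, k=2)
print(centers, final_cost)
```

In this sketch the ABC phases handle exploration, while the chaotic step only accepts improving moves around the current best solution, which is one simple way to add the exploitation pressure the abstract describes.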

Notes

  1. http://jmlr.org/papers/volume5/lewis04a/a11-smart-stop-list/english.stop.

  2. http://tartarus.org/martin/PorterStemmer/.

Author information

Corresponding author

Correspondence to Kusum Kumari Bharti.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Bharti, K.K., Singh, P.K. Chaotic gradient artificial bee colony for text clustering. Soft Comput 20, 1113–1126 (2016). https://doi.org/10.1007/s00500-014-1571-7

  • DOI: https://doi.org/10.1007/s00500-014-1571-7
