Tree-Traversing Ant Algorithm for term clustering based on featureless similarities

Data Mining and Knowledge Discovery

Abstract

Many conventional methods for concept formation in ontology learning rely on predefined templates and rules, and on static resources such as WordNet. Such approaches are not scalable, are difficult to port between domains, and cannot handle knowledge fluctuations. Their results are also far from desirable. In this paper, we propose a new ant-based clustering algorithm, Tree-Traversing Ant (TTA), for concept formation as part of an ontology learning system. Using Normalized Google Distance (NGD) and n° of Wikipedia (n°W) as measures of similarity and distance between terms, we attempt to achieve an adaptable clustering method that is highly scalable and portable across domains. Evaluations with seven datasets show promising results, with an average lexical overlap of 97% and an ontological improvement of 48%. The evaluations also demonstrated several advantages that are not simultaneously present in standard ant-based and other conventional clustering methods.
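For readers unfamiliar with NGD, the following is a minimal sketch of the standard Normalized Google Distance formula as it is commonly stated, NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))). It is not taken from the paper's implementation. The function name, the parameter names, and the example counts are assumptions chosen for illustration; in practice f(x), f(y) and f(x, y) would be page counts returned by a search engine and N the approximate number of indexed pages.

```python
import math


def normalized_google_distance(fx: float, fy: float, fxy: float, n: float) -> float:
    """Return the NGD between two terms from their page counts.

    fx, fy : pages containing term x, term y (hypothetical counts)
    fxy    : pages containing both terms
    n      : total number of indexed pages (an assumed constant)
    """
    if fx <= 0 or fy <= 0:
        raise ValueError("each term must occur on at least one page")
    if n <= min(fx, fy):
        raise ValueError("n must exceed the smaller of the two term counts")
    if fxy <= 0:
        # Terms that never co-occur are treated as infinitely distant.
        return math.inf

    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    numerator = max(log_fx, log_fy) - log_fxy
    denominator = math.log(n) - min(log_fx, log_fy)
    return numerator / denominator


# Illustrative, made-up counts: terms that always co-occur have distance 0.
same = normalized_google_distance(1000, 1000, 1000, 10**9)
partial = normalized_google_distance(100, 100, 10, 10_000)
```

Smaller values indicate terms that tend to appear on the same pages, which is what makes the measure usable for clustering terms without hand-built feature vectors.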

Author information

Corresponding author

Correspondence to Wei Liu.

Additional information

Communicated by M.J. Zaki.

About this article

Cite this article

Wong, W., Liu, W. & Bennamoun, M. Tree-Traversing Ant Algorithm for term clustering based on featureless similarities. Data Min Knowl Disc 15, 349–381 (2007). https://doi.org/10.1007/s10618-007-0073-y

  • DOI: https://doi.org/10.1007/s10618-007-0073-y
