Abstract
Many conventional methods for concept formation in ontology learning rely on predefined templates and rules, and on static resources such as WordNet. Such approaches are not scalable, are difficult to port between domains, and cannot handle knowledge fluctuations. Their results are also far from desirable. In this paper, we propose a new ant-based clustering algorithm, Tree-Traversing Ant (TTA), for concept formation as part of an ontology learning system. Using Normalized Google Distance (NGD) and n° of Wikipedia (n°W) as measures of similarity and distance between terms, we aim for an adaptable clustering method that is highly scalable and portable across domains. Evaluations with seven datasets show promising results, with an average lexical overlap of 97% and an ontological improvement of 48%. The evaluations also demonstrated several advantages that are not simultaneously present in standard ant-based and other conventional clustering methods.
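For readers unfamiliar with NGD, the following is a minimal sketch of the standard formula, NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f denotes page counts and N the number of indexed pages. The function name, the treatment of zero counts, and the counts in the usage line are illustrative assumptions. The sketch is not the TTA implementation and does not query any search engine.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Compute NGD from page counts.

    fx, fy : page counts for each term on its own
    fxy    : page count for both terms together
    n      : total number of indexed pages (normalizing constant)
    """
    # Assumption: any zero count is treated as "infinitely distant",
    # because the logarithm is undefined there.
    if min(fx, fy, fxy) <= 0:
        return float("inf")
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    numerator = max(log_fx, log_fy) - log_fxy
    denominator = math.log(n) - min(log_fx, log_fy)
    return numerator / denominator


# Illustrative counts only; they are not measurements from this paper.
example = normalized_google_distance(46_700_000, 12_200_000, 2_630_000, 8_058_044_651)
print(round(example, 3))
```

Smaller values indicate terms that co-occur more often relative to their individual frequencies. A clustering method can use such values as pairwise distances without extracting any features from the terms themselves.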
Communicated by M.J. Zaki.
Cite this article
Wong, W., Liu, W. & Bennamoun, M. Tree-Traversing Ant Algorithm for term clustering based on featureless similarities. Data Min Knowl Disc 15, 349–381 (2007). https://doi.org/10.1007/s10618-007-0073-y
DOI: https://doi.org/10.1007/s10618-007-0073-y