Abstract
A concept hierarchy is an integral part of an ontology, but it is expensive and time-consuming to build. Motivated by this, many unsupervised learning methods have been proposed to develop concept hierarchies (semi-)automatically. A significant example is Guided Agglomerative Hierarchical Clustering (GAHC), which relies on linguistic patterns (i.e., hypernyms) to guide the clustering process. However, GAHC still relies on contextual features to build the concept hierarchy, so data sparsity remains an issue. Artificial Immune Systems are known for their robustness, noise tolerance and adaptability. We therefore propose an extension of GAHC that hybridizes it with the Artificial Immune Network (aiNet), which we call Guided Clustering and aiNet for Learning Concept Hierarchy (GCAINY). In this paper, we test GCAINY under two parameter settings: a baseline setting taken from the literature, and a setting obtained by automatic parameter tuning with Particle Swarm Optimization (PSO). The effectiveness of GCAINY is evaluated on three data sets. For further validation, GCAINY is compared with GAHC, and statistical tests show that GCAINY increases the quality of the induced concept hierarchy. The results also reveal that the parameter values found by PSO produce significantly better concept hierarchies than the baseline parameters. We conclude that the proposed approach has greater potential for use in the field of ontology learning.
Acknowledgments
We would like to thank the reviewers of the International Journal of Natural Computing, as well as those of the ICARIS’09 conference at which this work was presented, for their valuable comments. This research is currently supported by Universiti Kebangsaan Malaysia and the Malaysia Ministry of Higher Education under the Fundamental Research Grant Scheme. We would also like to thank all our friends for their feedback and comments, in particular Jon Timmis for clarifying our AIS-related questions, Hafiz Mohd Sarim for his comments, and Tri Basuki Kurniawan for answering our optimization-related questions.
Cite this article
Nazri, M.Z.A., Shamsuddin, S.M., Bakar, A.A. et al. A hybrid approach for learning concept hierarchy from Malay text using artificial immune network. Nat Comput 10, 275–304 (2011). https://doi.org/10.1007/s11047-010-9228-7
DOI: https://doi.org/10.1007/s11047-010-9228-7