Elsevier

Information Sciences

Volume 417, November 2017, Pages 20-38

A hybrid genetic-ant colony optimization algorithm for the word sense disambiguation problem

https://doi.org/10.1016/j.ins.2017.07.002

Abstract

Word sense disambiguation (WSD) is a natural language processing problem that occurs at the semantic level. It consists of determining the sense of a polysemous word that is suitable in a particular context. WSD has been addressed using several approaches, including metaheuristic algorithms. We propose hybrid algorithms for WSD that consist of a self-adaptive genetic algorithm (SAGA) and variants of ant colony optimization (ACO) algorithms: max-min ant system (MMAS) and ant colony system (ACS). SAGA is used to automatically tune the parameters of MMAS and ACS. The ACO algorithms are adapted based on a combination of semantic relatedness between sequences of senses corresponding to the context words and semantic relatedness between the sense of a target word and the sense of a context word. We evaluated the performance of the two ACO algorithms (MMASWSD and ACSWSD) and their hybridization with SAGA (GMMASWSD and GACSWSD) on fine-grained and coarse-grained corpora, and compared them with the best-performing algorithms. The empirical results indicate that GMMASWSD outperformed the other variants and all of the rival algorithms on the fine-grained corpora. However, GMMASWSD did not achieve the best performance on the coarse-grained corpus, even though its performance was close to that of the best algorithm.

Introduction

Word sense disambiguation (WSD) addresses one of the main characteristics of natural language applications (e.g., machine translation, question answering, and document classification): the problem of ambiguous words (words that have more than one meaning). Thus, WSD is one of the main research directions in the field of natural language processing. The goal of WSD methods is to assign a proper meaning to a polysemous word in a context by selecting the appropriate meaning from an inventory of word meanings. For example, the word “partner” can have two very different meanings: “within a marriage relationship” versus “within a law or accounting firm”. Sense inventories contain words, their senses, and extra information about the words. The two main types of inventories used in WSD methods are structured inventories (e.g., thesauri and ontologies) and unstructured inventories (e.g., corpora). One of the more common inventories used in WSD is WordNet (WN). In WN, words are arranged in groups of synonyms (named synsets). Each synset contains a gloss, which is the textual definition, and there are lexical and semantic relations between pairs of synsets. Following the original work on WN, some additions have been made to enhance the relations between pairs of synsets. In eXtended WordNet (XWN), the words in the gloss of a synset are annotated with their senses. Then, relations are created between the synset and the senses of these words. Another work, called WordNet++ (WNPP), produced relations between noun synsets in WN based on Wikipedia [1]. Magnini and Cavaglia developed the WordNet Domain (WND), in which each synset in WN is annotated with at least one domain, such as GEOLOGY, ECONOMY, or TRANSPORT.
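To make the idea of selecting a sense from an inventory concrete, the following minimal sketch picks the sense of “partner” whose gloss shares the most words with the context. It is a simplified gloss-overlap heuristic in the spirit of the Lesk algorithm, not the method proposed in this article. The inventory, the sense labels, the glosses, and the stopword list are all hypothetical stand-ins for a real resource such as WordNet.

```python
# Hypothetical two-sense inventory; a real system would query WordNet instead.
SENSE_INVENTORY = {
    "partner": {
        "partner#spouse": "a person who is a member of a married couple or relationship",
        "partner#business": "a person who shares ownership of a law or accounting firm",
    },
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "or", "is", "who", "in", "at", "and", "as", "his", "her"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {word for word in text.lower().split() if word not in STOPWORDS}


def disambiguate(word, context):
    """Return the sense whose gloss overlaps most with the context words."""
    context_tokens = tokenize(context)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSE_INVENTORY[word].items():
        overlap = len(tokenize(gloss) & context_tokens)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense
```

With this toy inventory, a context that mentions an accounting firm selects the business sense, and a context that mentions a married couple selects the spouse sense. Ties are resolved in favour of the sense listed first, which is an arbitrary choice of the sketch.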

WSD methods differ mainly in the amount of supervision and knowledge that they require. They can be classified into knowledge-based methods, which focus on the use of lexical knowledge resources [2], [3], [4], [5], [6]; machine learning-based methods, which use evidence extracted from annotated and unannotated corpora and statistical models [7], [8]; and other methods, including domain-driven disambiguation (for example, [9]) and metaheuristic algorithms, such as those in [10], [11], [12], [13]. To evaluate such methods, an international WSD competition originally called SensEval (now renamed SemEval) has been held regularly since 1998. In each competition, standard manually sense-annotated corpora (standard corpora) are used to evaluate and compare the performances of different WSD methods. The related corpora and their corresponding tasks are SensEval-2: task#1, fine-grained English All-Words (S2FGAW) [14]; SensEval-3: task#1, fine-grained English All-Words (S3FGAW) [15]; SemEval2007: Task#17_Subtask#3, fine-grained English All-Words (S07FGAW) [16]; and SemEval2007: Task#7, coarse-grained English All-Words (S07CGAW) [17].

Swarm intelligence (SI) algorithms imitate the collective intelligence observed in the behaviour of swarms of social animals (such as birds, bees, and ants). Such swarms can solve some problems efficiently, such as finding food, even though they have no supervising or controlling entity. Rather, each individual, despite its limited capacity, helps to solve many difficult problems through simple cooperation with other individuals of the swarm. The SI algorithms implement these characteristics of self-organization and decentralized control [18]. Ant colony optimization (ACO) [18] algorithms are SI algorithms inspired by pheromone-based ant foraging strategies. Several variants of ACO algorithms have been proposed in the literature, including ant system (AS) [19], ant colony system (ACS) [20] and max-min ant system (MMAS) [21]. ACO algorithms have been successfully used for many combinatorial optimization problems, and they provide competitive solutions [18]. There is a large body of literature related to using ACO techniques for solving such problems, including the travelling salesperson problem (TSP) [19], [20], [22], the quadratic assignment problem [23], and the vehicle routing problem (VRP) [24]. ACO algorithms have also been explored for solving various natural language processing (NLP) tasks and applications [25], including the WSD problem [26]. The promising results obtained by Nguyen and Ock [26] with their ACO-TSP algorithm for solving the WSD problem demonstrate the effectiveness of this approach and encourage further investigations of other ACO variants for this problem.
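As a rough illustration of the pheromone mechanism that ACO variants share, the sketch below shows a generic probabilistic choice rule and a pheromone update with evaporation and a deposit on the best option. Clamping the trail to a fixed interval mirrors the bounding idea behind MMAS. The function names, the default parameter values, and the single-best deposit rule are assumptions made for illustration; they are not the adapted rules of MMASWSD or ACSWSD.

```python
import random


def choose_option(pheromone, heuristic, alpha=1.0, beta=2.0, rng=random):
    """Pick an option index with probability proportional to tau^alpha * eta^beta."""
    weights = [(tau ** alpha) * (eta ** beta) for tau, eta in zip(pheromone, heuristic)]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def update_pheromone(pheromone, best_index, deposit, rho=0.1, tau_min=0.01, tau_max=5.0):
    """Evaporate every trail, reinforce the best option, then clamp to [tau_min, tau_max]."""
    updated = []
    for index, tau in enumerate(pheromone):
        tau = (1.0 - rho) * tau          # evaporation
        if index == best_index:
            tau += deposit               # positive feedback on the best option
        updated.append(min(tau_max, max(tau_min, tau)))  # MMAS-style bounds
    return updated
```

Here `alpha` and `beta` weight the pheromone against the heuristic value, and `rho` is the evaporation rate. These are the kinds of parameters that a tuning procedure would need to set.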

In this article, we propose hybrid genetic-ant colony optimization algorithms for solving the WSD problem. We studied two well-known ACO algorithms that have achieved competitive results for a variety of problems [18]: the ACS and the MMAS algorithms. We adapted some of their rules to make them more efficient in solving the WSD problem, and we used a self-adaptive GA (SAGA) [11] to automatically adjust their numerous parameters. The dynamic selection of the crossover and mutation operators and their probabilities in SAGA led to an algorithm that outperformed a standard GA on different WSD corpora. Its hybridization with ACO algorithms derives a class of algorithms with very few parameters to set. We experimentally evaluated the performance of the proposed algorithms on both fine-grained and coarse-grained benchmark corpora, and we compared them with the best rival algorithms. The obtained results show that one of the proposed variants (GMMASWSD) outperformed all the rival algorithms on the fine-grained corpora.
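The idea of letting a GA search over ACO parameters can be sketched as follows. This is a plain GA with uniform crossover, Gaussian mutation, and truncation selection; it is not SAGA, which additionally adapts its own operators and probabilities. The parameter names, their ranges, and `placeholder_fitness` are hypothetical. In a real hybrid, the fitness of a parameter set would presumably come from running the ACO algorithm on a corpus and scoring its output.

```python
import random

# Hypothetical search ranges for three ACO parameters.
BOUNDS = {"alpha": (0.0, 5.0), "beta": (0.0, 5.0), "rho": (0.01, 0.99)}


def random_individual(rng):
    """Sample one parameter set uniformly within BOUNDS."""
    return {name: rng.uniform(low, high) for name, (low, high) in BOUNDS.items()}


def uniform_crossover(parent_a, parent_b, rng):
    """Take each parameter from either parent with equal probability."""
    return {name: (parent_a[name] if rng.random() < 0.5 else parent_b[name]) for name in BOUNDS}


def mutate(individual, rng, rate=0.2):
    """Perturb each parameter with probability `rate`, staying within BOUNDS."""
    child = dict(individual)
    for name, (low, high) in BOUNDS.items():
        if rng.random() < rate:
            step = rng.gauss(0.0, 0.1 * (high - low))
            child[name] = min(high, max(low, child[name] + step))
    return child


def tune(fitness, generations=30, pop_size=20, seed=0):
    """Evolve parameter sets and return the fittest one found."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection keeps the elite
        children = [
            mutate(uniform_crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


def placeholder_fitness(params):
    """Stand-in objective with an arbitrary peak; replace with a real ACO evaluation."""
    return -((params["alpha"] - 1.0) ** 2 + (params["beta"] - 2.0) ** 2 + (params["rho"] - 0.1) ** 2)
```

Calling `tune(placeholder_fitness)` returns a dictionary of parameter values. Because the better half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next.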

The remainder of this paper is organized as follows. Section 2 presents the related works. Section 3 outlines ACO algorithms, and it specifically describes two algorithms proposed for WSD, called MMASWSD and ACSWSD. Section 4 describes two hybrid algorithms called GMMASWSD and GACSWSD. Section 5 presents the experimental results and discusses their significance. Finally, Section 6 concludes the paper and provides some suggestions for future work.


Related work

In this section, we present an overview of the best-performing algorithms for solving the WSD problem that participated in the SensEval and SemEval competitions (SensEval2, SensEval3, and SemEval2007), as well as the recently proposed algorithms.

Several studies have investigated graph-based methods for solving the WSD problem, including those of Navigli and Velardi [3], Sinha and Mihalcea [27], Navigli et al. [17], Agirre and Soroa [28], and Agirre et al. [5]. Sinha and Mihalcea [27] used a

Ant colony optimization

ACO is a category of metaheuristic algorithms that simulate the foraging behaviour of an ant colony in the real world [18]. Briefly, the ants aim to discover the shortest path between a food source and the nest by using a chemical called a pheromone. The ants are attracted to paths with more pheromone, which indicates that many ants have used that path in the past (positive feedback), while the pheromone evaporates over time. In ACO, a colony of artificial ants

Hybrid genetic-ACO algorithms for WSD

We propose automatically tuning the parameters of the MMASWSD and ACSWSD algorithms using a GA, which provides two hybrid algorithms called GMMASWSD and GACSWSD, respectively. The use of a GA on top of an ACO algorithm was described by Botee and Bonabeau [50]. We used a recently introduced self-adaptive GA called SAGA [11]. This algorithm features automated tuning of its crossover and mutation operators (uniform crossover and mutation vs. single-point crossover and mutation) and their

Results and discussion

We conducted a set of experiments to evaluate the proposed algorithms. The following sections present the corpora that we used to evaluate the proposed algorithms, the experimental setup, the performance measures, the obtained results and their comparison with those of the best rival algorithms.

Conclusion and future work

We presented two variants of ACO algorithms and their hybridization with a GA for solving the WSD problem. A self-adaptive GA (SAGA) is used to tune the parameters of the ACO algorithms: MMAS and ACS. The rules of MMAS and ACS were modified to better fit the nature of the WSD problem. We evaluated the performance of the four algorithms on standard corpora, including fine-grained and coarse-grained corpora. The GMMASWSD algorithm (hybridization of SAGA and MMAS) significantly outperformed the

Acknowledgements

This work was supported by the Research Center of the College of Computer and Information Sciences, King Saud University. The authors are grateful for this support. The authors would also like to thank the anonymous reviewers for their valuable and constructive comments.

References (50)

  • R. Sawhney et al.

    A modified technique for Word Sense Disambiguation using Lesk algorithm in Hindi language

    Proceedings of the International Conference on Advances in Computing, Communications and Informatics (ICACCI)

    (2014)
  • E. Agirre et al.

    UBC-ALM: Combining k-NN with SVD for WSD

    Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval ’07)

    (2007)
  • A. Novischi et al.

    LCC-WSD: System description for English coarse grained all words task at SemEval 2007

    Proceedings of the 4th International Workshop on Semantic Evaluations

    (2007)
  • B. Magnini et al.

    The role of domain information in word sense disambiguation

    Nat. Lang. Eng.

    (2002)
  • W. Alsaeedan et al.

    A self-adaptive genetic algorithm for the word sense disambiguation problem

  • A. Bakhouche et al.

    Ant colony algorithm for Arabic word sense disambiguation through English lexical information

    Int. J. Metadata Semant. Ontol.

    (2015)
  • M. Palmer et al.

    English tasks: All-words and verb lexical sample

    The Proceedings of the 2nd International Workshop on Evaluating Word Sense Disambiguation Systems (SENSEVAL’01)

    (2001)
  • B. Snyder et al.

    The English all-words task

    Proceedings of the 3rd International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, ACL, Barcelona, Spain

    (2004)
  • S.S. Pradhan et al.

    SemEval-2007 task 17: English lexical sample, SRL and all words

    Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval ’07)

    (2007)
  • R. Navigli et al.

    SemEval-2007 task 07: Coarse-grained English all-words task

    Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval ’07)

    (2007)
  • E. Bonabeau et al.

    Swarm intelligence: From natural to artificial systems

    (1999)
  • M. Dorigo et al.

    Ant system: optimization by a colony of cooperating agents

    IEEE Trans. Syst. Man Cybern. Part B

    (1996)
  • M. Dorigo et al.

    Ant colony system: a cooperative learning approach to the traveling salesman problem

    IEEE Trans. Evol. Comput.

    (1997)
  • V. Maniezzo et al.

    The ant system applied to the quadratic assignment problem

    IEEE Trans. Knowl. Data Eng.

    (1999)
  • K.M. Sim et al.

    Ant colony optimization for routing and load-balancing: survey and new directions

    IEEE Trans. Syst. Man Cybern. Part A

    (2003)