A new approach of rules extraction for word sense disambiguation by features of attributes

doi:10.1016/j.asoc.2014.10.037

Applied Soft Computing

Volume 27, February 2015, Pages 411-419

https://doi.org/10.1016/j.asoc.2014.10.037

Highlights

• A new approach of rule extraction by features of attributes is proposed.
• Simple class exclusive attributes and composite class exclusive attributes are calculated.
• The attributes are used in rule extraction for WSD of the English preposition on.
• The accuracy of WSD is improved. Comparative results show that the new approach has a few advantages over the well-formed SPOAD approach.

Abstract

Classification is an important issue in data mining and knowledge discovery, and it is important to develop effective and easy-to-use approaches to rule extraction for classification. A new approach of rule extraction by features of attributes is proposed in this article for word sense disambiguation (WSD). The English preposition on is taken as the target word of WSD, and a data set of 600 samples is randomly selected from a 350,000-word corpus. Semantic and syntactic features are extracted from the context, and the corresponding formal context is generated. The rules for WSD of the English preposition on are extracted based on the theoretical descriptions and calculation of the simple class exclusive attributes and composite class exclusive attributes. The extracted rules are used in the WSD of the English preposition on, and the accuracy reaches 93.2%. The results of the comparative analysis show that the proposed feature of attribute approach is simpler, more effective and easier to use than the existing well-formed structural partial ordered attribute diagram approach.

Introduction

Rule extraction is an important issue in natural language processing. It is the process of deriving a symbolic description of a classification model, one that simulates the behavior of the model in a concise and comprehensible form and so gives insight into the logic behind the model. Many researchers have studied rule extraction from different perspectives. With regard to rule extraction from different models, Setiono et al. [1] proposed an approach for rule extraction from minimal neural networks for credit card screening. Ozbakir et al. [2] proposed an approach for rule extraction from artificial neural networks to discover the causes of quality defects in fabric production. Chorowski and Zurada [3] presented an eclectic approach for extracting rules from neural networks as decision diagrams. Zhu and Hu [4] proposed a technique for rule extraction from support vector machines through analyzing the distribution of samples. Chaves et al. [5] proposed a new method for fuzzy rule extraction from trained support vector machines for the classification of multi-class problems. Tang et al. [6] presented a method of extracting classification rules from a concept lattice. Li et al. [7] extracted rules for word sense disambiguation (WSD) of English modal verbs from a structural partial ordered attribute diagram. Asaduzzaman et al. [8] reported a method of finding interesting rules in heterogeneous internet search histories.

With regard to algorithms for rule extraction, Liu et al. [9] proposed an algorithm that simultaneously extracts rules and selects features for better interpretation of the predictive model. Zhao and Sun [10] proposed an approach to rough set rule extraction from a decision system using conditional information entropy. He et al. [11] proposed a guidance rule extraction algorithm that obtains attribute information along the quickest direction and achieves intelligent information analysis. Sun [12] developed an algorithm framework for rule extraction with different levels of knowledge granularity from a decision system, in order to delete redundant features and highlight the most efficient features for constructing classifiers. Ahmed and Carson-Berndsen [13] presented a method of automatic rule extraction for modeling pronunciation variation in phoneme-based continuous speech recognition at the language model level. Sarkar et al. [14] introduced a genetic algorithm-based rule extraction system to improve prediction accuracy on any classification problem irrespective of domain, size, dimensionality and class distribution. They [15], [16], [17] also proposed a hybrid approach to designing efficient learning classifiers and an accuracy-based learning system that extracts an efficient rule set for multi-category classification, and selected informative rules by using a parallel genetic algorithm. Rodriguez et al. [18] presented an efficient distributed genetic algorithm for classification rule extraction in data mining. Wang et al. [19] introduced a method of rule extraction based on granular computing for fault diagnosis of a helicopter transmission system. Koklu et al. [20] presented a new method of rule extraction from medical datasets using an artificial immune system algorithm. Huang et al. [21] proposed a method based on a clustering artificial fish-swarm algorithm and rough set theory to extract decision rules. Castro et al. [22] described a rule extraction algorithm based on fuzzy logic, named linguistic rules in fuzzy inductive reasoning, to derive linguistic rules from a fuzzy inductive reasoning model. Chen et al. [23] presented an integrated mechanism for simultaneous extraction of fuzzy rules and selection of useful features in order to solve the classification problem. Cheng [24], [25] studied approaches to rule extraction in fuzzy information systems based on rough set theory.

Previous studies of rule extraction have solved many practical problems and made great progress in natural language processing. However, most of them have focused on rule extraction for problems in engineering, business, medical diagnosis and fuzzy information systems. Up to now, few of them are related to WSD, and no studies on rule extraction by features of attributes have been found. In addition, there is a clear need in natural language processing for approaches that can extract effective, high-quality rules for classification with less effort. Therefore, a new approach of rule extraction by features of attributes is proposed in this article for WSD of English prepositions, with on as the target word, in order to simplify the process of rule extraction and to improve both the quality of the extracted rules and the accuracy of WSD. The proposed approach may be applied to the WSD of other English prepositions, and it may also be used in other fields, such as pattern recognition, knowledge discovery, data mining, fault diagnosis, decision support systems and intelligent robots. The results of the study may provide references for natural language processing and understanding, semantic studies of prepositions and WSD of other parts of speech.

The rest of the article is organized as follows. Section 2 presents the senses of on that occur in the corpus and the granularity of the senses of on used in this study. Section 3 gives the theoretical descriptions of formal context and of the features of some attributes. Section 4 gives the procedure for calculating simple class exclusive attributes and composite class exclusive attributes. Section 5 explains how the formal context of the English preposition on is generated. Section 6 presents the process of rule extraction for WSD of on. Section 7 compares two approaches of rule extraction: the feature of attribute approach and the structural partial ordered attribute diagram approach. Finally, Section 8 gives the conclusions of the study.

Section snippets

Granularity of the senses of English preposition on

The English preposition on is one of the most frequently used simple prepositions in natural language. It may have about 20 senses, and its meaning varies with the context. For instance [26],

(1) My mobile phone is on the table (on-place)
(2) The meeting will be on Tuesday (on-time)
(3) He made a lot of money on the deal (on-cause)
(4) He walked on tiptoe (on-manner)
(5) He had thrown me down on one hundred pitchforks (on-direction)
(6) He would lead a violent assault on the jail (on-objective)
(7) His role on base was

Theoretical descriptions of formal context and the features of some attributes

The new approach of rule extraction is based on the following theoretical descriptions of formal context (Definitions 1–3 [27]) and the features of some attributes (Definitions 4–5 [28]):

Definition 1

A formal context K = (G, M, I) consists of two sets G and M and a relation I between G and M. The elements of G are called the objects and the elements of M are called the attributes of the context. I represents the relation between an object g and an attribute m, written as gIm or (g, m) ∈ I.
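Definition 1 can be illustrated with a minimal Python sketch. The objects, attribute names and the two helper functions below are invented for illustration and are not taken from the article; the helpers are the usual derivation operators of formal concept analysis, included here as an assumption about how such a context would be queried.

```python
from typing import Set, Tuple

# Toy formal context K = (G, M, I). All names are hypothetical examples.
G: Set[str] = {"s1", "s2", "s3"}  # objects, e.g. sample sentences
M: Set[str] = {"day_noun", "surface_noun", "on-time", "on-place"}  # attributes
I: Set[Tuple[str, str]] = {  # incidence relation: (g, m) in I means gIm
    ("s1", "day_noun"), ("s1", "on-time"),
    ("s2", "surface_noun"), ("s2", "on-place"),
    ("s3", "surface_noun"), ("s3", "on-place"),
}


def has_attribute(g: str, m: str) -> bool:
    """True when object g has attribute m, i.e. (g, m) is in I."""
    return (g, m) in I


def objects_with(attrs: Set[str]) -> Set[str]:
    """All objects in G that have every attribute in attrs."""
    return {g for g in G if all((g, m) in I for m in attrs)}


def attributes_of(objs: Set[str]) -> Set[str]:
    """All attributes in M shared by every object in objs."""
    return {m for m in M if all((g, m) in I for g in objs)}
```

For this toy context, `objects_with({"on-place"})` returns the two objects `s2` and `s3`, and `attributes_of({"s2", "s3"})` returns `surface_noun` and `on-place`.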

Definition 2

Let K = (G, M, I) be a

Calculation of features of attributes

In [28], different features of attributes are defined. In this study, only simple class exclusive attributes and composite class exclusive attributes are needed, and they are calculated by the following procedure.

1. Determine the decision attribute set corresponding to different classes D₁, D₂,…, D_p; p≥2;
2. Initialize i = 1;
3. Calculate the object sets corresponding to the decision attributes h(D_i) = G_i;
4. Suppose that the object set G_i includes n objects, M_c is the non-decision attribute set of a class.
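The visible steps of the procedure can be sketched in Python as follows. The remaining steps and the formal definitions from [28] are not shown in this excerpt, so the sketch rests on two loudly stated assumptions: a simple class exclusive attribute is taken to be a single non-decision attribute whose objects all belong to one class, and a composite class exclusive attribute is taken to be a pair of non-decision attributes, neither exclusive to that class on its own, whose common objects all belong to one class. The function names and the toy data are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Context = Dict[str, Set[str]]  # object -> set of attributes it has


def object_set(context: Context, attrs: Set[str]) -> Set[str]:
    """h(attrs): the objects that have every attribute in attrs."""
    return {g for g, ms in context.items() if attrs <= ms}


def class_exclusive_attributes(
    context: Context, decision_attrs: List[str]
) -> Dict[str, Tuple[List[str], List[Tuple[str, str]]]]:
    """For each class D_i, return (simple, composite) exclusive attributes.

    ASSUMPTION: 'exclusive' means the attribute's (or pair's) non-empty
    object set lies entirely inside G_i = h(D_i). Only pairs are tried.
    """
    all_attrs = set().union(*context.values())
    non_decision = sorted(all_attrs - set(decision_attrs))
    result = {}
    for d in decision_attrs:                  # loop over classes, i = 1..p
        g_i = object_set(context, {d})        # step 3: h(D_i) = G_i
        simple = [
            m for m in non_decision
            if object_set(context, {m}) and object_set(context, {m}) <= g_i
        ]
        rest = [m for m in non_decision if m not in simple]
        composite = []
        for pair in combinations(rest, 2):
            objs = object_set(context, set(pair))
            if objs and objs <= g_i:
                composite.append(pair)
        result[d] = (simple, composite)
    return result


# Toy data, invented for illustration only.
toy_context: Context = {
    "s1": {"day_noun", "on-time"},
    "s2": {"day_noun", "verb_be", "on-time"},
    "s3": {"surface_noun", "verb_be", "on-place"},
    "s4": {"surface_noun", "verb_rely", "on-others"},
    "s5": {"abstract_noun", "verb_rely", "on-others"},
}
result = class_exclusive_attributes(
    toy_context, ["on-time", "on-place", "on-others"]
)
```

On the toy data, `day_noun` alone singles out on-time, while on-place has no single exclusive attribute and is identified only by the pair `surface_noun` with `verb_be`. Restricting composites to pairs keeps the sketch short; larger combinations would need a check that no subset is already exclusive.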

Generation of formal context of English preposition on

A data set is constructed for WSD and rule extraction of on. It is composed of 600 samples, of which 200 are for on-time, 200 are for on-place and the remaining 200 are for on-others. Different linguistic features are extracted from the context of the sample sentences in the data set. Semantic features include the mutual information (MI) between the preposition on and the noun that follows it, which is calculated by the following formula [29]: $MI (w_{1}, w_{2}) = \log \frac{P (w_{1}, w_{2})}{P (w_{1}) P (w_{2})}$ where w₁ and w₂
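The mutual information formula above can be sketched in Python as follows. The excerpt does not say how the probabilities are estimated or which logarithm base is used, so this sketch assumes relative-frequency estimates over a single corpus size and the natural logarithm by default; the function name and its parameters are hypothetical.

```python
import math


def mutual_information(
    pair_count: int,
    w1_count: int,
    w2_count: int,
    total: int,
    base: float = math.e,
) -> float:
    """MI(w1, w2) = log( P(w1, w2) / (P(w1) * P(w2)) ).

    ASSUMPTION: each probability is estimated as count / total, where
    total is the number of tokens (or bigrams) in the corpus.
    """
    if min(pair_count, w1_count, w2_count, total) <= 0:
        raise ValueError("all counts must be positive")
    p_pair = pair_count / total
    p_w1 = w1_count / total
    p_w2 = w2_count / total
    return math.log(p_pair / (p_w1 * p_w2), base)


# Hypothetical counts: the pair occurs 10 times, w1 100 times,
# w2 50 times, in a corpus of 10,000 units.
mi_example = mutual_information(10, 100, 50, 10_000)
```

With these invented counts the ratio inside the logarithm is 20, so the pair co-occurs about twenty times more often than independence would predict. A ratio of 1 gives an MI of zero.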

Rule extraction of English preposition on

Based on the theoretical description and calculation method of the features of attributes, the rules for WSD of on are extracted by the following steps:

(1) Calculate all the simple class exclusive attributes and the composite class exclusive attributes of each of the 3 classes (senses of on). They constitute m in the concept pair (g, m), and their corresponding objects constitute g in the concept pair (g, m).
(2) Carry out calculations for every two pairs in the pair set by the following algorithm. If C₁

A comparison of rule extraction of on by the feature of attribute approach and structural partial ordered attribute diagram approach

Both the feature of attribute approach and the structural partial ordered attribute diagram (SPOAD) approach are based on the theory of formal concept analysis and they can both be used for rule extraction. Since the SPOAD approach is a well-formed approach, a comparison is made between it and the new approach in order to see the merits of the new approach.

Conclusions

A new approach of rule extraction for word sense disambiguation (WSD) of English preposition by features of attributes is proposed. It is based on the theory of formal concept analysis and the descriptions and calculations of the simple class exclusive attribute and the composite class exclusive attribute. The approach is used in the rule extraction for WSD of English preposition on, and the accuracy of WSD reaches 93.2%. Compared with the well-formed SPOAD approach, the proposed feature of

Acknowledgements

This work is supported by the National Social Science Foundation of China under Grant No. 12BYY121 and by the Humanities and Social Sciences Foundation of the Ministry of Education of China under Grant No. 12YJA740096. It is also partially supported by the National Natural Science Foundation of China under Grant No. 61074130. The authors gratefully acknowledge this support.

References (31)

P. Zhu et al., Rule extraction from support vector machines based on consistent region covering reduction, Knowl. Based Syst. (2013)
B.K. Sarkar et al., A genetic algorithm-based rule extraction system, Appl. Soft Comput. J. (2012)
B.K. Sarkar et al., A hybrid approach to design efficient learning classifiers, Comput. Math. Appl. (2009)
B.K. Sarkar et al., Selecting informative rules with parallel genetic algorithm in classification problem, Appl. Math. Comput. (2011)
M. Rodriguez et al., Efficient distributed genetic algorithm for rule extraction, Appl. Soft Comput. J. (2011)
F. Castro et al., On the extraction of decision support rules from fuzzy predictive models, Appl. Soft Comput. J. (2011)
Y. Cheng et al., Rule extraction based on granulation order in interval-valued fuzzy information system, Expert Syst. Appl. (2011)
R. Setiono et al., Rule extraction from minimal neural networks for credit card screening, Int. J. Neural Syst. (2011)
L. Ozbakir et al., Rule extraction from artificial neural networks to discover causes of quality defects in fabric production, Neural Comput. Appl. (2011)
J. Chorowski et al., Extracting rules from neural networks as decision diagrams, IEEE Trans. Neural Netw. (2011)
A. da Costa et al., Fuzzy rule extraction from support vector machines for multi-class classification, Neural Comput. Appl. (2013)
J. Tang et al., An algorithm of extracting classification rule based on classified concept lattice
H. Li et al., Rule extraction for word sense disambiguation of English modal verb must, ICIC Express Lett. (2012)
M. Asaduzzaman et al., Extraction of interesting rules from internet search histories, J. Softw. (2011)
S. Liu et al., Combined rule extraction and feature elimination in supervised classification, IEEE Trans. Nanobiosci. (2012)

Cited by (22)

Building multi-subtopic Bi-level network for micro-blog hot topic based on feature Co-Occurrence and semantic community division
2020, Journal of Network and Computer Applications
Citation Excerpt :
Although this method is independent of the external knowledge base, it ignores the connection between feature words and knowledge base, and also affects the accuracy of feature words extraction. In recent years, the co-occurrence relationship of words has been studied deeply (Qu et al., 2018; Yu et al., 2015; Hai and Luo, 2006; Li et al., 2019a). This method takes into account the above two methods and has significant advantages.
The multi-subtopic is challenging to be understood timely and comprehensively due to micro-blog characteristics, such as low-value density, and fast update speed. For such an issue, this paper proposes a Multi-Subtopic Bi-level Network (MSBN) for micro-blog hot topics based on feature co-occurrence and semantic community division to support users understanding better the subject. First, the highlighted words are extracted by combining two coefficients including the micro-blog importance (e.g., the number of comments and the number of praises) and the time decay. The compound co-occurrence rates (i.e., global and local co-occurrence rates) are used to measure the correlation strength between any two highlighted words, while the global semantic of a micro-blog hot topic can be shown as a complex network whose nodes are the extracted feature words and edges are relations between any two feature words. Next, an improved weighted modularity function is proposed as a criterion for the community division. The complex network of a topic is divided into some semantic communities, where each is regarded as a subtopic of the given micro-blog topic. Subsequently, the genetic algorithm is used to calculate the maximum of weighted modularity and achieve community division of complex networks, so finally, the terminal location of each micro-blog in a different semantic community is obtained to draw regional location map and analyze the supporting propensity of each region to the micro-blog hot topic. Experimental results show that the proposed model can accurately and effectively represent the multi-subtopic of a micro-blog hot topic in the current time that supports users to discover and understand the micro-blog hot topic, allowing users to identify and understand the concerned differences among different regions for the same micro-blog hot topic.
Spreading semantic information by Word Sense Disambiguation
2017, Knowledge-Based Systems
Citation Excerpt :
The need to evaluate different tasks in NLP resulted in the creation of evaluation campaigns like for example SensEval1. The main goal of this campaign was initially to measure the strengths and weaknesses of WSD systems with regard to different words, different aspects of language and different languages [13], [14], [15], [16], [17], etc. However, subsequent campaigns added new tasks such as: semantic roles; web people search; affective text; etc.
This paper presents an unsupervised approach to solve semantic ambiguity based on the integration of the Personalized PageRank algorithm with word-sense frequency information. Natural Language tasks such as Machine Translation or Recommender Systems are likely to be enriched by our approach, which includes semantic information that obtains the appropriate word-sense via support from two sources: a multidimensional network that includes a set of different resources (i.e. WordNet, WordNet Domains, WordNet Affect, SUMO and Semantic Classes); and the information provided by word-sense frequencies and word-sense collocation from the SemCor Corpus. Our series of results were analyzed and compared against the results of several renowned studies using SensEval-2, SensEval-3 and SemEval-2013 datasets. After conducting several experiments, our procedure produced the best results in the unsupervised procedure category taking SensEval campaigns rankings as reference.
Cloning DRASiW systems via memory transfer
2016, Neurocomputing
Citation Excerpt :
The most interesting aspect of RE is that we can discover interesting relationships existing within the data given and what general conclusions can be drawn. The number of new applications and theoretical contributions makes RE still a very active and interesting research field [18–20]. Contribution to RE has been given even with WNN [21,22].
DRASiW is an extension of the WiSARD Weightless Neural Network (WNN) model with the capability of storing the frequencies of seen patterns during the training phase in an internal data structure called “mental image” (MI). Due to this capability, in a previous work it was demonstrated how to reversely process MIs in order to generate synthetic prototypes. Then, a training set composed of synthetic prototypes can be used to train new DRASiW systems (clones) with different architectures. In this paper we present a methodology to transfer memory between DRASiW systems, and we show how it is possible to generate clones of DRASiW systems with good classification capabilities within an acceptable loss of accuracy.
A new approach of attribute partial order structure diagram for word sense disambiguation of English prepositions
2016, Knowledge-Based Systems
Citation Excerpt :
Alam [18] proposed an algorithm for assigning the syntactic categories of the English preposition over. Yu et al. [19] proposed a new approach of rule extraction for the WSD by features of attributes. Xu et al. [20] studied the contribution of governors to the WSD of the English prepositions.
To improve the accuracy of word sense disambiguation (WSD) has been a significant issue, and to visualize the structure of a dataset to discover knowledge has been an urgent demand in natural language processing. In order to fulfill these two tasks simultaneously, a new approach of attribute partial order structure diagram is proposed. The principle of attribute partial order and the approach of attribute partial order structure diagram are described. The proposed approach is testified by the WSD of the English preposition over, using the dataset from SemEval corpus. Two well-accepted sense inventories for fine-grained WSD of the English prepositions are adopted. The formal contexts for the fine-grained WSD of the English preposition over are established and the corresponding attribute partial order structure diagrams are generated and used as the models of WSD. The tested results show that the accuracies of WSD of over by the proposed approach are significantly higher than the ones by the state of the art system. Moreover, the proposed approach can visualize the attribute partial order structure of the dataset, which can be used for knowledge discovery.
Transformer-Based Word Sense Disambiguation: Advancements, Impact, and Future Directions
2023, Proceedings - 11th IEEE International Conference on Intelligent Computing and Information Systems, ICICIS 2023
A Filter-APOSD approach for feature selection and linguistic knowledge discovery
2023, Journal of Intelligent and Fuzzy Systems
