A new approach to generate weighted fuzzy rules using genetic algorithms for estimating null values

doi:10.1016/j.eswa.2007.07.033

Expert Systems with Applications

Volume 35, Issue 3, October 2008, Pages 905-917

https://doi.org/10.1016/j.eswa.2007.07.033

Abstract

In this paper, we present a new method to generate weighted fuzzy rules using genetic algorithms for estimating null values in relational database systems, where there are negative functional dependency relationships between attributes. The proposed method can get higher average estimated accuracy rates than the method presented in [Chen, S. M., & Huang, C. M. (2003). Generating weighted fuzzy rules from relational database systems for estimating null values using genetic algorithms. IEEE Transactions on Fuzzy Systems, 11(4), 495–506].

Introduction

In traditional relational database systems, there are functional dependency relationships among attributes. For example, assume that there is a relation R with attributes A and B. If the value of attribute B of a tuple increases as the value of attribute A of the tuple increases, then we say that there is a positive dependency relationship between attribute A and attribute B. On the other hand, if the value of attribute B of a tuple decreases as the value of attribute A of the tuple increases, then we say that there is a negative dependency relationship between attribute A and attribute B. In recent years, relational database systems have been widely used in enterprises. However, a relational database system will not operate properly if some attributes in the system have null values. Cheng and Wang (2006) pointed out that a basic problem with null values is that they have many plausible interpretations. They also pointed out that the various manifestations of null values can be reduced to two basic interpretations (Zaniolo, 1984):

(1) The unknown interpretation: A value exists but it is not known.
(2) The nonexistent interpretation: A value does not exist.
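
As a rough illustration of the positive and negative dependency relationships described earlier in this section, the following Python sketch labels the relationship between two numeric attributes by the sign of their sample covariance. The function name `dependency_direction`, the use of covariance as a proxy for "B increases or decreases when A increases", and the sample data are assumptions made for illustration; they are not taken from the paper.

```python
from statistics import mean

def dependency_direction(a_values, b_values):
    """Label the relationship between two numeric attributes.

    Assumption: the sign of the sample covariance is used as a simple
    proxy for "B increases/decreases when A increases".
    """
    if len(a_values) != len(b_values) or len(a_values) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_a, mean_b = mean(a_values), mean(b_values)
    covariance = sum((a - mean_a) * (b - mean_b)
                     for a, b in zip(a_values, b_values))
    if covariance > 0:
        return "positive"
    if covariance < 0:
        return "negative"
    return "none"

# Hypothetical data: car age in years and price in arbitrary units.
ages = [1, 3, 5, 8]
prices = [90, 70, 55, 30]
print(dependency_direction(ages, prices))  # prints "negative"
```

In this made-up sample the price falls as the age rises, which is the kind of negative dependency relationship the proposed method is meant to handle.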

In recent years, several methods (Chen and Chen, 2000; Chen and Yeh, 1997; Chen and Huang, 2003; Chen and Lee, 2003; Chen and Lee, 2005; Chen and Hsiao, 2005; Cheng and Wang, 2006) have been presented to estimate null values in relational database systems based on fuzzy set theory (Zadeh, 1965; Chen, 1988).

Chen and Chen (2000) presented a method to estimate null values in distributed relational database systems, where an “employee database” is used to illustrate their method for estimating the null values of the attributes “Degree” and “Salary”, respectively. However, the method of Chen and Chen (2000) has a drawback: the fuzzy rules are given directly by experts and are not generated by the system automatically. Chen and Yeh (1997) presented a method to estimate null values in relational database systems by generating fuzzy rules from relational database systems. They proposed a fuzzy concept learning system (FCLS) algorithm to construct a fuzzy decision tree from the “employee database” and then generate fuzzy rules automatically from the constructed fuzzy decision tree for estimating the null values of the attribute “Salary” of the employee database. Chen and Huang (2003) presented a method to estimate null values in relational database systems using fuzzy set theory and genetic algorithms (Holland, 1975) to adjust the weights of the attributes in the antecedent parts of the generated fuzzy rules, where the “employee database” is used to illustrate their method for estimating the null values of the attribute “Salary”. Chen and Lee (2003) presented a method to generate fuzzy rules from relational database systems for estimating null values based on the statistical concepts of the “coefficient of determination” and “regression equations”, where the “employee database” is used to illustrate their method for estimating the null values of the attribute “Salary”. Chen and Lee (2005) presented a method for estimating null values in relational database systems based on genetic algorithms, which tunes the membership functions of the linguistic values of the attributes in the “employee database” for estimating the null values of the attribute “Salary”. Chen and Hsiao (2005) presented a method to estimate null values in relational database systems based on automatic clustering techniques, where the “employee database” is used to illustrate their method for estimating the null values of the attribute “Salary”. Cheng and Wang (2006) presented an approach for estimating null values in relational database systems using clustering techniques, where the “employee database” is used to illustrate their method for estimating the null values of the attribute “Salary”. However, these methods do not consider the situation in which there are negative dependency relationships between attributes. Therefore, it is necessary to develop a new method for estimating null values in relational database systems in which there are negative dependency relationships between attributes.

In this paper, we present a new method to generate weighted fuzzy rules using genetic algorithms for estimating null values in relational database systems having negative functional dependency relationships between attributes. The difference between the proposed method and the existing methods is that it uses genetic algorithms rather than clustering techniques for estimating null values in relational database systems. The proposed method achieves higher average estimated accuracy rates than the method presented by Chen and Huang (2003).

The rest of this paper is organized as follows. In Section 2, we briefly review the basic concepts of genetic algorithms (Holland, 1975). In Section 3, we present a method to estimate null values in relational database systems by tuning the weights of attributes. In Section 4, we use the “Benz secondhand car database” (Huang & Chen, 2002) in an experiment that compares the average estimated error rate of the proposed method with that of the method presented by Chen and Huang (2003). The conclusions are discussed in Section 5.

Section snippets

Basic concepts of genetic algorithms

The concept of genetic algorithms, proposed by Holland (1975), is based on the theory of evolution proposed by Charles Darwin. A genetic algorithm searches for optimum solutions to a problem in a way similar to the evolution process of a species. In a genetic algorithm, we encode the parameters of a solution into a numerical stream, which is called a chromosome. The basic element of a chromosome is called a gene. A genetic algorithm uses a fitness function to calculate the degree
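
To make the terms chromosome, gene, and fitness function concrete, here is a minimal generic genetic algorithm in Python. It is only a sketch under stated assumptions: real-valued genes in [0, 1], tournament selection, single-point crossover, random-reset mutation, and elitism are common textbook choices, not necessarily the operators used in the paper. All names (`run_ga`, `select`, `toy_fitness`, `TARGET`) and parameter values are hypothetical.

```python
import random

def select(population, fitness, rng):
    """Tournament selection: return the fitter of two random chromosomes."""
    return max(rng.sample(population, 2), key=fitness)

def run_ga(fitness, n_genes, pop_size=20, generations=50,
           crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Evolve chromosomes (lists of genes in [0, 1]) to maximize `fitness`."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_population = [best[:]]  # elitism: carry over the best chromosome
        while len(next_population) < pop_size:
            parent_a = select(population, fitness, rng)
            parent_b = select(population, fitness, rng)
            child = parent_a[:]
            if n_genes > 1 and rng.random() < crossover_rate:
                # Single-point crossover between the two parents.
                point = rng.randrange(1, n_genes)
                child = parent_a[:point] + parent_b[point:]
            # Mutation: occasionally replace a gene with a fresh random value.
            child = [rng.random() if rng.random() < mutation_rate else gene
                     for gene in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best

# Toy fitness (an assumption): closeness of the genes to a made-up target.
TARGET = [0.2, 0.9, 0.5]

def toy_fitness(chromosome):
    return -sum((g - t) ** 2 for g, t in zip(chromosome, TARGET))

best = run_ga(toy_fitness, n_genes=3)
print(best, toy_fitness(best))
```

Because the best chromosome is copied unchanged into each new generation, the best fitness in this sketch never decreases from one generation to the next.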

A new method for estimating null values in relational database systems using genetic algorithms

In this section, we present a new method to generate weighted fuzzy rules using a genetic algorithm for estimating null values in relational database systems, where there are negative functional dependency relationships between attributes. In a genetic algorithm, we must define the format of a chromosome in a population. For example, we use the “Secondhand Cars” relation shown in Table 1 to describe how to define the chromosomes. Fig. 1 shows the membership functions of the linguistic terms

Experimental results

Assume that there is a relation in a relational database containing a null value as shown in Table 4, where Table 4 is derived from Table 1 by letting the value of the attribute “Price” of tuple T₁ be a null value.

In order to estimate the null value of the attribute “Price” of the tuple T₁ whose Car-ID is 1, we must find a tuple that is closest to tuple T₁. The process for computing the degree of closeness of the tuple T₁ with respect to the other tuples is described as follows. Take tuple T₂
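
The closeness computation itself is not reproduced in this snippet, so the following Python code is only a generic sketch of the idea of estimating a null value from the closest tuple. The distance measure (a weighted sum of absolute differences over attributes pre-scaled to [0, 1]), the attribute names `age` and `mileage`, the weights, and the data are all hypothetical and should not be read as the paper's formulas or as values from its tables.

```python
def closest_tuple(target, candidates, weights):
    """Return the candidate with the smallest weighted distance to `target`.

    Assumption: attributes are pre-scaled to [0, 1] and the distance is a
    weighted sum of absolute differences; the paper's measure may differ.
    """
    def distance(row):
        return sum(w * abs(row[name] - target[name])
                   for name, w in weights.items())
    return min(candidates, key=distance)

# Hypothetical, pre-scaled tuples; the "price" of T1 is the null value.
t1 = {"car_id": 1, "age": 0.20, "mileage": 0.30, "price": None}
others = [
    {"car_id": 2, "age": 0.25, "mileage": 0.35, "price": 80.0},
    {"car_id": 3, "age": 0.70, "mileage": 0.90, "price": 35.0},
    {"car_id": 4, "age": 0.10, "mileage": 0.60, "price": 70.0},
]
weights = {"age": 0.6, "mileage": 0.4}  # made-up attribute weights

nearest = closest_tuple(t1, others, weights)
estimated_price = nearest["price"]  # simplest estimate: copy the neighbor's value
print(nearest["car_id"], estimated_price)  # prints "2 80.0"
```

In a weighted scheme of this kind, the attribute weights are the quantities a genetic algorithm would tune, with a fitness function that rewards low estimation error on training tuples.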

Conclusions

In this paper, we have presented a new method to generate weighted fuzzy rules using genetic algorithms for estimating null values in the “Benz Secondhand Car” database, where there are negative functional dependency relationships between attributes. From Table 6, we can see that the proposed method has smaller average estimated error rates than the method presented by Chen and Huang (2003) with respect to different numbers of training instances and different numbers of testing instances. That
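
For readers who want to see how an average estimated error rate of this kind could be computed, here is a small Python sketch. It assumes the error of one estimate is the absolute difference between the estimated and actual values divided by the actual value, and that the average is taken over all test tuples. This definition, the function name, and the numbers are assumptions for illustration and are not values from Table 6.

```python
def average_estimated_error_rate(actual, estimated):
    """Mean of |estimated - actual| / |actual| over all test tuples.

    Assumption: this relative-error definition is a common one; the paper's
    exact definition is not shown in this excerpt.
    """
    if len(actual) != len(estimated) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    errors = [abs(e - a) / abs(a) for a, e in zip(actual, estimated)]
    return sum(errors) / len(errors)

# Hypothetical prices: true values and the values estimated for them.
actual_prices = [100.0, 80.0, 50.0]
estimated_prices = [110.0, 76.0, 50.0]
rate = average_estimated_error_rate(actual_prices, estimated_prices)
print(f"{rate:.2%}")  # prints "5.00%"
```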

Acknowledgement

This work was supported in part by the National Science Council, Republic of China, under Grant NSC 95-2221-E-011-117-MY2.

References (19)

S.M. Chen et al., A new method to estimate null values in relational database systems based on automatic clustering techniques, Information Sciences (2005)
L.A. Zadeh, Fuzzy sets, Information and Control (1965)
C. Zaniolo, Database relations with null values, Journal of Computer and System Sciences (1984)
S.M. Chen, A new approach to handling fuzzy decisionmaking problems, IEEE Transactions on Systems, Man, and Cybernetics (1988)
S.M. Chen et al., Estimating null values in the distributed relational databases environment, Cybernetics and Systems (2000)
S.M. Chen et al., Generating fuzzy rules from relational database systems for estimating null values, Cybernetics and Systems (1997)
S.M. Chen et al., Generating weighted fuzzy rules from relational database systems for estimating null values using genetic algorithms, IEEE Transactions on Fuzzy Systems (2003)
S.M. Chen et al., A new method to generate fuzzy rules from relational database systems for estimating null values, Cybernetics and Systems (2003)
S.M. Chen et al., Estimating null values in relational database systems based on genetic algorithms, Cybernetics and Systems (2005)

There are more references available in the full text version of this article.
