The construction of fuzzy least squares estimators in fuzzy linear regression models

https://doi.org/10.1016/j.eswa.2011.04.131

Abstract

A new concept and method for incorporating imprecise (fuzzy) input and output data into the conventional linear regression model is proposed. Taking fuzzy parameters and fuzzy arithmetic operations (fuzzy addition and multiplication) into account, we propose a fuzzy linear regression model that has a form similar to that of the conventional one. We derive the h-level (conventional) linear regression models of the fuzzy linear regression model so that the statistical techniques of (conventional) linear regression analysis for real-valued data can be invoked. In order to determine the sign (nonnegativity or nonpositivity) of the fuzzy parameters, we perform statistical hypothesis tests and evaluate confidence intervals. Using the least squares estimators obtained from the h-level linear regression models, we construct the membership functions of the fuzzy least squares estimators via the form of the “Resolution Identity”, which is well known in fuzzy set theory. In order to obtain the membership degree of any given estimate taken from the fuzzy least squares estimator, optimization problems have to be solved. We also provide two computational procedures to deal with those optimization problems.

Highlights

► The fuzzy least squares estimators are constructed via the “Resolution Identity”. ► We derive the h-level (conventional) linear regression models. ► The signs of the fuzzy parameters are determined by hypothesis testing. ► Membership degrees can be obtained by solving optimization problems. ► Computational procedures are provided to obtain the membership degrees.

Introduction

In the real world, data sometimes cannot be recorded or collected precisely because of human errors, machine errors, or other unexpected situations. For instance, the water level of a river cannot be measured exactly because it fluctuates, and the temperature in a room cannot be measured precisely for similar reasons. In such situations, fuzzy set theory is a natural tool for statistical models in which fuzzy data have been observed. A more appropriate way to describe the water level is to say that it is around 30 m. The phrase “around 30 m” can be regarded as a fuzzy number $\tilde{30}$. This is the main concern of this paper.

Since Zadeh (1965) introduced the concept of fuzzy sets, applications of fuzzy data to regression models have been proposed in the literature. Tanaka, Uejima, and Asai (1982) initiated this research topic. Their approach was generalized to broader models in Tanaka and Watada (1988), Tanaka et al. (1989), and Tanaka and Ishibuchi (1991). The book on fuzzy regression analysis edited by Kacprzyk and Fedrizzi (1992) gave an insightful survey. Chang and Ayyub (2001) described the differences between fuzzy regression and conventional regression analysis, and Kim, Moskowitz, and Koksalan (1996) compared fuzzy regression and statistical regression both conceptually and empirically.

Tanaka et al. (1982) considered LR fuzzy data and minimized the index of fuzziness of the fuzzy linear regression model. Redden and Woodall (1994) compared various fuzzy regression models and described the differences between fuzzy regression analysis and conventional regression analysis. They also pointed out some weaknesses of the approaches proposed by Tanaka et al. Chang and Lee (1994) pointed out another weakness of those approaches. Peters (1994) introduced a new fuzzy linear regression model based on Tanaka’s approach by considering a fuzzy linear programming problem. Moskowitz and Kim (1993) proposed a method to assess the H-value in the fuzzy linear regression model proposed by Tanaka et al. Wang and Tsaur (2000) proposed a new model to improve the predictability of Tanaka’s model. In this paper, we propose a fuzzy linear regression model, and then create the h-level linear regression models by taking the h-level sets of the fuzzy linear regression model. We shall see that the h-level linear regression models are conventional linear regression models. Therefore, the statistical techniques of conventional linear regression analysis can be invoked to study the h-level linear regression models.

In the least squares setting, Chang (2001) proposed a method for hybrid fuzzy least squares regression by defining weighted fuzzy arithmetic and using the well-accepted least squares fitting criterion. Celminš (1987, 1991) proposed a methodology for fitting differentiable fuzzy model functions by minimizing a least squares objective function. Chang and Lee (1996) proposed a fuzzy regression technique based on the least squares approach to estimate the modal value and the spreads of LR fuzzy numbers. Jajuga (1986) calculated linear fuzzy regression coefficients using a generalized version of the least squares method, considering a fuzzy classification of a set of observations and obtaining homogeneous classes of observations. In this paper, the least squares estimators are obtained from the h-level linear regression models. Using these least squares estimators, we construct fuzzy least squares estimators via the form of the “Resolution Identity”, which was introduced by Zadeh et al. (1975) and is well known in fuzzy set theory.
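For orientation, the “Resolution Identity” mentioned above is commonly stated in the following form. This is the standard textbook statement rather than a quotation from this paper; the symbols $\tilde{A}$ for a fuzzy set, $\xi_{\tilde{A}}$ for its membership function, and $\tilde{A}_h$ for its h-level set are notational assumptions made here.

```latex
% h-level set of a fuzzy set \tilde{A} with membership function \xi_{\tilde{A}}
\tilde{A}_h = \{\, x : \xi_{\tilde{A}}(x) \ge h \,\}, \qquad h \in (0,1],

% Resolution Identity: the membership function is recovered from the level sets
\xi_{\tilde{A}}(x) = \sup_{h \in [0,1]} h \cdot 1_{\tilde{A}_h}(x),
```

where $1_{\tilde{A}_h}$ is the indicator function of the level set. Read this way, specifying a family of h-level sets is enough to define a membership function, which is the sense in which estimators computed level by level can be assembled into a fuzzy estimator.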

Regarding optimization approaches, Sakawa and Yano (1992) introduced three indices for equalities between fuzzy numbers. From these three indices, three types of multiobjective programming problems were formulated. Tanaka and Lee (1998) used a quadratic programming approach to obtain the possibility and necessity regression models simultaneously. The advantage of a quadratic programming approach is that it integrates the central tendency property of least squares with the possibilistic property of fuzzy regression. In this paper, in order to obtain the membership value (confidence degree) of any given estimate taken from the fuzzy least squares estimator, optimization problems have to be solved. We also provide two computational procedures to solve those optimization problems.

There are also other interesting articles concerning fuzzy regression analysis. Näther (1997, 2000), Näther and Albrecht (1990), and Körner et al. (1998) introduced the concept of random fuzzy sets (fuzzy random variables) into the linear regression model and developed an estimation theory for the parameters. Dunyak and Wunsch (2000) described a method for nonlinear fuzzy regression using a special training technique for fuzzy number neural networks. Kim and Bishu (1998) used a criterion of minimizing the difference of the membership degrees between the observed and estimated fuzzy numbers. Yager (1982) used a linguistic variable to represent imprecise information in regression models. Bárdossy (1990) proposed several measures of fuzziness to be minimized subject to suggested constraints.

In Section 2, we give some properties of fuzzy numbers. In Section 3, the techniques for solving fuzzy linear regression problems are proposed. We focus on the h-level linear regression models of the fuzzy linear regression model, and then apply conventional linear regression techniques to solve them. The membership functions of the fuzzy least squares estimators in the fuzzy linear regression model are constructed according to the form of the “Resolution Identity” in fuzzy set theory. In Section 4, we develop two computational procedures to obtain the membership degree of any given estimate taken from the fuzzy least squares estimators. We also provide an example to clarify the theoretical results and to show possible applications in linear regression analysis for imprecise data.

Section snippets

Fuzzy numbers

Let X be a universal set and A be a subset of X. Then the indicator (characteristic) function $1_A$ defined by
$$1_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{otherwise} \end{cases}$$
can be used to represent the subset A of X. A fuzzy subset $\tilde{A}$ of X proposed by Zadeh (1965) is defined by its membership function $\xi_{\tilde{A}} : X \to [0,1]$. We see that the concept of membership function is an extension of the indicator function $1_A$ of A, since the indicator function $1_A$ can also be regarded as a membership function of A. In this case, the indicator function is

Fuzzy linear regression analysis

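The linear regression model is displayed as follows:
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1} + \varepsilon_i$$
for $i = 1, \ldots, n$, where $\varepsilon_i$ are independent normal random variables with expectation $E(\varepsilon_i) = 0$ and variance $V(\varepsilon_i) = \sigma^2$. Let
$$X = \begin{pmatrix} 1 & X_{11} & \cdots & X_{1,p-1} \\ 1 & X_{21} & \cdots & X_{2,p-1} \\ \vdots & \vdots & & \vdots \\ 1 & X_{n1} & \cdots & X_{n,p-1} \end{pmatrix} \quad \text{and} \quad Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}.$$
It is well known that the least squares estimators are given by
$$\hat{\beta} = (X^t X)^{-1} X^t Y,$$
where $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_{p-1})^t$.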
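The least squares formula above can be checked numerically. The following is a minimal sketch assuming NumPy is available; the function name `least_squares_estimates` and the small data set are hypothetical and chosen only for illustration.

```python
import numpy as np


def least_squares_estimates(X, Y):
    """Return beta_hat = (X^t X)^{-1} X^t Y.

    X is the n-by-p design matrix whose first column is all ones;
    Y is the response vector of length n.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Solve the normal equations (X^t X) beta = X^t Y instead of
    # forming the inverse explicitly, which is numerically safer.
    return np.linalg.solve(X.T @ X, X.T @ Y)


# Hypothetical, noise-free data with n = 5 observations and p = 2 parameters.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])  # columns: intercept, X_{i1}
Y = 2.0 + 3.0 * x                          # true beta_0 = 2, beta_1 = 3

beta_hat = least_squares_estimates(X, Y)
```

Because the hypothetical data contain no noise, `beta_hat` recovers the intercept and slope used to generate `Y` up to rounding error. With real data, `np.linalg.lstsq` is a common alternative when $X^t X$ is ill-conditioned.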

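Now we consider the fuzzy linear regression model as follows:
$$\tilde{Y}_i = \tilde{\beta}_0 \oplus \tilde{\beta}_1 \otimes \tilde{X}_{i1} \oplus \tilde{\beta}_2 \otimes \tilde{X}_{i2} \oplus \cdots \oplus \tilde{\beta}_{p-1} \otimes \tilde{X}_{i,p-1} \oplus \tilde{1}_{\{\varepsilon_i\}},$$
where $\tilde{Y}_i$ and $\tilde{X}_{ij}$ are nonnegative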

Computational methods and example

Given a least squares estimate r of the fuzzy least squares estimator $\hat{\beta}_{j;1-\alpha}$, we wish to know its membership degree h. If the decision-makers are comfortable with this membership degree h, then it is reasonable to take the value r as the estimate of $\beta_j$. In this case, the decision-makers can accept the value r as the estimate of $\beta_j$ with confidence degree h and confidence coefficient $1 - \alpha$.
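To make the idea of a membership degree concrete, here is a simplified sketch that is not one of the paper's two computational procedures. It assumes that the h-level sets of a fuzzy estimator are nested closed intervals $[l(h), u(h)]$, so that the membership degree of r is the largest h whose interval still contains r. The names `membership_degree`, `lower`, and `upper`, as well as the triangular example, are hypothetical.

```python
def membership_degree(r, lower, upper, tol=1e-8):
    """Approximate sup{h in [0, 1] : lower(h) <= r <= upper(h)}.

    Assumption (not taken from the paper): the intervals
    [lower(h), upper(h)] shrink as h increases, so the set of
    admissible h is an interval starting at 0 and bisection applies.
    """
    def contains(h):
        return lower(h) <= r <= upper(h)

    if not contains(0.0):
        return 0.0  # r lies outside the widest level set
    if contains(1.0):
        return 1.0  # r lies in the core
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if contains(mid):
            lo = mid  # r is still inside at level mid, so move up
        else:
            hi = mid  # r has dropped out, so move down
    return lo


# Hypothetical triangular fuzzy number with support [1, 4] and peak at 2:
# its h-level set is [1 + h, 4 - 2h].
def lower(h):
    return 1.0 + h


def upper(h):
    return 4.0 - 2.0 * h


h_left = membership_degree(1.5, lower, upper)
h_peak = membership_degree(2.0, lower, upper)
h_right = membership_degree(3.0, lower, upper)
h_out = membership_degree(5.0, lower, upper)
```

When the level-wise endpoints are not monotone in h, the nestedness assumption fails and bisection is no longer valid. In that case the supremum has to be found by a genuine optimization over h.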

In order to obtain the confidence degree (membership degree) h of any given value r of $\hat{\beta}_{j;1-\alpha}$, it is

Conclusions

A fuzzy linear regression model is proposed in this paper for fuzzy input and output data. In order to apply the conventional techniques of linear regression, we propose the lower and upper h-level linear regression models. Since those two models are conventional linear regression models, we can naturally obtain the least squares estimators of the lower and upper h-level linear regression models, respectively, using formula (2). In order to determine the nonnegativity

References (32)

  • D.T. Redden et al. Properties of certain fuzzy linear regression methods. Fuzzy Sets and Systems (1994)
  • M. Sakawa et al. Multiobjective fuzzy linear regression analysis for fuzzy input–output data. Fuzzy Sets and Systems (1992)
  • H. Tanaka et al. Possibilistic linear system and their application to the linear regression model. Fuzzy Sets and Systems (1988)
  • H. Tanaka et al. Possibilistic linear regression analysis for fuzzy data. European Journal of Operational Research (1989)
  • H. Tanaka et al. Identification of possibilistic linear systems by quadratic membership functions of fuzzy parameters. Fuzzy Sets and Systems (1991)
  • H.-F. Wang et al. Resolution of fuzzy regression model. European Journal of Operational Research (2000)