Stochastics and Statistics
Real-time fuzzy regression analysis: A convex hull approach

https://doi.org/10.1016/j.ejor.2010.10.007

Abstract

In this study, we present an enhancement of fuzzy regression analysis aimed at real-time processing. Recall that fuzzy regression generalizes classical (numeric) regression by adding capabilities that allow the model to deal with fuzzy (granular) data. We show that a convex hull method provides a useful vehicle for reducing computing time, which is of particular relevance in real-time data analysis. Our objective is to develop an efficient real-time fuzzy regression analysis based on a convex hull, specifically on the Beneath-Beyond algorithm. In this algorithm, the reconstruction of convex hull edges depends on incoming vertices, and the re-computing procedure can be realized in real time. We demonstrate the developed enhancement on unit performance assessment and air pollution data. The important role of the convex hull is contrasted with the limitations of linear programming used in the “standard” regression.

Introduction

In real-world optimization problems, which we routinely encounter in engineering, management, economics, medicine, psychology, and other disciplines, it is quite common to handle large amounts of various types of data (Tanaka et al., 1982, Bohm and Kriegel, 2000, Watada et al., 2001, Olafsson et al., 2008). In particular, real-time data analysis is becoming more important given the growing demand to support efficient managerial practices, which call for timely, on-line (real-time) results of data analysis (Gould et al., 2008). The main intent in this setting is to reduce the computing overhead of supplying results in real time.

Soft computing techniques such as fuzzy logic have become an important alternative for realizing effective data analysis, especially in decision-making processes (Watada et al., 2001, He et al., 2007, Kumar and Ravi, 2007). Regression analysis, in turn, is a generic statistical tool for exploring and describing dependencies among variables. Forming a synergy between these two essential modeling methodologies leads to fuzzy regression, which takes full advantage of the strengths of the contributing technologies; cf. Wang and Tsaur (2000). Linear programming (LP) has been used to determine the centres and the spreads of the fuzzy coefficients (fuzzy numbers) of the fuzzy regression hyperplane by minimizing an objective function that accounts for the total spread of the model outputs treated as fuzzy numbers (Tanaka et al., 1982, Sakawa and Yano, 1992, Yao and Yu, 2006).
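To make the LP formulation mentioned above concrete, the following is a sketch of the form in which the possibilistic linear regression problem is commonly written. It is not taken from this paper. The notation is an assumption introduced here for illustration: n observations with crisp inputs x_{ij} (with x_{i0} = 1) and crisp outputs y_i, symmetric triangular fuzzy coefficients with centre \alpha_j and spread c_j, and a fitting threshold h in [0, 1).

```latex
\begin{aligned}
\min_{\alpha,\,c}\quad & J = \sum_{i=1}^{n} \sum_{j=0}^{K} c_j \,\lvert x_{ij} \rvert \\
\text{subject to}\quad
& \sum_{j=0}^{K} \alpha_j x_{ij} + (1-h) \sum_{j=0}^{K} c_j \,\lvert x_{ij} \rvert \;\ge\; y_i, \\
& \sum_{j=0}^{K} \alpha_j x_{ij} - (1-h) \sum_{j=0}^{K} c_j \,\lvert x_{ij} \rvert \;\le\; y_i,
\qquad i = 1, \dots, n, \\
& c_j \ge 0, \qquad j = 0, \dots, K.
\end{aligned}
```

Under these assumptions each observation contributes two inequality constraints, so the constraint set grows as 2n with the sample size.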

Let us recall that the convex hull is a fundamental concept present in many applications encountered in pattern recognition, image processing, and statistics. The convex hull of a point set is defined as the smallest convex polygon located in a multidimensional data space that contains all points (vertices) of the set (Shapiro, 2004). In other words, the convex hull corresponds to the intuitive notion of a “boundary” of a set of points and as such can be used to approximate the shape of any object of complex geometry.
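The definition above can be illustrated with a minimal two-dimensional sketch. It uses Andrew's monotone chain method, which is a standard textbook technique and not the algorithm adopted in this paper. The function names cross and convex_hull are hypothetical names chosen for the illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return hull vertices in counter-clockwise order (monotone chain, 2D only)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop the last vertex while it makes a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


sample = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
print(convex_hull(sample))  # the interior point (1, 1) is not a hull vertex
```

Only the four corner points are reported as vertices; the interior point does not contribute to the boundary.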

In real-world problems such as those encountered in economics, bio-computing, or engineering, we are concerned with massive data sets of high dimensionality that must be analyzed in a limited time or even in real time (Taylor, 2008). In addition to experimental evidence of a numeric nature, some data can be described in linguistic terms, which immediately invokes the concept of fuzzy sets (Chang and Ayyub, 2001).

Statistical regression has been shown to be highly relevant in real-time applications, yet we can also observe some shortcomings, which arise when dealing with several characteristics of the data or when making simplifying yet not necessarily fully legitimate assumptions, cf. (Shapiro, 2004):

  • i. Difficulties with a thorough verification of assumptions about data distributions,

  • ii. Vagueness present in the relationships between input and output variables,

  • iii. Ambiguity of events or non-Boolean degrees to which they occur, and

  • iv. Inaccuracy and distortion introduced by linearization.

In all these scenarios, the use of the “standard” regression might raise some hesitation. Here fuzzy regression arises as a viable alternative. There are two general classes of models that guide the development of fuzzy regression (Dom, 2007):

  • i. Models where the relationships among the variables are inherently fuzzy, and

  • ii. Models where the input (independent) variables themselves are fuzzy.

Some arguments regarding fuzzy regression are worth highlighting. As an example, Wang and Tsaur (2000) concluded that there is no proper interpretation of the fuzzy regression interval. Besides, an expert could still provide an interval of possible values and also indicate the probability of occurrence of each of them. Obviously, this approach would require more information (Aznar and Guijarro, 2007, Guo and Tanaka, 2010).

Given the explanation presented above, a convex hull approach can help implement real-time fuzzy regression analysis by serving as an alternative optimization vehicle.

The main objective of this research is to enhance the implementation procedure of real-time fuzzy regression analysis with the use of the convex hull approach. In addition, the adaptation of a selected algorithm of this approach, specifically the Beneath-Beyond algorithm, helps address the limitations of the generic implementation of fuzzy regression when applied to the analysis of real-time data. Unit performance assessment and air pollution evaluation serve as two numeric examples that illustrate the performance of the proposed approach.

The paper is organized as follows. Section 2 reviews the related literature, covering real-time data analysis, the fundamentals of fuzzy regression, the LP approach, computational geometry and the convex hull approach, and related research studies. Next, Section 3 presents the real-time fuzzy regression model realized with the use of the convex hull approach. Section 4 is devoted to empirical experiments. Finally, Section 5 presents concluding remarks.

Real-time data analysis processing

Essentially, real-time data analysis refers to studies where data revisions (updates, successive data accumulation) or data release timing is important to a significant degree (Guide Jr., 2006). The most important properties for real-time data analysis are dynamic analysis and reporting, based on data entered into a system in a short interval before the actual time of the usage of the results (Ramli and Watada, 2009).

An important notion in real-time systems is the event, that is, any occurrence

Real-time fuzzy regression analysis with a convex hull algorithm

In what follows, we present an adaptation of a convex hull algorithm, specifically the Beneath–Beyond algorithm, for the purpose of fuzzy regression analysis.
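The idea of updating a hull as vertices arrive can be sketched in two dimensions as follows. This is a simplified stand-in and not the Beneath-Beyond algorithm itself: a faithful implementation would locate the facets visible from the new point and replace only those, whereas this sketch rebuilds the hull from the current hull vertices plus the new point. It does show the property that matters for real-time use, namely that a point falling inside the current hull requires no re-computation. All function names (cross, hull, is_inside, add_point) are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull(points: List[Point]) -> List[Point]:
    """Monotone chain hull; vertices are returned in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    chains = []
    for seq in (pts, pts[::-1]):
        chain: List[Point] = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        chains.append(chain[:-1])
    return chains[0] + chains[1]


def is_inside(hull_pts: List[Point], p: Point) -> bool:
    """True if p lies inside or on a counter-clockwise convex polygon."""
    n = len(hull_pts)
    return all(cross(hull_pts[i], hull_pts[(i + 1) % n], p) >= 0 for i in range(n))


def add_point(hull_pts: List[Point], p: Point) -> Tuple[List[Point], bool]:
    """Return (hull, changed). Interior points leave the hull untouched."""
    if len(hull_pts) >= 3 and is_inside(hull_pts, p):
        return hull_pts, False
    # Simplification: rebuild from the old hull vertices plus the new point.
    return hull(hull_pts + [p]), True


current = hull([(0, 0), (2, 0), (2, 2), (0, 2)])
current, changed_a = add_point(current, (1, 1))  # interior point, no update
current, changed_b = add_point(current, (3, 1))  # exterior point, hull grows
print(changed_a, changed_b, current)
```

Because only hull vertices are retained between updates, the cost of each update in this sketch depends on the number of hull vertices and not on the total number of observations seen so far.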

Numerical studies

In order to present the process of real-time data analysis, we selected two sets of real-world data, which were obtained from two different problems. Both selected data samples are divided into two groups to simulate a real-time situation: the first group is analyzed at the beginning of the procedure, and the remaining data are added next, which mimics the real-time scenario. In addition, we also developed the ordinary regression for

Comparative analysis and discussion

An increase in sample size might cause computational difficulties in the implementation of the LP problem. Another problem might emerge when the variables themselves change, because the entire set of constraints must then be reformulated, which increases the computing complexity. The proposed method alleviates this increase in computing complexity.

To highlight the main features of regression and fuzzy regression model, we summarized the results in Table 7

Concluding remarks

We have developed an enhancement of fuzzy regression that implements the convex hull method, specifically the Beneath-Beyond algorithm. In real-time processing, where we are faced with variable amounts of data, the proposed algorithm may perform fuzzy regression by reconstructing particular edges and considering new vertices for which the re-computing takes place.

We completed the enhancement of the convex hull in order to handle a real-time fuzzy regression. Several points are worth stressing

Acknowledgements

Special thanks go to the reviewers for providing constructive suggestions to improve the presentation of this study. The first author was supported by the Ministry of Higher Education Malaysia (MOHE) under the Skim Latihan Akademik IPTA (SLAI-UTHM) scholarship program at the Graduate School of Information, Production and Systems, Waseda University, Fukuoka, Japan.

References (41)

  • S. Olafsson et al.

    Operations research and data mining

    European Journal of Operational Research

    (2008)
  • M. Sakawa et al.

    Fuzzy linear regression analysis for fuzzy input-output data

    Information Sciences

    (1992)
  • C. Stahl

    A strong consistent least-squares estimator in a linear fuzzy regression model with fuzzy parameters and fuzzy dependent variables

    Fuzzy Sets and Systems

    (2006)
  • H. Tanaka et al.

    Portfolio selections based on upper and lower exponential possibility distributions

    European Journal of Operational Research

    (1999)
  • H.-F. Wang et al.

    Insight of a fuzzy regression model

    Fuzzy Sets and Systems

    (2000)
  • W. Wang et al.

    Incident detection algorithm based on partial least squares regression

    Transportation Research Part C: Emerging Technologies

    (2008)
  • C.-W. Wu

    Decision-making in testing process performance with fuzzy data

    European Journal of Operational Research

    (2009)
  • M.-S. Yang et al.

    Fuzzy least-squares linear regression analysis for fuzzy input-output data

    Fuzzy Sets and Systems

    (2002)
  • C.-C. Yao et al.

    Fuzzy regression based on asymmetric support vector machines

    Applied Mathematics and Computation

    (2006)
  • P.-S. Yu et al.

    Support vector regression for real-time flood stage forecasting

    Journal of Hydrology

    (2006)