Stochastics and Statistics
Real-time fuzzy regression analysis: A convex hull approach
Introduction
In real-world optimization problems, which we routinely encounter in engineering, management, economics, medicine, psychology, and other disciplines, it is quite common to handle large amounts of data of various types (Tanaka et al., 1982, Bohm and Kriegel, 2000, Watada et al., 2001, Olafsson et al., 2008). In particular, real-time data analysis is becoming more important given the growing demand to support efficient managerial practices, which call for timely, on-line (real-time) results of data analysis (Gould et al., 2008). The main intent in this setting is to reduce the computing overhead of supplying results in real time.
Soft computing techniques such as fuzzy logic have become an important alternative for realizing effective data analysis, especially in decision-making processes (Watada et al., 2001, He et al., 2007, Kumar and Ravi, 2007). On the other hand, regression analysis is a generic statistical tool to explore and describe dependencies among variables. When forming a synergy between these two essential modeling methodologies, we arrive at fuzzy regression, which takes full advantage of the strengths of the contributing technologies; cf. Wang and Tsaur (2000). In fuzzy regression, linear programming (LP) has been used to determine the location of the centres and the spreads of the fuzzy coefficients (fuzzy numbers) of the fuzzy regression hyperplane by minimizing an objective function that takes into consideration the total spread of the outputs of the model treated as fuzzy numbers (Tanaka et al., 1982, Sakawa and Yano, 1992, Yao and Yu, 2006).
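To make the LP description concrete, the following is a sketch of the possibilistic fuzzy linear regression program commonly attributed to Tanaka et al. (1982), written for crisp observations. The notation is an assumption introduced here for illustration and may differ from the paper's own: $(x_i, y_i)$, $i = 1,\dots,n$, are the samples with $x_{i0} = 1$, each fuzzy coefficient is a symmetric triangular fuzzy number with centre $\alpha_j$ and spread $c_j$, and $h \in [0, 1)$ is a chosen degree of fit.

```latex
\begin{aligned}
\min_{\alpha,\,c}\quad & J = \sum_{i=1}^{n}\sum_{j=0}^{K} c_j\,|x_{ij}| \\
\text{s.t.}\quad & \sum_{j=0}^{K}\alpha_j x_{ij} + (1-h)\sum_{j=0}^{K} c_j\,|x_{ij}| \ge y_i, \\
& \sum_{j=0}^{K}\alpha_j x_{ij} - (1-h)\sum_{j=0}^{K} c_j\,|x_{ij}| \le y_i, \qquad i = 1,\dots,n, \\
& c_j \ge 0, \qquad j = 0,\dots,K.
\end{aligned}
```

In this form every sample contributes two constraints, so the program grows linearly with the number of observations.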
Let us recall that the convex hull is a fundamental concept present in many applications encountered in pattern recognition, image processing, and statistics. The convex hull of a point set is defined as the smallest convex polygon (or, in a multidimensional data space, polytope) that contains every point of the set (Shapiro, 2004). In other words, the convex hull corresponds to the intuitive notion of a “boundary” of a set of points and as such can be used to approximate the shape of any object of complex geometry.
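As a small illustration of this definition, the sketch below computes the convex hull of a planar point set. It uses Andrew's monotone chain method, a standard two-dimensional technique chosen here for brevity; it is not the Beneath-Beyond algorithm adopted in the paper, and the names `cross`, `convex_hull`, and `sample` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return the hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))  # sort by x, then y, and drop duplicates
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:  # build the lower chain from left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()  # discard points that do not make a left turn
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):  # build the upper chain from right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


# Illustrative data: four corners of a square, one interior point,
# and one point lying on an edge.
sample = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = convex_hull(sample)
print(hull)
```

Only the four corners survive as vertices; the interior point and the point on the edge are enclosed by the hull but do not define it.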
In real-world problems such as those encountered in economics, bio-computing, or engineering, we are concerned with massive data sets of high dimensionality that must be analyzed in a limited time or even in real time (Taylor, 2008). In addition to experimental evidence of a numeric nature, some data can be described in linguistic terms, which immediately invokes the concept of fuzzy sets (Chang and Ayyub, 2001).
Statistical regression has been shown to be highly relevant to real-time applications, yet we can also observe some shortcomings, which arise when dealing with several characteristics of the data or when making simplifying yet not necessarily fully legitimate assumptions, cf. Shapiro (2004):
- i. Difficulties with a thorough verification of assumptions about data distributions,
- ii. Vagueness present in the relationships between input and output variables,
- iii. Ambiguity of events or non-Boolean degrees to which they occur, and
- iv. Inaccuracy and distortion introduced by linearization.
In all these scenarios, the use of the “standard” regression might raise some hesitation. Here the use of fuzzy regression arises as a viable alternative. There are two general ways supporting the development process of fuzzy regression (Dom, 2007).
- i. Models where the relationship among the variables is inherently fuzzy, and
- ii. Models where the input (independent) variables themselves are fuzzy.
Some arguments regarding fuzzy regression are worth highlighting. As an example, Wang and Tsaur (2000) concluded that there is no proper interpretation of the fuzzy regression interval. Besides, an expert could still provide an interval of possible values and also indicate the probability of occurrence of each of them. Obviously, this approach would require more information (Aznar and Guijarro, 2007, Guo and Tanaka, 2010).
Given the explanation presented above, a convex hull approach can help implement real-time fuzzy regression analysis by serving as an alternative optimization vehicle.
The main objective of this research is to enhance the implementation procedure of real-time fuzzy regression analysis with the use of the convex hull approach. In addition, the adaptation of a selected algorithm of this approach, specifically the Beneath-Beyond algorithm, helps address the limitations of the generic implementation of fuzzy regression when applied to the analysis of real-time data. Two numeric examples, unit performance assessment and air pollution evaluation, illustrate the performance of the proposed approach.
The paper is organized as follows. Section 2 reviews the related literature, covering real-time data analysis, the fundamentals of fuzzy regression, an outline of the LP approach, computational geometry and the convex hull approach, and related research studies. Next, Section 3 presents the real-time fuzzy regression model realized with the use of the convex hull approach. Section 4 is devoted to empirical experiments. Finally, Section 5 presents concluding remarks.
Section snippets
Real-time data analysis processing
Essentially, real-time data analysis refers to studies where data revisions (updates, successive data accumulation) or data release timing is important to a significant degree (Guide Jr., 2006). The most important properties for real-time data analysis are dynamic analysis and reporting, based on data entered into a system in a short interval before the actual time of the usage of the results (Ramli and Watada, 2009).
An important notion in real-time systems is event, that is, any occurrence
Real-time fuzzy regression analysis with a convex hull algorithm
In what follows, we present an adaptation of a convex hull algorithm, specifically the Beneath–Beyond algorithm, for the purpose of fuzzy regression analysis.
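The real-time idea can be illustrated with a simplified two-dimensional sketch: a newly arrived sample that falls inside the current hull leaves it unchanged, while a sample outside it triggers an update that uses only the retained hull vertices and the new point. This is an assumption-laden illustration and not the authors' implementation. A true Beneath-Beyond step would replace only the facets visible from the new point, whereas this sketch rebuilds the hull from the retained vertices for brevity. All function names and data are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def is_inside(hull: List[Point], p: Point) -> bool:
    """True if p lies inside or on a counter-clockwise convex polygon."""
    n = len(hull)
    if n < 3:
        return p in hull
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def update_hull(hull: List[Point], p: Point) -> Tuple[List[Point], bool]:
    """Process one newly arrived sample; return (hull, changed)."""
    if is_inside(hull, p):
        return hull, False  # the sample adds no new vertex
    # Simplification: rebuild from the old vertices plus the new point
    # instead of replacing only the visible edges.
    return convex_hull(hull + [p]), True


# First group of samples, analyzed at the start of the procedure.
hull = convex_hull([(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)])

# Second group, arriving one sample at a time.
hull, changed_inner = update_hull(hull, (1, 1))  # falls inside the hull
hull, changed_outer = update_hull(hull, (6, 1))  # falls outside the hull
print(hull, changed_inner, changed_outer)
```

The design point is that interior samples need not be stored or revisited: only the current vertices are carried forward between updates.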
Numerical studies
In order to present the process of real-time data analysis, we selected two sets of real-world data, which were obtained from two different problems. Both of the selected samples of data are divided into two groups. This process aims to show the simulation of a real-time situation, in which the first group is analyzed at the beginning of the procedure and the remaining sample of data is added next which mimics the real-time scenario. In addition, we also developed the ordinary regression for
Comparative analysis and discussion
An increase in sample size might cause computational difficulties in solving the LP problem. Another problem might emerge when the variables themselves change, in which case the entire set of constraints must be reformulated and the computing complexity therefore increases. The proposed method alleviates this increase in computing complexity.
To highlight the main features of regression and fuzzy regression model, we summarized the results in Table 7
Concluding remarks
We have developed an enhancement of fuzzy regression, which implements the convex hull method, specifically the Beneath-Beyond algorithm. In real-time processing, where we are faced with variable amounts of data, the proposed algorithm can perform fuzzy regression by reconstructing particular edges and considering the new vertices for which re-computing takes place.
We completed the enhancement of the convex hull in order to handle a real-time fuzzy regression. Several points are worth stressing
Acknowledgements
Special thanks go to the reviewers for providing constructive suggestions to improve the presentation of this study. The first author was supported by the Ministry of Higher Education Malaysia (MOHE) under the Skim Latihan Akademik IPTA (SLAI-UTHM) scholarship program at the Graduate School of Information, Production and Systems, Waseda University, Fukuoka, Japan.
References (41)
- et al., Generalized additive modeling of air pollution, traffic volume and meteorology, Atmospheric Environment (2005)
- et al., Estimating regression parameters with imprecise input data in an appraisal context, European Journal of Operational Research (2007)
- et al., Multiple regression with fuzzy data, Fuzzy Sets and Systems (2007)
- et al., Goodness of fit and variable selection in the fuzzy multiple linear regression, Fuzzy Sets and Systems (2006)
- et al., Forecasting time series with multiple seasonal patterns, European Journal of Operational Research (2008)
- et al., Decision making with interval probabilities, European Journal of Operational Research (2010)
- et al., Balancing productivity and consumer satisfaction for profitability: statistical and fuzzy regression analysis, European Journal of Operational Research (2007)
- et al., A simple method for computation of fuzzy linear regression, European Journal of Operational Research (2005)
- et al., Least-squares estimates in fuzzy regression analysis, European Journal of Operational Research (2003)
- et al., Multiple criteria linear regression, European Journal of Operational Research (2007)
- Operations research and data mining, European Journal of Operational Research
- Fuzzy linear regression analysis for fuzzy input-output data, Information Sciences
- A strong consistent least-squares estimator in a linear fuzzy regression model with fuzzy parameters and fuzzy dependent variables, Fuzzy Sets and Systems
- Portfolio selections based on upper and lower exponential possibility distributions, European Journal of Operational Research
- Insight of a fuzzy regression model, Fuzzy Sets and Systems
- Incident detection algorithm based on partial least squares regression, Transportation Research Part C: Emerging Technologies
- Decision-making in testing process performance with fuzzy data, European Journal of Operational Research
- Fuzzy least-squares linear regression analysis for fuzzy input-output data, Fuzzy Sets and Systems
- Fuzzy regression based on asymmetric support vector machines, Applied Mathematics and Computation
- Support vector regression for real-time flood stage forecasting, Journal of Hydrology
Cited by (28)
Fuzzy regression analysis: Systematic review and bibliography
2019, Applied Soft Computing Journal
Citation Excerpt: Moreover, Huang [96] introduces a reduced support vector machine for interval regression analysis, which reduces the number of support vectors via random selection of sample subsets. Beyond evolutionary algorithms, neural networks and support vector machines, there are further machine learning approaches that have been utilized to improve fuzzy regression analysis: Ramli et al. [219, 220, 221] propose an efficient real-time (switching) fuzzy regression approach using convex hull designed via a beneath-beyond algorithm. Zuo et al. [222] present a fuzzy regression transfer learning method using fuzzy rules by developing a Takagi–Sugeno fuzzy regression model to transfer knowledge from a source domain to a target domain.
Possibility Grades with Vagueness in Fuzzy Regression Models
2017, Procedia Computer Science
FLAS: Fuzzy lung allocation system for US-based transplantations
2016, European Journal of Operational Research
Citation Excerpt: The final 10-fold cross-validated values were found by averaging the resulted values of each performance measure over these ten rounds. The coefficient of determination (R²) refers to one of the most important statistical measures to explore dependencies among system variables (Ramli, Watada, & Pedrycz, 2011) and is hence deemed to be a powerful metric for prediction. At this point, it is worth to first choose between linear and nonlinear regression (Kao & Chyu, 2003) and based on the results our proposed FLAS could be judged.
The normalized interval regression model with outlier detection and its real-world application to house pricing problems
2015, Fuzzy Sets and Systems
Citation Excerpt: The method was extended to the non-symmetrical case [6] and utilized in the CAPM beta estimation problem [17]. Further, the possibilistic regression model has been extended to the real time fuzzy regression analysis [24] and fuzzy autocorrelation models [37]. To deal with the hybrid uncertain data, the confidence–interval-based fuzzy random regression model (CI-FRRM) was introduced [38].