Stochastics and Statistics
Partially adaptive robust estimation of regression models and applications

https://doi.org/10.1016/j.ejor.2004.06.008

Abstract

This paper provides an accessible exposition of recently developed partially adaptive estimation methods and their application. These methods are robust to thick-tailed or asymmetric error distributions and should be of interest to researchers and practitioners in data mining, agent learning, and mathematical modeling in a wide range of disciplines. In particular, partially adaptive estimation methods can serve as robust alternatives to ordinary regression analysis, as well as machine learning methods developed by the artificial intelligence and computing communities.

Results from the analysis of three problem domains demonstrate the application of the theory.

Introduction

During the past decade, there has been an explosion in computation and information technology, which has brought with it vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and has spawned new areas such as data mining and machine learning. Many of the developments in statistics are termed statistical learning methods, which have their roots in, or are refinements of, regression analysis. Regression analysis has, of course, been a mainstay in data analysis for the past 30 years and remains one of its most important tools (Hastie et al., 2001).

Regression analysis is routinely used by researchers in many disciplines to fit mathematical models to observed data. The traditional estimation technique of least squares is efficient if the error terms are independent of the regressors and are independently and identically normally distributed. While the unobserved random disturbances in a regression model are often assumed to be normally distributed, real data are often replete with outliers that lie far from the pattern evidenced by the majority of the data. While some outliers may result from measurement inaccuracies or human recording error, many are generated by genuinely thick-tailed or asymmetric error distributions. In such cases, discarding outliers is inappropriate, since they are representative of the true data-generating process (Boyer et al., 2003).

The purpose of this paper is to provide an accessible exposition of recently developed partially adaptive estimation methods and their application. These methods are robust to thick-tailed or asymmetric error distributions and should be of interest to researchers and practitioners in data mining, agent learning, and mathematical modeling in a wide range of disciplines. In particular, partially adaptive estimation methods can serve as robust alternatives to ordinary regression analysis, as well as machine learning methods developed by the artificial intelligence and computing communities.

The paper proceeds as follows: Section 2 outlines important motivational details. Section 3 provides the fundamentals of partially adaptive estimators. Section 4 illustrates application of those methods to three data sets, two of which closely approximate normality conditions and one of which does not. This is intended to illustrate the value of partially adaptive estimators. Section 5 offers a summary and some conclusions.


Motivating issues

The simple linear regression model can be expressed as

Y_t = X_t β + u_t,

where β denotes a K × 1 vector of unknown coefficients corresponding to the 1 × K vector of observations on the explanatory variables (X_t) and u_t denotes the unobserved random disturbance or error term. The success of regression analysis is founded on its ease of application and on a guarantee of estimator optimality if certain assumptions are met by the data being used. Primarily these assumptions are conditioned on the properties of
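As a concrete baseline, the model and its least-squares fit can be illustrated with a short simulation: under the classical assumptions (exogenous regressors, i.i.d. normal errors), ordinary least squares recovers β. This is a minimal sketch in Python, not code from the paper; all variable names and parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 200, 3  # sample size and number of regressors

# T x K design matrix with an intercept column, and a K x 1 true coefficient vector
X = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])
beta_true = np.array([1.0, 2.0, -0.5])

# Y_t = X_t beta + u_t with i.i.d. normal errors
y = X @ beta_true + rng.normal(scale=0.1, size=T)

# least-squares estimate: beta_hat minimizes ||y - X b||^2
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Under these textbook conditions the least-squares estimate lies close to beta_true; the point of the paper is that this optimality degrades once the error distribution departs from normality.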

Partially adaptive estimation

Partially adaptive estimation and the study of departures from normally distributed errors are not new. Zeckhauser and Thompson (1970) investigated the impact of such anomalies utilizing the generalized error distribution (GED) introduced by Subbotin (1923) and popularized by Box and Tiao (1962), which is defined by the pdf

GED(u; s, p) = p e^{−(|u|/s)^p} / (2 s Γ(1/p)),   −∞ < u < ∞,

where s is a positive scale parameter and p is a positive shape parameter. Johnson and Kotz (1970) reviewed the
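A partially adaptive estimator based on this density can be sketched by maximizing the GED log-likelihood of the regression residuals jointly over the coefficients β, the scale s, and the shape p (p = 2 corresponds to normal errors, p = 1 to Laplace). The following is our own minimal Python/SciPy sketch, not the authors' MATLAB implementation; the function names, starting values, and optimizer settings are all illustrative:

```python
import numpy as np
from math import gamma
from scipy.optimize import minimize

def ged_logpdf(u, s, p):
    # log GED density: log p - (|u|/s)^p - log(2 s Gamma(1/p))
    return np.log(p) - (np.abs(u) / s) ** p - np.log(2 * s * gamma(1.0 / p))

def fit_ged_regression(y, X):
    """Partially adaptive estimate: maximize the GED log-likelihood of the
    residuals jointly over the coefficients beta, scale s, and shape p."""
    K = X.shape[1]

    def neg_loglik(theta):
        # parameterize s and p through logs to keep them positive
        beta, s, p = theta[:K], np.exp(theta[K]), np.exp(theta[K + 1])
        return -np.sum(ged_logpdf(y - X @ beta, s, p))

    # start from OLS with p = 2 (the normal-errors special case)
    beta0, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta0
    theta0 = np.concatenate([beta0, [np.log(resid.std()), np.log(2.0)]])
    res = minimize(neg_loglik, theta0, method="Nelder-Mead",
                   options={"maxiter": 5000, "maxfev": 5000,
                            "xatol": 1e-8, "fatol": 1e-8})
    return res.x[:K], np.exp(res.x[K]), np.exp(res.x[K + 1])
```

Because p is estimated from the data, the fitted criterion adapts between least-squares-like behavior (p near 2) and more outlier-resistant behavior (p near 1), which is the sense in which the estimator is partially adaptive.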

Empirical examples

Butler et al. (1990) considered market models in a similar study using partially adaptive estimators for symmetrically distributed error distributions. This study addressed simple models having a single independent variable. In this section we consider three complex regression applications which have appeared in the literature.

The estimation for this paper was performed using MATLAB. Other programs, such as STATA, SAS, and Shazam, which allow the user to solve the non-linear optimization

Summary and conclusions

Technology now allows us to capture and store vast quantities of data. Finding and summarizing the patterns, trends, and anomalies in these data sets is one of the grand challenges of the information age. While machine learning techniques, such as neural networks and decision trees, are seeing important applications, regression analysis continues to play a central role in data analysis and modeling.

It is well known that ordinary least squares estimates can be very sensitive to departures from

References

  • D. Harrison et al., Hedonic prices and the demand for clean air, Journal of Environmental Economics and Management (1978)
  • J.B. McDonald et al., A generalization of the beta distribution with applications, Journal of Econometrics (1995)
  • W.K. Newey, Adaptive estimation of regression models via moment restrictions, Journal of Econometrics (1988)
  • D.A. Belsley et al., Regression Diagnostics (1980)
  • G.E.P. Box et al., A further look at robustness via Bayes' theorem, Biometrika (1962)
  • B.H. Boyer et al., A comparison of partially adaptive and reweighted least squares estimation, Econometric Reviews (2003)
  • R.J. Butler et al., Robust and partially adaptive estimation of regression models, Review of Economics and Statistics (1990)
  • F.R. Hampel, The influence curve and its role in robust estimation, Journal of the American Statistical Association (1974)
  • F.R. Hampel et al., Robust Statistics: The Approach Based on Influence Functions (1986)
  • T. Hastie et al., The Elements of Statistical Learning (2001)
  • P.J. Huber, Robust Statistics (1981)