Stochastics and Statistics
Partially adaptive robust estimation of regression models and applications

https://doi.org/10.1016/j.ejor.2004.06.008

Abstract

This paper provides an accessible exposition of recently developed partially adaptive estimation methods and their application. These methods are robust to thick-tailed or asymmetric error distributions and should be of interest to researchers and practitioners in data mining, agent learning, and mathematical modeling in a wide range of disciplines. In particular, partially adaptive estimation methods can serve as robust alternatives to ordinary regression analysis, as well as machine learning methods developed by the artificial intelligence and computing communities.

Results from the analysis of three problem domains demonstrate the application of the theory.

Introduction

During the past decade, there has been an explosion in computation and information technology, which has brought with it vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and has spawned new areas such as data mining and machine learning. Many of the developments in statistics are termed statistical learning methods, which have their roots in, or are refinements of, regression analysis. Regression analysis has, of course, been a mainstay in data analysis for the past 30 years and remains one of its most important tools (Hastie et al., 2001).

Regression analysis is routinely used by researchers in many disciplines to fit mathematical models to observed data. The traditional estimation technique of least squares is efficient if the error terms are independent of the regressors and are independently and identically normally distributed. While the unobserved random disturbances in a regression model are often assumed to be normally distributed, real data are often replete with outliers that lie far from the pattern evidenced by the majority of the data. While some outliers may result from measurement inaccuracies or human recording error, many are generated by genuinely thick-tailed or asymmetric error distributions. In such cases, discarding outliers is inappropriate, since they are representative of the true data-generating process (Boyer et al., 2003).

The purpose of this paper is to provide an accessible exposition of recently developed partially adaptive estimation methods and their application. These methods are robust to thick-tailed or asymmetric error distributions and should be of interest to researchers and practitioners in data mining, agent learning, and mathematical modeling in a wide range of disciplines. In particular, partially adaptive estimation methods can serve as robust alternatives to ordinary regression analysis, as well as machine learning methods developed by the artificial intelligence and computing communities.

The paper proceeds as follows: Section 2 outlines important motivational details. Section 3 provides the fundamentals of partially adaptive estimators. Section 4 illustrates application of those methods to three data sets, two of which closely approximate normality conditions and one of which does not. This is intended to illustrate the value of partially adaptive estimators. Section 5 offers a summary and some conclusions.


Motivating issues

The simple linear regression model can be expressed as

Y_t = X_t β + u_t,

where β denotes a K × 1 vector of unknown coefficients corresponding to the 1 × K vector of observations on the explanatory variables (X_t) and u_t denotes the unobserved random disturbance or error term. The success of regression analysis is founded on its ease of application and on a guarantee of estimator optimality if certain assumptions are met by the data being used. Primarily these assumptions are conditioned on the properties of
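As a concrete baseline, the model and its least-squares fit can be illustrated with a short simulation: under the classical assumptions (exogenous regressors, i.i.d. normal errors), ordinary least squares recovers β. This is a minimal sketch in Python, not code from the paper; all variable names and parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 200, 3  # sample size and number of regressors

# T x K design matrix with an intercept column, and a K x 1 true coefficient vector
X = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])
beta_true = np.array([1.0, 2.0, -0.5])

# Y_t = X_t beta + u_t with i.i.d. normal errors
y = X @ beta_true + rng.normal(scale=0.1, size=T)

# least-squares estimate: beta_hat minimizes ||y - X b||^2
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Under these textbook conditions the least-squares estimate lies close to beta_true; the point of the paper is that this optimality degrades once the error distribution departs from normality.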

Partially adaptive estimation

Partially adaptive estimation and the study of departures from normally distributed errors are not new. Zeckhauser and Thompson (1970) investigated the impact of such anomalies utilizing the generalized error distribution (GED) introduced by Subbotin (1923) and popularized by Box and Tiao (1962), which is defined by the pdf

GED(u; s, p) = p e^{−(|u|/s)^p} / (2 s Γ(1/p)),   −∞ < u < ∞,

where s is a positive scale parameter and p is a positive shape parameter. Johnson and Kotz (1970) reviewed the
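A partially adaptive estimator based on this density can be sketched by maximizing the GED log-likelihood of the regression residuals jointly over the coefficients β, the scale s, and the shape p (p = 2 corresponds to normal errors, p = 1 to Laplace). The following is our own minimal Python/SciPy sketch, not the authors' MATLAB implementation; the function names, starting values, and optimizer settings are all illustrative:

```python
import numpy as np
from math import gamma
from scipy.optimize import minimize

def ged_logpdf(u, s, p):
    # log GED density: log p - (|u|/s)^p - log(2 s Gamma(1/p))
    return np.log(p) - (np.abs(u) / s) ** p - np.log(2 * s * gamma(1.0 / p))

def fit_ged_regression(y, X):
    """Partially adaptive estimate: maximize the GED log-likelihood of the
    residuals jointly over the coefficients beta, scale s, and shape p."""
    K = X.shape[1]

    def neg_loglik(theta):
        # parameterize s and p through logs to keep them positive
        beta, s, p = theta[:K], np.exp(theta[K]), np.exp(theta[K + 1])
        return -np.sum(ged_logpdf(y - X @ beta, s, p))

    # start from OLS with p = 2 (the normal-errors special case)
    beta0, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta0
    theta0 = np.concatenate([beta0, [np.log(resid.std()), np.log(2.0)]])
    res = minimize(neg_loglik, theta0, method="Nelder-Mead",
                   options={"maxiter": 5000, "maxfev": 5000,
                            "xatol": 1e-8, "fatol": 1e-8})
    return res.x[:K], np.exp(res.x[K]), np.exp(res.x[K + 1])
```

Because p is estimated from the data, the fitted criterion adapts between least-squares-like behavior (p near 2) and more outlier-resistant behavior (p near 1), which is the sense in which the estimator is partially adaptive.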

Empirical examples

Butler et al. (1990) considered market models in a similar study using partially adaptive estimators for symmetrically distributed error distributions. This study addressed simple models having a single independent variable. In this section we consider three complex regression applications which have appeared in the literature.

The estimation for this paper was performed using MATLAB. Other programs, such as STATA, SAS, and Shazam, which allow the user to solve the non-linear optimization

Summary and conclusions

Technology now allows us to capture and store vast quantities of data. Finding and summarizing the patterns, trends, and anomalies in these data sets is one of the grand challenges of the information age. While machine learning techniques, such as neural networks and decision trees, are seeing important applications, regression analysis continues to play a central role in data analysis and modeling.

It is well known that ordinary least squares estimates can be very sensitive to departures from

References

  • D. Harrison et al., Hedonic prices and the demand for clean air, Journal of Environmental Economics and Management (1978)
  • J.B. McDonald et al., A generalization of the beta distribution with applications, Journal of Econometrics (1995)
  • W.K. Newey, Adaptive estimation of regression models via moment restrictions, Journal of Econometrics (1988)
  • D.A. Belsley et al., Regression Diagnostics (1980)
  • G.E.P. Box et al., A further look at robustness via Bayes' theorem, Biometrika (1962)
  • B.H. Boyer et al., A comparison of partially adaptive and reweighted least squares estimation, Econometric Reviews (2003)
  • R.J. Butler et al., Robust and partially adaptive estimation of regression models, Review of Economics and Statistics (1990)
  • F.R. Hampel, The influence curve and its role in robust estimation, Journal of the American Statistical Association (1974)
  • F.R. Hampel et al., Robust Statistics: The Approach Based on Influence Functions (1986)
  • T. Hastie et al., The Elements of Statistical Learning (2001)
  • P.J. Huber, Robust Statistics (1981)