
Multicollinearity in cross-sectional regressions

  • Original Article
  • Journal of Geographical Systems

Abstract

The paper examines the robustness of results from cross-sectional regressions, paying particular attention to the impact of multicollinearity. It is well known that the reliability of estimators (least-squares or maximum-likelihood) deteriorates as the linear relationships between the regressors become more acute. We take up this discussion in a spatial context, looking closely at how the most prominent misspecification tests behave, under several unfavourable conditions, when collinear variables are added to the regression. A Monte Carlo simulation is performed. The conclusions indicate that these statistics react in different ways to the problems posed.



References

  • Anselin L, Florax RJGM (1995) Small sample properties of tests for spatial dependence in regression models. In: Anselin L, Florax R (eds) New directions in spatial econometrics. Springer, Berlin Heidelberg New York, pp 21–74

  • Belsley D, Kuh E, Welsch R (1980) Regression diagnostics: identifying influential data and sources of collinearity. Wiley, New York

  • Chatterjee S, Hadi A (1988) Sensitivity analysis in linear regression. Wiley, New York

  • Draper N, Van Nostrand R (1979) Ridge regression and James–Stein estimation: review and comments. Technometrics 21:451–465

  • Farrar D, Glauber R (1967) Multicollinearity in regression analysis: the problem revisited. Rev Econ Stat 49:92–107

  • Florax RJGM, de Graaff T (2004) The performance of diagnostic tests for spatial dependence in linear regression models: a meta-analysis of simulation studies. In: Anselin L, Florax R, Rey S (eds) Advances in spatial econometrics: methodology, tools and applications. Springer, Berlin Heidelberg New York, pp 29–66

  • Greene W (2003) Econometric analysis, 5th edn. Prentice Hall, New Jersey

  • Hocking R (1983) Developments in linear regression methodology: 1959–1982. Technometrics 25:219–230

  • King B (1969) Comments on “Factor analysis and regression”. Econometrica 37:538–540

  • Kosfeld R, Lauridsen J (2006) Factor analysis regression. Working paper, Department of Economics, University of Kassel

  • Kumar T (1975) Multicollinearity in regression analysis. Rev Econ Stat 57:365–366

  • Lauridsen J, Mur J (2006) Multicollinearity and outliers in cross-sectional analysis. Working paper, Department of Business and Economics, University of Southern Denmark

  • Mur J, Lauridsen J (2006) Outliers and spatial dependence in cross-sectional analysis. Environment and Planning A (forthcoming)

  • O’Hagan J, McCabe B (1975) Tests for the severity of multicollinearity in regression analysis: a comment. Rev Econ Stat 57:368–370

  • Scott J (1966) Factor analysis and regression. Econometrica 34:552–562

  • Scott J (1969) Factor analysis and regression revisited. Econometrica 37:719

  • Wichers C (1975) The detection of multicollinearity: a comment. Rev Econ Stat 57:366–368


Acknowledgments

This work has been carried out with the financial support of project SEC 2002-02350 of the Spanish Ministerio de Educación. The authors also wish to thank Ana Angulo for her invaluable and selfless collaboration.

Author information

Correspondence to Jørgen Lauridsen.

Appendix: Misspecification tests used

The tests always refer to the model of the null hypothesis, that is, a static model of the form \(y = X\beta + u\). This model has been estimated by LS, where \(\hat{\sigma}^{2}\) and \(\hat{\beta}\) are the corresponding LS estimates and \(\hat{u}\) is the residual series. The tests are the following (see Anselin and Florax 1995, or Florax and de Graaff 2004, for details):

Moran’s I test:

$$ I = \frac{R}{{S_{0}}}\frac{{\hat{u}'W\hat{u}}}{{\hat{u}'\hat{u}}}; \quad S_{0} = {\sum\limits_{r = 1}^R {{\sum\limits_{s = 1}^R {w_{{rs}}}}}} $$
(18)
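Eq. (18) is direct to compute once the LS residuals and the weight matrix are in hand. The sketch below is a minimal numpy illustration; the 4-region line lattice and the residual vector are made-up data, not taken from the paper:

```python
import numpy as np

def morans_i(u, W):
    """Moran's I of eq. (18): (R / S0) * (u'Wu) / (u'u)."""
    R = len(u)
    S0 = W.sum()                         # S0 = sum of all weights w_rs
    return (R / S0) * (u @ W @ u) / (u @ u)

# Illustrative data: 4 regions on a line, binary rook contiguity
W = np.zeros((4, 4))
for i in range(3):
    W[i, i + 1] = W[i + 1, i] = 1.0
u = np.array([1.0, 2.0, 3.0, 4.0])       # stand-in for an LS residual series
I = morans_i(u, W)                        # (4/6) * (40/30) = 8/9
```

By hand, \(\hat{u}'W\hat{u} = 40\), \(\hat{u}'\hat{u} = 30\) and \(S_0 = 6\), so the statistic equals 8/9 here.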

LM-ERR test:

$$ \text{LM-ERR} = \frac{1}{{T_{1}}}\left(\frac{{\hat{u}'W\hat{u}}}{{\hat{\sigma}^{2}}}\right)^{2}; \quad T_{1} = {\rm tr}(W'W + W^{2}) $$
(19)

LM-EL test:

$$ \text{LM-EL} = \frac{{\left(\frac{{\hat{u}'W\hat{u}}}{{\hat{\sigma}^{2}}} - \frac{{T_{1}}}{{R_{j}}}\frac{{\hat{u}'Wy}}{{\hat{\sigma}^{2}}}\right)^{2}}}{{T_{1} - \frac{{T^{2}_{1}}}{{R_{j}}}}} $$
(20)

KR test:

$$ {\rm KR} = h_{R} \frac{{\hat{\gamma}'Z'Z\hat{\gamma}}}{{\hat{e}'\hat{e}}} $$
(21)
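A sketch of how eq. (21) might be assembled is given below. The design of the auxiliary regression is an assumption here: for each contiguous pair (r, s) the dependent variable is taken as the residual cross-product \(\hat{u}_r\hat{u}_s\) and Z collects the element-wise cross-products of the regressor rows, following Kelejian and Robinson's original proposal; the paper's exact auxiliary design may differ.

```python
import numpy as np

def kr_test(u, X, W):
    """Sketch of the KR statistic of eq. (21), h_R * (g'Z'Zg) / (e'e).

    Assumed auxiliary design: for each contiguous pair (r, s) the
    dependent variable is u[r] * u[s] and Z holds the element-wise
    cross-products of the regressor rows X[r] * X[s]."""
    R = len(u)
    pairs = [(r, s) for r in range(R) for s in range(r + 1, R) if W[r, s] != 0]
    h_R = len(pairs)
    c = np.array([u[r] * u[s] for r, s in pairs])   # h_R x 1 dependent variable
    Z = np.array([X[r] * X[s] for r, s in pairs])   # h_R x m auxiliary regressors
    gamma, *_ = np.linalg.lstsq(Z, c, rcond=None)   # auxiliary LS coefficients
    e = c - Z @ gamma                               # auxiliary residuals
    return h_R * (gamma @ Z.T @ Z @ gamma) / (e @ e)

# Illustrative data: LS residuals on a 20-region line-graph lattice
rng = np.random.default_rng(0)
R = 20
W = np.zeros((R, R))
for i in range(R - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
X = np.column_stack([np.ones(R), rng.normal(size=R)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=R)
u = y - X @ np.linalg.solve(X.T @ X, X.T @ y)       # LS residuals
kr = kr_test(u, X, W)                               # compare with a chi2(m), m = 2
```

The resulting statistic is then referred to a \(\chi^{2}(m)\) distribution, with m the number of columns of Z.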

LM-LAG test:

$$ \text{LM-LAG} = \frac{1}{{R_{j}}}\left(\frac{{\hat{u}'Wy}}{{\hat{\sigma}^{2}}}\right )^{2}$$
(22)

LM-LE test:

$$ \text{LM-LE} = \frac{{\left(\frac{{\hat{u}'Wy}}{{\hat{\sigma}^{2}}} - \frac{{\hat{u}'W\hat{u}}}{{\hat{\sigma}^{2}}}\right)^{2}}}{{R_{j} - T_{1}}} $$
(23)

SARMA test:

$$ \text{SARMA}= \frac{{\left(\frac{{\hat{u}'Wy}}{{\hat{\sigma}^{2}}} - \frac{{\hat{u}'W\hat{u}}}{{\hat{\sigma}^{2}}}\right)^{2}}}{{R_{j} - T_{1}}} + \frac{1}{{T_{1}}}\left(\frac{{\hat{u}'W\hat{u}}}{{\hat{\sigma}^{2}}}\right)^{2} $$
(24)

Moreover, \(R_{j} = T_{1} + \frac{{\hat{\beta}'X'W'MWX\hat{\beta}}}{{\hat{\sigma}^{2}}}\) and \(M = I - X(X'X)^{-1}X'\). Furthermore, \(\hat{e}\) is the vector of residuals from the auxiliary regression of the Kelejian–Robinson (KR) test, of order \(h_{R} \times 1\), Z is the matrix of exogenous variables included in that regression and \(\hat{\gamma}\) the vector of estimated coefficients.
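Putting eqs. (19), (20) and (22)–(24) together, the whole LM family can be assembled from a single LS fit, since every statistic is built from the two scores \(\hat{u}'W\hat{u}/\hat{\sigma}^{2}\) and \(\hat{u}'Wy/\hat{\sigma}^{2}\) plus the scalars \(T_1\) and \(R_j\). The numpy sketch below (with illustrative data, not the paper's experiment) also makes explicit that SARMA in eq. (24) is the sum of LM-LE and LM-ERR:

```python
import numpy as np

def lm_statistics(y, X, W):
    """The LM family of eqs. (19), (20), (22)-(24), from one LS fit of y = X*beta + u."""
    R, k = X.shape
    XtX_inv_Xt = np.linalg.solve(X.T @ X, X.T)
    beta = XtX_inv_Xt @ y
    u = y - X @ beta                        # LS residuals
    sig2 = (u @ u) / R                      # ML estimate of sigma^2
    T1 = np.trace(W.T @ W + W @ W)          # T1 = tr(W'W + W^2)
    M = np.eye(R) - X @ XtX_inv_Xt          # M = I - X(X'X)^{-1}X'
    WXb = W @ X @ beta
    Rj = T1 + (WXb @ M @ WXb) / sig2        # R_j = T1 + b'X'W'MWXb / sigma^2
    a = (u @ W @ u) / sig2                  # error score u'Wu / sigma^2
    b = (u @ W @ y) / sig2                  # lag score  u'Wy / sigma^2
    return {
        "LM-ERR": a ** 2 / T1,
        "LM-EL": (a - (T1 / Rj) * b) ** 2 / (T1 - T1 ** 2 / Rj),
        "LM-LAG": b ** 2 / Rj,
        "LM-LE": (b - a) ** 2 / (Rj - T1),
        "SARMA": (b - a) ** 2 / (Rj - T1) + a ** 2 / T1,
    }

# Illustrative data: 25 regions on a line-graph lattice
rng = np.random.default_rng(1)
R = 25
W = np.zeros((R, R))
for i in range(R - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
X = np.column_stack([np.ones(R), rng.normal(size=R)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=R)
stats = lm_statistics(y, X, W)
```

All five statistics are non-negative by construction, since each is a square divided by a positive scalar (note \(R_j > T_1\) because the quadratic form in M is non-negative).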

As is well known, the asymptotic distribution of the standardised Moran’s I, obtained as \(\frac{{I - E(I)}}{{{\sqrt {V(I)}}}}\), with \(E(I) = \frac{R}{{S_{0} (R - k)}}{\rm tr}(MW)\) and

$$ V(I) = \left(\frac{R}{{S_{0}}}\right)^{2} \frac{{{\rm tr}(MWMW') + {\rm tr}(MWMW) + {\rm tr}^{2} (MW)}}{{(R - k)(R - k + 2)}} - \{E(I)\}^{2}, $$

is an N(0,1). The two Lagrange Multipliers that follow, LM-ERR and LM-EL, have an asymptotic \(\chi^{2}(1)\) distribution; the KR test follows a \(\chi^{2}(m)\), with m being the number of regressors included in the auxiliary regression. The three final tests also have chi-square distributions, with one degree of freedom for the first two and two degrees of freedom for the SARMA test.
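The moments E(I) and V(I) of the residual-based Moran's I can be evaluated directly from traces of MW. A minimal sketch follows, using the Cliff–Ord moments; in the classical intercept-only case it recovers the familiar \(E(I) = -1/(R-1)\) (the lattice is illustrative, not from the paper):

```python
import numpy as np

def moran_moments(X, W):
    """E(I) and V(I) of residual-based Moran's I under the null (Cliff-Ord moments)."""
    R, k = X.shape
    S0 = W.sum()
    M = np.eye(R) - X @ np.linalg.solve(X.T @ X, X.T)   # residual-maker matrix
    MW = M @ W
    EI = (R / (S0 * (R - k))) * np.trace(MW)
    EI2 = ((R / S0) ** 2
           * (np.trace(MW @ M @ W.T) + np.trace(MW @ MW) + np.trace(MW) ** 2)
           / ((R - k) * (R - k + 2)))                   # E(I^2)
    return EI, EI2 - EI ** 2                            # V(I) = E(I^2) - E(I)^2

# Intercept-only design on a 10-region line lattice: E(I) = -1/(R - 1)
R = 10
W = np.zeros((R, R))
for i in range(R - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
EI, VI = moran_moments(np.ones((R, 1)), W)
```

The standardised statistic \((I - E(I))/\sqrt{V(I)}\) is then referred to the N(0,1) distribution.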


About this article

Cite this article

Lauridsen, J., Mur, J. Multicollinearity in cross-sectional regressions. J Geograph Syst 8, 317–333 (2006). https://doi.org/10.1007/s10109-006-0031-z
