A test for bivariate normality with applications in microeconometric models


Abstract

In this paper, we propose a test for bivariate normality in imperfectly observed models, based on the information matrix test for censored models with bootstrap critical values. In order to evaluate its properties, we run a comprehensive Monte Carlo experiment, in which we use the bivariate probit model and Heckman sample selection model as examples. We find that, while asymptotic critical values can be seriously misleading, the use of bootstrap critical values results in a test that has excellent size and power properties even in small samples. Since this procedure is relatively inexpensive from a computational viewpoint and is easy to generalise to models with arbitrary censoring schemes, we recommend it as an important and valuable testing tool.
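
The bootstrap scheme referred to here is the standard parametric one: estimate the model under the null, redraw artificial samples from the fitted null model, and recompute the statistic on each. A minimal sketch in Python, with hypothetical placeholders fit_fn, stat_fn and simulate_fn standing in for the model-specific routines (none of which appear in the paper):

```python
import numpy as np

def bootstrap_pvalue(data, fit_fn, stat_fn, simulate_fn, B=399, seed=0):
    """Parametric-bootstrap p-value for a specification test statistic.

    fit_fn(data)             -> ML estimate under the null (model-specific)
    stat_fn(data, theta)     -> value of the test statistic
    simulate_fn(theta, rng)  -> artificial sample drawn from the fitted null
    All three callables are hypothetical placeholders.
    """
    rng = np.random.default_rng(seed)
    theta_hat = fit_fn(data)
    tau = stat_fn(data, theta_hat)
    tau_boot = np.array([
        stat_fn(s, fit_fn(s))
        for s in (simulate_fn(theta_hat, rng) for _ in range(B))
    ])
    # B = 399 makes alpha * (B + 1) an integer at conventional levels
    # (see Davidson and MacKinnon 2000)
    return np.mean(tau_boot >= tau)
```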


Notes

  1. See, among others, Li and Racine (2006).

  2. CM tests were advocated, in a closely related context, by Pagan and Vella (1989).

  3. See, for example, Davidson and MacKinnon (2001).

  4. For the latter, this result is also confirmed by Davidson and MacKinnon (1998).

  5. Numerical differentiation is clearly a possible, but computationally more demanding, alternative.

  6. On the other hand, Smith’s derivation of the score and Hessian matrix elements as functions of the \(\hbox {GEP}(r,s)\) allows us to immediately recognise the moment conditions tested in the OPG regression. Such expressions are not reported in this paper. Section 7 contains a brief exposition of GEPs and their relationship to the derivatives of the log-likelihood.

  7. This was done by setting the intercept parameter in Eq. (5) to a suitable value. See Tables 5 and 6 for details.

  8. This, of course, rules out more extreme choices such as the Cauchy or \(\alpha \)-stable distributions.

  9. The presence of such moment conditions is not easy to spot inside the expressions given in Sects. 4.1 and 4.2. However, it becomes obvious when rewriting those expressions using Smith’s GEP formula.

  10. Labelling these tests as “skewness” and “kurtosis” tests would be tempting but misleading, since the corresponding sets also contain cross-moment conditions which are not readily interpretable in terms of the shape of the joint density.

  11. Tables 5 and 6 for the sample selection models also include the “all-moment” variant of the test to let the reader compare it with the preferred one.

  12. For example, in a single-equation context, Skeels and Vella (1999) find that the Chesher-Irish normality test has better power for the Tobit model than for the probit model.

  13. See Robinson (1982) and Smith (1989).

  14. Of course, detailed results are, as usual, available on request.

  15. We used the Doornik-Hansen test for the main equation (Doornik and Hansen 2008) and the Chesher-Irish test for the selection equation (Chesher and Irish 1987).

  16. This aspect is not considered in the original contribution of Martins (2001).

  17. Smith’s original paper is more general than the presentation here, as it also covers the simultaneous-equations case; that generality is not needed in this paper, so we skip the resulting complications.

  18. A rigorous proof is given in Gourieroux et al. (1984).

  19. More formally: we are assuming that the spaces spanned by the two sets of regressors have no elements in common. Note that this excludes the presence of constant terms in both equations.

References

  • Bera AK, Jarque CM, Lee L-F (1984) Testing the normality assumption in limited dependent variable models. Int Econ Rev 25(3):563–578

  • Bowman KO, Shenton LR (1975) Omnibus test contours for departures from normality based on \(\surd {b_1}\) and \(b_2\). Biometrika 62(2):243–250

  • Chesher A (1983) The information matrix test: simplified calculation via a score test interpretation. Econ Lett 13(1):45–48

  • Chesher A, Irish M (1987) Residual analysis in the grouped and censored normal linear model. J Econ 34:33–61

  • Chesher A, Spady R (1991) Asymptotic expansions of the information matrix test statistic. Econometrica 59(3):787–815

  • Cox DR, Small NJH (1978) Testing multivariate normality. Biometrika 65(2):263–272

  • Cox DR, Wermuth N (1994) Tests of linearity, multivariate normality and the adequacy of linear scores. Appl Stat 43:347–355

  • Cragg JG (1971) Some statistical models for limited dependent variables with application to the demand for durable goods. Econometrica 39:829–844

  • Das M, Newey WK, Vella F (2003) Nonparametric estimation of sample selection models. Rev Econ Stud 70(1):33–58

  • Davidson R, MacKinnon JG (1992) A new form of the information matrix test. Econometrica 60(1):145–157

  • Davidson R, MacKinnon JG (1998) Graphical methods for investigating the size and power of hypothesis tests. Manch Sch 66(1):1–26

  • Davidson R, MacKinnon JG (1999a) Bootstrap testing in nonlinear models. Int Econ Rev 40(2):487–508

  • Davidson R, MacKinnon JG (1999b) The size distortion of bootstrap tests. Econ Theory 15(3):361–376

  • Davidson R, MacKinnon JG (2000) Bootstrap tests: how many bootstraps? Econ Rev 19(1):55–68

  • Davidson R, MacKinnon JG (2001) Artificial regressions. In: Baltagi B (ed) A companion to theoretical econometrics. Blackwell, Oxford

  • Davidson R, MacKinnon JG (2006) The power of bootstrap and asymptotic tests. J Econ 133(2):421–441

  • Davidson R, MacKinnon JG (2007) Improving the reliability of bootstrap tests with the fast double bootstrap. Comput Stat Data Anal 51(7):3259–3281

  • Doornik JA, Hansen H (2008) An omnibus test for univariate and multivariate normality. Oxford Bull Econ Stat 70(s1):927–939

  • Gallant AR, Nychka DW (1987) Semi-nonparametric maximum likelihood estimation. Econometrica 55(2):363–390

  • Gourieroux C, Monfort A, Renault E, Trognon A (1984) Résidus généralisés ou interprétations linéaires de l’économétrie non linéaire. Document de travail 8410, INSEE

  • Hall A (1987) The information matrix test for the linear model. Rev Econ Stud 54(2):257–263

  • Heckman JJ (1974) Shadow prices, market wages, and labor supply. Econometrica 42(4):679–694

  • Horowitz JL (1994) Bootstrap-based critical values for the information matrix test. J Econ 61:365–411

  • Horowitz JL, Härdle W (1994) Testing a parametric model against a semiparametric alternative. Econ Theory 10:821–848

  • Kennan J, Neumann G (1988) Why does the information matrix test reject too often? A diagnosis of some Monte Carlo symptoms. Working Papers in Economics 88, Hoover Institution, Stanford University, Stanford, CA

  • Lancaster T (1984) The covariance matrix of the information matrix test. Econometrica 52:1051–1053

  • Lee L-F (1984) Tests for the bivariate normal distribution in econometric models with selectivity. Econometrica 52(4):843–863

  • Li Q, Racine JS (2006) Nonparametric econometrics: theory and practice. Princeton University Press, Princeton

  • Mardia KV (1970) Measures of multivariate skewness and kurtosis with applications. Biometrika 57(3):519–530

  • Martins MFO (2001) Parametric and semiparametric estimation of sample selection models: an empirical application to the female labour force in Portugal. J Appl Econ 16(1):23–39

  • Montes-Rojas GV (2011) Robust misspecification tests for the Heckman’s two-step estimator. Econ Rev 30(2):1–19

  • Murphy A (2007) Score tests of normality in bivariate probit models. Econ Lett 95(3):374–379

  • Newey WK (1985) Maximum likelihood specification testing and conditional moment tests. Econometrica 53:1047–1070

  • Newey WK (1999) Two-step series estimation of sample selection models. Working Paper 99-04, Department of Economics, Massachusetts Institute of Technology (MIT)

  • Orme C (1990) The small-sample performance of the information-matrix test. J Econ 46(3):309–331

  • Pagan A, Vella F (1989) Diagnostic tests for models based on individual data: a survey. J Appl Econ 4(S):S29–S59

  • Robinson PM (1982) On the asymptotic properties of estimators of models containing limited dependent variables. Econometrica 50(1):27–41

  • Shenton LR, Bowman KO (1977) A bivariate model for the distribution of \(\surd {b_1}\) and \(b_2\). J Am Stat Assoc 72:206–211

  • Skeels CL, Vella F (1999) A Monte Carlo investigation of the sampling behavior of conditional moment tests in tobit and probit models. J Econ 92(2):275–294

  • Smith RJ (1985) Some tests for misspecification in bivariate limited dependent variable models. Annales de l’INSEE 59/60:97–123

  • Smith RJ (1987) Testing the normality assumption in multivariate simultaneous limited dependent variable models. J Econ 34(1–2):105–123

  • Smith RJ (1989) On the use of distributional mis-specification checks in limited dependent variable models. Econ J 99(395):178–192

  • Tauchen GE (1985) Diagnostic testing and evaluation of maximum likelihood models. J Econ 30:415–443

  • Taylor LW (1987) The size bias of the information matrix test. Econ Lett 24:63–67

  • Van de Ven WPMM, Van Praag BMS (1981) The demand for deductibles in private health insurance: a probit model with sample selection. J Econ 17(2):229–252

  • van der Klaauw B, Koning RH (2003) Testing the normality assumption in the sample selection model with an application to travel demand. J Bus Econ Stat 21(1):31–42

  • White H (1982) Maximum likelihood estimation of misspecified models. Econometrica 50(1):1–25


Author information


Correspondence to Claudia Pigini.

Appendices

Appendix A: Information matrix tests in two-equation limited dependent variable models

Consider the latent variable model (Note 17):

$$\begin{aligned} y_1^{*}&= x_1'\beta _1+ v_1 \end{aligned}$$
(11)
$$\begin{aligned} y_2^{*}&= x_2'\beta _2+ v_2 \end{aligned}$$
(12)

where \(v_1\) and \(v_2\) have the following bivariate normal distribution:

$$\begin{aligned} \left( \begin{array}{c} v_{1} \\ v_{2} \end{array} \right) \sim N \left[ \left( \begin{array}{c} 0 \\ 0 \end{array} \right) ; \left( \begin{array}{cc} \omega _1^{2} &{} \omega _{12} \\ \omega _{12} &{} \omega _2^{2} \end{array} \right) \right] \end{aligned}$$
(13)

Here \(x_1\) and \(x_2\) are the vectors of exogenous variables in (11)–(12), and \(\beta _1\), \(\beta _2\) are the corresponding parameter vectors. Since \(v_1\) can be written as \(v_1 = \rho v_2 + u\), where \(\rho = \omega _{12}/\omega _2^{2}\), the model for \(y_1^{*} | y_2^{*}\) is:

$$\begin{aligned} y_1^{*}&= x_1'\beta _1 + \rho v_2 + u \end{aligned}$$
(14)
$$\begin{aligned} y_2^{*}&= x_2'\beta _2+ v_2 \end{aligned}$$
(15)

with \(u|x_1,v_2 \sim N(0, \omega _{11.2}^{2})\), where \(\omega _{11.2}^{2}= \omega _1^{2}-\omega _{12}^{2}/ \omega _2^{2}\).
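
A quick simulation (with illustrative parameter values, not taken from the paper) confirms the conditional variance formula:

```python
import numpy as np

# Illustrative parameter values for (13): omega_1^2, omega_2^2, omega_12
w1sq, w2sq, w12 = 2.0, 1.5, 0.8
rng = np.random.default_rng(1)
v = rng.multivariate_normal([0.0, 0.0], [[w1sq, w12], [w12, w2sq]],
                            size=1_000_000)

rho = w12 / w2sq                        # regression coefficient of v1 on v2
u = v[:, 0] - rho * v[:, 1]             # conditional error
print(u.var(), w1sq - w12**2 / w2sq)    # both close to 1.5733
```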

The log-likelihood for the latent variables \(\ell ^{*}\) can be split into conditional \(\ell _{12}^{*}\) and marginal \(\ell _{2}^{*}\) log-likelihoods so that:

$$\begin{aligned} \ell ^{*} (y_1^{*},y_2^{*};\theta ) = \ell _{12}^{*}(y_1^{*}|y_2^{*};\theta ) + \ell _2^{*}(y_2^{*};\theta _2) \end{aligned}$$
(16)

where \(\theta = (\theta _1', \theta _2')'\), \(\theta _1 = (\beta _1',\rho )'\) and \(\theta _2 = (\beta _2', \omega _2^2)'\). For an iid sample \((y_i, x_i)\), the observational rules for \(y_1^{*}\) and \(y_2^{*}\) are assumed to be independent of the parameters.

The following key results on the likelihood function (Gourieroux et al. 1984) are crucial to the presentation of the test statistic for a model subject to an arbitrary censoring scheme. The score and Hessian matrix elements for the observables can be derived quite easily from those for the unobservables (Note 18): the score as

$$\begin{aligned} \frac{\partial \ell }{\partial \theta } = E\left[ \frac{\partial \ell ^{*}}{\partial \theta } \Bigg | y\right] \end{aligned}$$
(17)

and the Hessian matrix as:

$$\begin{aligned} \frac{\partial ^2 \ell }{\partial \theta \partial \theta '} = E\left[ \frac{\partial ^2 \ell ^{*}}{\partial \theta \partial \theta '} \Bigg | y\right] +V\left[ \frac{\partial \ell ^{*}}{\partial \theta } \Bigg | y\right] \end{aligned}$$
(18)

These quantities can be shown to be functions of the generalised error product of order \((r,s)\) (\(\hbox {GEP}(r,s)\)) introduced by Smith (1985), defined as:

$$\begin{aligned} \overline{\varepsilon ^{r}\xi ^{s}} = E(\varepsilon ^{r}\xi ^{s}| y) - E(\varepsilon ^{r}\xi ^{s}) \end{aligned}$$
(19)

where \(\varepsilon = u/\omega _{11.2}\) and \(\xi = v_2/\omega _2\). \(E(\varepsilon ^{r}\xi ^{s}| y)\) is the expectation conditional on the censoring scheme, that is, on the relevant region of integration; some examples are given in Smith (1985, Sect. 1). The sample counterpart of the GEP(\(r,s\)) is the generalised residual product of order (\(r,s\)), GRP(\(r,s\)), which is the GEP(\(r,s\)) evaluated at \(\hat{\theta }_{ML}\), the ML estimator of model (14)–(15).
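
For intuition, consider the simplest censoring scheme, a probit-type rule in which only \(d = \mathbf {1}(\xi > -b)\) is observed. For an observation with \(d = 1\), the GEP(0, 1) reduces to the inverse Mills ratio, \(E(\xi \mid \xi > -b) = \phi (b)/\Phi (b)\), which the following sketch (illustrative value of \(b\), not from the paper) checks by simulation:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)
b = 0.7                              # illustrative index value
xi = rng.standard_normal(1_000_000)

# GEP(0,1) for an uncensored observation (xi > -b):
# E[xi | xi > -b] - E[xi], with E[xi] = 0
mc = xi[xi > -b].mean()
print(mc, norm.pdf(b) / norm.cdf(b))  # Monte Carlo vs inverse Mills ratio
```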

The IM test statistic is based on the moment conditions

$$\begin{aligned} C_i= \hbox {vech} \left[ \frac{\partial ^2 \ell _i}{\partial \theta \partial \theta '} + G_i G_i' \right] \end{aligned}$$

evaluated at \(\theta = \hat{\theta }_{ML}\). The contributions to the Hessian matrix and to the OPG are derived using (17) and (18) and are linear functions of the GRP(\(r,s\)) with \(r+s \le 4\). Expressions for model (14)–(15) are given in Smith (1985, Appendices 1 and 2), for both unobservables and observables.
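
In practice, the statistic can be computed by the artificial (OPG) regression of a column of ones on the score contributions and the indicator contributions; the statistic is \(n\) times the uncentered \(R^2\). A minimal sketch, assuming \(G\) and \(C\) have already been evaluated at \(\hat{\theta }_{ML}\) (the degrees of freedom follow the rank analysis of Appendix B; the sketch simply measures the additional rank contributed by the columns of \(C\)):

```python
import numpy as np

def opg_im_stat(G, C):
    """OPG (artificial regression) form of the IM statistic.

    G : (n, k) score contributions; C : (n, m) columns of
    vech[H_i + G_i G_i'], both evaluated at the MLE.
    Returns n * uncentered R^2 from regressing ones on [G, C],
    plus the rank contributed by C beyond the score columns.
    """
    n = G.shape[0]
    W = np.column_stack([G, C])
    iota = np.ones(n)
    # least squares copes with dropped (collinear) columns
    beta, *_ = np.linalg.lstsq(W, iota, rcond=None)
    ssr = np.sum((iota - W @ beta) ** 2)
    stat = n - ssr   # n * R^2, since the total sum of squares is n
    df = np.linalg.matrix_rank(W) - np.linalg.matrix_rank(G)
    return stat, df
```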

Appendix B: Rank analysis

B.1 Rank analysis for the sample selection model

As mentioned in Sect. 3.1, in the case of the sample selection model we are only able to prove that the upper bound for \(df\) equals \(k(k+1)/2 - 1\), since the last moment condition \(C_i^{\psi ,\psi }\) is always dropped in the OPG regression. This happens when the two equations have different sets of regressors, neither including a constant term. The lower bound is more difficult to determine, and will be derived only for a specific setup as an example.

Let us now consider two completely different sets of regressors, \(x_{is}\) with \(s=1,\ldots ,m\) and \(w_{ir}\) with \(r=1,\ldots ,h\), without constant terms. All the moment conditions considered below are non-zero only for uncensored observations, so the \(d_i\) index is dropped to simplify the notation (see also Table 1). \(C_i^{\psi ,\psi }\) can be written as a linear combination of other columns of the matrix \(M\); for this purpose, the following expressions are developed. Consider first the generic \(r\)-th condition \(C_i^{\gamma _r,\sigma }\), written as a function of \(u_i\) and \(b_i\):

$$\begin{aligned} C_i^{\gamma _r,\sigma }= \frac{1}{\sigma }\mu _i \left[ s_{\psi }c_{\psi }^2 u_ib_i + s_{\psi }^2 c_{\psi }u_i^2 + c_{\psi }u_i^2-c_{\psi }\right] w_{ir} \end{aligned}$$

We now multiply \(C_i^{\gamma _r,\sigma }\) by \(\sigma \gamma _r\) and then sum over the \(h\) moment conditions, obtaining

$$\begin{aligned} \sigma \sum _{r=1}^{h}\gamma _r C_i^{\gamma _r,\sigma } = \mu _i \left[ s_{\psi }c_{\psi }^2 u_ib_i + s_{\psi }^2 c_{\psi }u_i^2 + c_{\psi }u_i^2-c_{\psi }\right] \sum _{r=1}^{h}\gamma _rw_{ir} \end{aligned}$$

which gives

$$\begin{aligned} \sigma \sum _{r=1}^{h}\gamma _r C_i^{\gamma _r,\sigma } = \mu _i \left[ s_{\psi }c_{\psi }^2 u_ib_i^2 + c_{\psi }^3 u_i^2 b_i-c_{\psi }b_i \right] \end{aligned}$$
(20)

since \(\sum _{r=1}^{h}\gamma _rw_{ir}=b_i\) and \(c_{\psi }s_{\psi }^2 + c_{\psi }= c_{\psi }^3\) (an immediate consequence of \(c_{\psi }^2 - s_{\psi }^2 = 1\)). We will also need

$$\begin{aligned} t_{\psi }^2\sigma \sum _{r=1}^{h}\gamma _r C_i^{\gamma _r,\sigma } = \mu _i \left[ s_{\psi }^3 u_ib_i^2 + s_{\psi }^2 c_{\psi }u_i^2 b_i - \frac{s_{\psi }^2}{c_{\psi }}b_i \right] \end{aligned}$$
(21)

Similar transformations applied to \(C_i^{\gamma _r,\psi }\) yield

$$\begin{aligned} t_{\psi }\sum _{r=1}^{h}\gamma _r C_i^{\gamma _r,\psi } = \mu _i \left[ -s_{\psi }^2c_{\psi }b_i^3 -s_{\psi }c_{\psi }^2 u_i b_i^2 - s_{\psi }^3 u_i b_i^2 - s_{\psi }^2c_{\psi }u_i^2b_i + \frac{s_{\psi }^2}{c_{\psi }}b_i \right] \end{aligned}$$
(22)

Let us now write \(C_i^{\sigma ,\psi }\) as a function of \(u_i\) and \(b_i\). We get

$$\begin{aligned} \sigma t_{\psi }C_i^{\sigma ,\psi } = \mu _i \left[ -2s_{\psi }u_i - \frac{s_{\psi }^2}{c_{\psi }}b_i + s_{\psi }c_{\psi }^2 u_i^3 + s_{\psi }^3 u_i b_i^2 +2s_{\psi }^2c_{\psi }u_i^2 b_i \right] \end{aligned}$$
(23)

and, finally, we need

$$\begin{aligned} t_{\psi }G_i^{\psi } = \mu _i \left[ \frac{s_{\psi }^2}{c_{\psi }}b_i +s_{\psi }u_i \right] \end{aligned}$$
(24)

Since \(C_i^{\psi ,\psi }\), written as a function of \(u_i\) and \(b_i\), is

$$\begin{aligned} C_i^{\psi ,\psi }&= \mu _i \left[ a_i(1-c_i^2)\right] \\&= \mu _i \left[ -s_{\psi }c_{\psi }^2 u_i^3 -s_{\psi }^2c_{\psi }b_i^3 - c_{\psi }^3 u_i^2 b_i -s_{\psi }^3 u_i b_i^2 \right. \\&\quad \left. -2s_{\psi }^2c_{\psi }u_i^2 b_i -2s_{\psi }c_{\psi }^2 u_i b_i^2 + c_{\psi }b_i +s_{\psi }u_i \right] \end{aligned}$$
(25)

it can now be expressed as a linear combination of (20), (21), (22), (23) and (24)

$$\begin{aligned} C_i^{\psi ,\psi } = - \sigma (1-t_{\psi }^2)\sum _{r=1}^{h}\gamma _r C_i^{\gamma _r,\sigma } + t_{\psi }\sum _{r=1}^{h}\gamma _r C_i^{\gamma _r,\psi } - \sigma t_{\psi }C_i^{\sigma ,\psi } -t_{\psi }G_i^{\psi } \end{aligned}$$
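
As a safeguard against slips in the algebra, the identity can be verified symbolically. The sketch below assumes \(s_{\psi } = \sinh \psi \), \(c_{\psi } = \cosh \psi \) and \(t_{\psi } = \tanh \psi \), an assumption consistent with the identity \(c_{\psi }s_{\psi }^2 + c_{\psi } = c_{\psi }^3\) used above:

```python
import sympy as sp

u, b, mu, psi = sp.symbols('u b mu psi')
s, c = sp.sinh(psi), sp.cosh(psi)
t = s / c                                   # t_psi = tanh(psi)

# Right-hand sides of (20), (22), (23) and (24), as functions of u_i, b_i
eq20 = mu * (s*c**2*u*b**2 + c**3*u**2*b - c*b)
eq22 = mu * (-s**2*c*b**3 - s*c**2*u*b**2 - s**3*u*b**2
             - s**2*c*u**2*b + s**2/c*b)
eq23 = mu * (-2*s*u - s**2/c*b + s*c**2*u**3 + s**3*u*b**2
             + 2*s**2*c*u**2*b)
eq24 = mu * (s**2/c*b + s*u)

# C_i^{psi,psi} as expanded in (25)
lhs = mu * (-s*c**2*u**3 - s**2*c*b**3 - c**3*u**2*b - s**3*u*b**2
            - 2*s**2*c*u**2*b - 2*s*c**2*u*b**2 + c*b + s*u)

# The claimed linear combination; note that (21) is just t**2 * eq20
rhs = -(1 - t**2) * eq20 + eq22 - eq23 - eq24
print(sp.simplify((lhs - rhs).rewrite(sp.exp)))   # prints: 0
```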

While it makes sense, for the rank analysis in the bivariate probit model, to consider the case of identical sets of regressors, and therefore to choose the limiting case of only two constants to study the lower bound for \(df\), the choice of the case study for the Heckman selection model needs further discussion. First of all, it is not possible to consider only two constants, since the model would not be identified. Secondly, it is quite common to see applications with two at least slightly different sets of regressors. Therefore, we believe that the simplest reasonable setup is one containing a constant term and a continuous regressor \(w\) in the selection equation, and only a constant term in the main equation. The parameter vector of this model is (\(\beta _0, \gamma _0, \gamma _1, \sigma , \psi \)). Six of the fifteen moment conditions are dropped in the OPG regression. Naturally,

$$\begin{aligned} C_i^{\beta _0,\beta _0}&= d_i\frac{1}{\sigma }\left[ G_i^{\sigma }- \frac{s_{\psi }c_{\psi }}{\sigma }G_i^{\psi } \right] \\ C_i^{\beta _0,\gamma _0}&= d_i\frac{c_{\psi }^2}{\sigma }G_i^{\psi } \end{aligned}$$

are dropped as they are linear combinations of the score elements. Also \(C_i^{\gamma _0,\gamma _0}\) is a linear combination of \(G_i^{\gamma _0}, G_i^{\gamma _1}, G_i^{\psi }\). First notice that in this specific setup

$$\begin{aligned} b_i = \gamma _0 + w_i\gamma _1 \end{aligned}$$

and

$$\begin{aligned} a_i = c_{\psi }\gamma _0 + c_{\psi }w_i\gamma _1 + s_{\psi }u_i; \quad c_i = s_{\psi }\gamma _0 + s_{\psi }w_i\gamma _1 + c_{\psi }u_i \end{aligned}$$

then write

$$\begin{aligned} C_i^{\gamma _0,\gamma _0} = d_i \mu _i \left[ -c_{\psi }^3\gamma _0 -c_{\psi }^3 w_i\gamma _1 -s_{\psi }c_{\psi }^2 u_i\right] - (1-d_i)\mu _i \left[ \gamma _0 + w_i\gamma _1 \right] \end{aligned}$$

Given the following transformations, which hold for uncensored observations,

$$\begin{aligned} -c_{\psi }^2 \gamma _0 G_i^{\gamma _0} = -c_{\psi }^3\gamma _0 \mu _i; \quad -c_{\psi }^2 \gamma _1 G_i^{\gamma _1} = -c_{\psi }^3 w_i\gamma _1 \mu _i \end{aligned}$$

and

$$\begin{aligned} -s_{\psi }c_{\psi }G_i^{\psi }= -\mu _i \left[ -c_{\psi }^3\gamma _0 -c_{\psi }^3w_i\gamma _1 -c_{\psi }^2s_{\psi }u_i \right] , \end{aligned}$$

\( C_i^{\gamma _0,\gamma _0}\) can be written as

$$\begin{aligned} C_i^{\gamma _0,\gamma _0} = -d_i\left[ c_{\psi }s_{\psi }G_i^{\psi } + \gamma _0 G_i^{\gamma _0} + \gamma _1 G_i^{\gamma _1} \right] - (1-d_i)\left[ \gamma _0 G_i^{\gamma _0} + \gamma _1 G_i^{\gamma _1}\right] . \end{aligned}$$

\(C_i^{\gamma _0,\sigma }\) is also a linear combination of \(G_i^{\psi }\) and of the moment conditions \(C_i^{\beta _0, \gamma _1}\) and \(C_i^{\beta _0,\psi }\).

Rearranging \(C_i^{\gamma _0,\sigma }\) and \(C_i^{\beta _0,\psi }\) algebraically, we get

$$\begin{aligned} C_i^{\gamma _0,\sigma } = C_i^{\beta _0,\psi } -\mu _i \frac{1}{\sigma } \left[ s_{\psi }^2c_{\psi }b_i + s_{\psi }c_{\psi }^2 u_i \right] (\gamma _0 + \gamma _1w_i). \end{aligned}$$
(26)

Applying the following transformations

$$\begin{aligned}&\frac{s_{\psi }c_{\psi }\gamma _0}{\sigma }G_i^{\psi }= \frac{\mu _i}{\sigma }\left[ s_{\psi }^2c_{\psi }b_i + s_{\psi }c_{\psi }^2 u_i \right] \gamma _0\\&\frac{s_{\psi }}{c_{\psi }}\gamma _1 C_i^{\beta _0, \gamma _1} = \frac{\mu _i}{\sigma }\left[ s_{\psi }^2c_{\psi }b_i + s_{\psi }c_{\psi }^2 u_i \right] \gamma _1 w_i \end{aligned}$$

we can rewrite (26) as

$$\begin{aligned} C_i^{\gamma _0,\sigma } = C_i^{\beta _0,\psi } - \frac{s_{\psi }c_{\psi }\gamma _0}{\sigma }G_i^{\psi } - \frac{s_{\psi }}{c_{\psi }}\gamma _1 C_i^{\beta _0, \gamma _1}. \end{aligned}$$
(27)

The OPG regression also drops

$$\begin{aligned} C_i^{\gamma _0,\psi } = -\mu _i (c_{\psi }a_i c_i - s_{\psi }). \end{aligned}$$
(28)

Considering

$$\begin{aligned} -\sigma \frac{c_{\psi }}{s_{\psi }}C_i^{\beta _0,\psi } = -\mu _i \left( c_{\psi }a_i c_i + \frac{c_{\psi }}{s_{\psi }}u_i c_i -\frac{c_{\psi }^2}{s_{\psi }}\right) \end{aligned}$$

and

$$\begin{aligned} \frac{\sigma }{s_{\psi }c_{\psi }}C_i^{\gamma _0,\sigma }= \mu _i \left( c_{\psi }b_i u_i + \frac{c_{\psi }^2}{s_{\psi }}u_i^2 - \frac{1}{s_{\psi }}\right) \end{aligned}$$

we can rewrite (28) as

$$\begin{aligned} C_i^{\gamma _0,\psi } = -\sigma \frac{c_{\psi }}{s_{\psi }}C_i^{\beta _0,\psi } + \frac{\sigma }{s_{\psi }c_{\psi }}C_i^{\gamma _0,\sigma } \end{aligned}$$
(29)

and substituting (27) in (29) we finally get

$$\begin{aligned} C_i^{\gamma _0,\psi } = -\sigma \frac{s_{\psi }}{c_{\psi }} C_i^{\beta _0, \psi } - \gamma _0 G_i^{\psi } - \frac{\sigma }{c_{\psi }^2}\gamma _1 C_i^{\beta _0, \gamma _1}. \end{aligned}$$

Finally, the last column to be dropped is \(C_i^{\psi ,\psi }\), as discussed earlier. So, in this simple setup, the number of moment conditions is \(k(k+1)/2 = 15\), of which only 9 are kept for testing.

B.2 Rank analysis for the bivariate probit model

Let us start from the extreme case in which the only regressor in each equation is a constant; then,

$$\begin{aligned} - x_{1i}\beta _1 = a_{i} = \bar{a} \qquad - x_{2i}\beta _2 = b_{i} = \bar{b} \qquad P_i = \bar{P} \end{aligned}$$

are constant across observations, with differences depending only on the observational rule. Therefore, the three score elements \(G_i\) are also constant across observations:

$$\begin{aligned} G_i^{a_i} = \bar{S}^{\bar{a}} \qquad G_i^{b_i} = \bar{S}^{\bar{b}} \qquad G_i^{\psi } = \bar{S}^{\psi } \end{aligned}$$

This makes every moment condition \(C_i\) a linear combination of the score elements (compare Tables 2 and 9), which means that all these conditions are collinear to the score matrix and, as a consequence, do not contribute to the rank of \(M\), as defined in Sect. 3.1.

Table 9 Moment conditions for the bivariate probit model with only two constant terms as regressors

Let us turn to the opposite extreme case, with non-overlapping sets of regressors (Note 19). It is possible to prove that the last moment condition, \(C_i^{\psi ,\psi }\), is always collinear with the rest of the columns of \(M\), even though no obvious redundancy is apparent.

Dropping the \(i\) index for clarity, consider \(x_{1r} \ne x_{2s}\) for every \(r=1,\ldots ,k_1\) and \(s=1,\ldots ,k_2\); then

$$\begin{aligned} a = \sum _{r=1}^{k_1} x_{1r}\beta _{1r} \qquad b = \sum _{s=1}^{k_2} x_{2s}\beta _{2s} \end{aligned}$$

The generic condition associated with the cross-derivatives of \(\beta _{1r}\) and \(\beta _{2s}\) (see also Table 2) may be written as:

$$\begin{aligned} C^{\beta _{1r}, \beta _{2s}} = c_{\psi }^2 S^{\psi }x_{1r}x_{2s} \end{aligned}$$
(30)

Now write the moment condition associated with the cross-derivative of \(\beta _{1t}\) and \(\psi \) as:

$$\begin{aligned} C^{\beta _{1t}, \psi } = - \left[ c_{\psi }\left( \sum _{r=1}^{k_1} x_{1r}\beta _{1r} \right) x_{1t} - s_{\psi }\left( \sum _{s=1}^{k_2} x_{2s}\beta _{2s} \right) x_{1t} \right] c_{\psi }S^{\psi } \end{aligned}$$

By using (30), the previous expression becomes

$$\begin{aligned} C^{\beta _{1t}, \psi } = -c_{\psi }^2 S^{\psi } \sum _{r=1}^{k_1} x_{1t} x_{1r}\beta _{1r} + t_{\psi }\sum _{s=1}^{k_2}C^{\beta _{1t},\beta _{2s}}\beta _{2s} \end{aligned}$$
(31)

By symmetry, the moment condition associated with the cross-derivative of \(\beta _{2m}\) and \(\psi \) can be written as

$$\begin{aligned} C^{\beta _{2m}, \psi } = -c_{\psi }^2 S^{\psi } \sum _{s=1}^{k_2} x_{2m} x_{2s}\beta _{2s} +t_{\psi }\sum _{r=1}^{k_1}C^{\beta _{1r},\beta _{2m}}\beta _{1r} \end{aligned}$$
(32)

Let us now rewrite \(C^{\psi ,\psi }\) as follows:

$$\begin{aligned} C^{\psi ,\psi }&= S^{\psi }\left[ c_{\psi }^2 ab + s_{\psi }^2 ab - s_{\psi }c_{\psi }a^2 - s_{\psi }c_{\psi }b^2 - t_{\psi }\right] \\&= c_{\psi }^2 S^{\psi } \sum _{r=1}^{k_1}\sum _{s=1}^{k_2} x_{1r} x_{2s}\beta _{1r}\beta _{2s} +s_{\psi }^2 S^{\psi }\sum _{r=1}^{k_1}\sum _{s=1}^{k_2} x_{1r} x_{2s}\beta _{1r}\beta _{2s} \\&\quad - s_{\psi }c_{\psi }S^{\psi }\sum _{r=1}^{k_1}\sum _{t=1}^{k_1} x_{1r}x_{1t}\beta _{1r}\beta _{1t} - s_{\psi }c_{\psi }S^{\psi }\sum _{s=1}^{k_2}\sum _{m=1}^{k_2} x_{2s}x_{2m}\beta _{2s}\beta _{2m} - S^{\psi }t_{\psi }. \end{aligned}$$

Note that

$$\begin{aligned} c_{\psi }^2 S^{\psi }\sum _{r=1}^{k_1}\sum _{s=1}^{k_2} x_{1r} x_{2s}\beta _{1r}\beta _{2s}&= \sum _{r=1}^{k_1}\sum _{s=1}^{k_2} C^{\beta _{1r},\beta _{2s}}\beta _{1r}\beta _{2s} \end{aligned}$$
(33)
$$\begin{aligned} s_{\psi }^2 S^{\psi }\sum _{r=1}^{k_1}\sum _{s=1}^{k_2} x_{1r} x_{2s}\beta _{1r}\beta _{2s}&= t_{\psi }^2 \sum _{r=1}^{k_1}\sum _{s=1}^{k_2}C^{\beta _{1r},\beta _{2s}}\beta _{1r} \beta _{2s}. \end{aligned}$$
(34)

and that by multiplying (31) by \(\beta _{1t}t_{\psi }\) and summing over \(t\) one gets

$$\begin{aligned} - s_{\psi }c_{\psi }S^{\psi }\sum _{r=1}^{k_1}\sum _{t=1}^{k_1} x_{1r}x_{1t}\beta _{1r}\beta _{1t} = t_{\psi }\sum _{t=1}^{k_1} C^{\beta _{1t},\psi }\beta _{1t} -t_{\psi }^2 \sum _{t=1}^{k_1}\sum _{s=1}^{k_2}C^{\beta _{1t},\beta _{2s}} \beta _{1t}\beta _{2s}; \end{aligned}$$
(35)

similarly, (32) may be multiplied by \(\beta _{2m}t_{\psi }\) and summed over \(m\) to obtain

$$\begin{aligned} - s_{\psi }c_{\psi }S^{\psi }\sum _{s=1}^{k_2}\sum _{m=1}^{k_2} x_{2s}x_{2m}\beta _{2s}\beta _{2m} = t_{\psi }\sum _{m=1}^{k_2}C^{\beta _{2m},\psi }\beta _{2m} -t_{\psi }^2\sum _{r=1}^{k_1}\sum _{m=1}^{k_2}C^{\beta _{1r}, \beta _{2m}}\beta _{1r}\beta _{2m}. \end{aligned}$$
(36)

Finally, after rearranging (33), (34), (35) and (36), we can rewrite \(C^{\psi ,\psi }\) as

$$\begin{aligned} C^{\psi ,\psi }&= (1-t_{\psi }^2)\sum _{r=1}^{k_1}\sum _{s=1}^{k_2} C^{\beta _{1r},\beta _{2s}}\beta _{1r}\beta _{2s} \\&+\, t_{\psi }\sum _{r=1}^{k_1} C^{\beta _{1r},\psi } \beta _{1r} + t_{\psi }\sum _{s=1}^{k_2} C^{\beta _{2s},\psi } \beta _{2s} - t_{\psi }S^{\psi }, \end{aligned}$$

that is, a linear combination of other columns of \(M\).
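
The collinearity result can also be checked numerically. The sketch below uses illustrative values (the common factor \(S^{\psi }\) is set to 1, since it cancels from both sides), builds the moment conditions from (30) and the expressions above, and verifies the linear combination:

```python
import numpy as np

rng = np.random.default_rng(7)
k1, k2 = 3, 2
x1, b1 = rng.normal(size=k1), rng.normal(size=k1)  # regressors, beta_1
x2, b2 = rng.normal(size=k2), rng.normal(size=k2)  # regressors, beta_2
psi = 0.4
s, c, t = np.sinh(psi), np.cosh(psi), np.tanh(psi)
a, b = x1 @ b1, x2 @ b2                            # the two linear indices

C12 = c**2 * np.outer(x1, x2)                      # C^{beta_1r,beta_2s}, (30)
C1p = -(c * a - s * b) * c * x1                    # C^{beta_1t,psi}
C2p = -(c * b - s * a) * c * x2                    # C^{beta_2m,psi}
Cpp = (c**2 + s**2) * a * b - s * c * (a**2 + b**2) - t  # C^{psi,psi}

rhs = (1 - t**2) * (b1 @ C12 @ b2) + t * (C1p @ b1) + t * (C2p @ b2) - t
print(np.isclose(Cpp, rhs))                        # True
```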

Different combinations of constant and duplicated regressors across equations lead to intermediate cases. We are particularly interested in the case, which often occurs in practice, in which the two equations share the same set of regressors, including constant terms. Besides the \(C^{\psi ,\psi }\) element, we now prove that in this particular case further moment conditions are always collinear in the OPG regression. Moreover, in this special case an explicit formula for determining the rank of \(M\) a priori can be obtained.

Consider \(x_{1} = x_{2} = x\) and \(k_1 = k_2 = q\), so that (again, the \(i\) index is dropped) \(x' = (1, x_{2}, \ldots , x_{q})\) and

$$\begin{aligned} a = \sum _{r=1}^{q}x_{r}\beta _{1r} \qquad b = \sum _{r=1}^{q}x_{r}\beta _{2r}. \end{aligned}$$

Correspondingly, \(G^{\beta _{j}}\) has \(q\) elements

$$\begin{aligned}{}[G^{\beta _{j1}}, G^{\beta _{j2}},\dots , G^{\beta _{jq}} ] \end{aligned}$$

for \(j=1,2\), such that

$$\begin{aligned} G^{\beta _{1r}} = S^{a}x_{r} \qquad G^{\beta _{2r}} = S^{b}x_{r} \end{aligned}$$

for \(r=1,\ldots ,q\) (see also Sect. 4.2).

To begin with, the two moment conditions associated with the constant terms drop because

$$\begin{aligned} C^{\beta _{11},\beta _{11}}&= - \left[ \left( \sum _{r=1}^{q}x_{r}\beta _{1r}\right) S^{a} + s_{\psi }c_{\psi }S^{\psi } \right] = - \left[ \sum _{r=1}^{q}G^{\beta _{1r}}\beta _{1r} + s_{\psi }c_{\psi }S^{\psi } \right] \\ C^{\beta _{21},\beta _{21}}&= - \left[ \left( \sum _{r=1}^{q}x_{r}\beta _{2r}\right) S^{b} + s_{\psi }c_{\psi }S^{\psi } \right] = - \left[ \sum _{r=1}^{q}G^{\beta _{2r}}\beta _{2r} + s_{\psi }c_{\psi }S^{\psi } \right] \end{aligned}$$

Consider now the \(q^2\) elements associated with the cross-derivatives of \(\beta _{1r}\) and \(\beta _{2s}\)

$$\begin{aligned} C^{\beta _{1r},\beta _{2s}} = c_{\psi }^2 S^{\psi }x_{r}x_{s} \end{aligned}$$

with \(r,s=1,\ldots ,q\). Since the sets of regressors are the same, the number of elements dropped due to collinearity will be \(q^2-q(q+1)/2\), plus the condition associated with the cross-derivative of the constant terms

$$\begin{aligned} C^{\beta _{11},\beta _{21}} = c_{\psi }^2 S^{\psi }. \end{aligned}$$

There are also \(2q\) elements, collinear to other columns of \(M\), associated with the cross derivatives of regressors with \(\psi \) since

$$\begin{aligned} C^{\beta _{1t},\psi }&= - \left[ c_{\psi }\left( \sum _{r=1}^{q} x_{r}\beta _{1r} \right) - s_{\psi }\left( \sum _{r=1}^{q} x_{r}\beta _{2r} \right) \right] c_{\psi }S^{\psi }x_{t} \\&= -c_{\psi }^2 S^{\psi } \sum _{r=1}^{q}x_{t}x_{r}\beta _{1r} + s_{\psi }c_{\psi }S^{\psi }\sum _{r=1}^{q}x_{t}x_{r}\beta _{2r}\\&= - \sum _{r=1}^{q}C^{\beta _{1r},\beta _{2t}}\beta _{1r} + t_{\psi }\sum _{r=1}^{q}C^{\beta _{1r},\beta _{2t}}\beta _{2r} \end{aligned}$$

and, similarly,

$$\begin{aligned} C^{\beta _{2t},\psi }&= - \left[ c_{\psi }\left( \sum _{r=1}^{q}x_{r}\beta _{2r} \right) - s_{\psi }\left( \sum _{r=1}^{q}x_{r}\beta _{1r} \right) \right] c_{\psi }S^{\psi }x_{t}\\&= - \sum _{r=1}^{q}C^{\beta _{1r},\beta _{2t}}\beta _{2r} + t_{\psi }\sum _{r=1}^{q}C^{\beta _{1r},\beta _{2t}}\beta _{1r} \end{aligned}$$

Finally, as shown earlier in this section, \(C^{\psi ,\psi }\) is always a linear combination of other columns of \(M\). So, in this setup, the number of degrees of freedom amounts to

$$\begin{aligned} df = \frac{k(k+1)}{2} -2 -q^2 + \frac{q(q+1)}{2} -1 -2q -1 \end{aligned}$$

and since \(k = 2q +1 \) we have

$$\begin{aligned} df = 3\left[ \frac{q(q+1)}{2}-1 \right] . \end{aligned}$$
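
A short check of the algebra above (an illustrative sketch, not part of the paper) confirms that the two expressions for \(df\) agree:

```python
# Check the degrees-of-freedom algebra for a few values of q (so k = 2q + 1)
for q in range(1, 8):
    k = 2 * q + 1
    df_long = k * (k + 1) // 2 - 2 - q**2 + q * (q + 1) // 2 - 1 - 2 * q - 1
    df_short = 3 * (q * (q + 1) // 2 - 1)
    assert df_long == df_short
    print(q, k, df_long)
```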

Cite this article

Lucchetti, R., Pigini, C. A test for bivariate normality with applications in microeconometric models. Stat Methods Appl 22, 535–572 (2013). https://doi.org/10.1007/s10260-013-0236-5
