Generalized likelihood ratio test for varying-coefficient models with different smoothing variables

https://doi.org/10.1016/j.csda.2006.07.027

Abstract

Varying-coefficient models are popular multivariate nonparametric fitting techniques. When all coefficient functions in a varying-coefficient model share the same smoothing variable, the available inference tools include the F-test, the sieve empirical likelihood ratio test and the generalized likelihood ratio (GLR) test. However, when the coefficient functions have different smoothing variables, these tools cannot be applied directly, because the functions are estimated by a different procedure. In this paper, the GLR test is extended to the latter case by using efficient estimators of the coefficient functions. Under the null hypothesis, the newly proposed GLR statistic asymptotically follows a χ²-distribution with scale constant and degrees of freedom independent of the nuisance parameters, a property known as the Wilks phenomenon. Further, we derive its asymptotic power, which is shown to achieve the optimal rate of convergence for nonparametric hypothesis testing. A simulation study is conducted to evaluate the test procedure empirically.

Introduction

Varying-coefficient models offer a flexible but parsimonious alternative to nonparametric models, and have been used in many contexts. A nice feature of varying coefficients is that they allow appreciable flexibility in the structure of fitted models without suffering from the “curse of dimensionality”. The literature on this topic apparently begins with Cleveland et al. (1992), Hastie and Tibshirani (1993) and Chen and Tsay (1993). The first two papers considered i.i.d. samples with possible applications in biostatistics, while the last studied functional-coefficient autoregressive models, in which the coefficients of an autoregressive model are driven by some lagged variables. There has been substantial work on the estimation of coefficient functions, for example, Brumback and Rice (1998), Fan and Zhang (2000), Hoover et al. (1998) and Wu and Zhang (2002). However, work on inference questions, such as whether a parametric family adequately fits a given data set, or whether the coefficient functions are not varying, is very limited. Brunsdon (1999) employed the F-test to test whether all coefficient functions are not varying. For the case where the distribution of the stochastic error in the model belongs to a certain parametric family, Fan et al. (2001) proposed the generalized likelihood ratio (GLR) test to test whether the coefficient functions differ from some specified functions. Fan and Zhang (2004a) proposed the sieve empirical likelihood ratio test for the case where the distribution of the stochastic error in the model is completely unspecified.

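All of this previous work is based on the assumption that all coefficient functions share the same smoothing variable in the varying-coefficient model

$$Y=\sum_{\alpha=1}^{p} a_\alpha(X)\,Z_\alpha+\varepsilon, \tag{1}$$

where $(Y,X,Z)$ is random, $Y\in\mathbb{R}$, $X\in\mathbb{R}$ and $Z=(Z_1,\ldots,Z_p)^T\in\mathbb{R}^p$; $\{a_\alpha(\cdot)\}_{\alpha=1}^{p}$ are some measurable functions from $\mathbb{R}$ to $\mathbb{R}$; $\varepsilon$ is a random error with $E(\varepsilon)=0$ and $\mathrm{Var}(\varepsilon)=\sigma^2$, independent of $(X,Z)$.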

However, the assumption that all coefficient functions share the same smoothing variable is restrictive and limits the range of applications. It is therefore desirable to relax it. We allow different smoothing variables for different coefficient functions by considering the varying-coefficient model

$$Y=\sum_{\alpha=1}^{p} a_\alpha(X_\alpha)\,Z_\alpha+\varepsilon, \tag{2}$$

where $(Y,X,Z)$ is random, $X=(X_1,\ldots,X_p)^T\in\mathbb{R}^p$ and the other quantities are defined as in (1).

In this paper, efficient estimators of the coefficient functions in model (2) are obtained by the local linear method, the averaging method and the backfitting technique, and they are shown to have the same properties as the local linear estimators for model (1). After fitting the varying-coefficient model (2), one often asks whether the varying coefficients are in fact not varying, or whether the coefficient functions differ from some given functions. Hastie and Tibshirani (1993) briefly discussed the problem and suggested an F-test based on estimators using natural cubic splines. That test has varying degrees of freedom and involves heavy computation. However, to the best of our knowledge, there has been virtually no formal or theoretical work in the literature on testing such hypotheses based on the backfitting estimators. The tools used for inference in model (1) cannot be used directly in model (2) because the functions in the two models are estimated by different procedures.

We attempt to develop an easily understandable and generally applicable approach to the testing problem. We extend the GLR test to address the above testing question. This not only provides a useful tool for a frequently asked question, but also enriches the GLR test theory.

The GLR test was proposed by Fan et al. (2001) for inference in nonparametric models. It is constructed by replacing the maximum likelihood estimator in the maximum likelihood ratio test by a reasonable nonparametric estimator of the coefficient function. The GLR test has been extended to many settings because it has several nice properties. First, the test statistic asymptotically follows a rescaled χ²-distribution with scale constant and degrees of freedom independent of the nuisance parameters, a property known as the Wilks phenomenon. Second, the test is asymptotically optimal in terms of the rate of convergence for nonparametric hypothesis testing, and is adaptively optimal in the sense of Spokoiny when a simple choice of adaptive smoothing parameters is used. Fan and Jiang (2005) extended the GLR test to additive models, while Fan and Zhang (2004b) used the GLR test for spectral density.
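The Wilks phenomenon can be stated a little more concretely. The display below is a sketch in the notation commonly used for GLR tests, not a formula taken from this excerpt: $\lambda_n$ (the GLR statistic), $r_K$ (the scale constant) and $\mu_n$ (the degrees of freedom) are symbols assumed here for illustration. Because $\mu_n\to\infty$ as $n\to\infty$, the statement that $r_K\lambda_n$ is asymptotically $\chi^2_{\mu_n}$ is usually read as a normal approximation under the null hypothesis:

$$\frac{r_K\lambda_n-\mu_n}{\sqrt{2\mu_n}}\;\xrightarrow{\;d\;}\;N(0,1).$$

The practical point is that neither $r_K$ nor $\mu_n$ depends on nuisance parameters, so critical values can be calibrated without knowing the true coefficient functions or the error variance.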

We extend the GLR test to model (2) by using the efficient estimators of the coefficient functions. It is shown that the proposed GLR test shares the same nice properties as the test of Fan et al. (2001).

The article proceeds as follows. In Section 2, we describe the efficient estimators of the varying coefficients. Section 3 develops the theoretical framework for the GLR test, its asymptotic null distribution, power and minimax rate. A simulation study is conducted in Section 4 to empirically evaluate the GLR test procedure, and to compare its performance with the F-test. Section 5 gives a conclusion and technical proofs are outlined in the Appendix.

Efficient estimators

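Suppose that $(Y_i,X_i,Z_i)$, $i=1,\ldots,n$, in model (2) are i.i.d. First, for every $\alpha=1,\ldots,p$, we define an initial estimator of $a_\alpha(x_{0\alpha})$, by the local linear method, as

$$\check a_\alpha(x_{0\alpha})=\frac{1}{n}\sum_{j=1}^{n}\tilde a_\alpha\left(X_{j1},\ldots,X_{j,\alpha-1},x_{0\alpha},X_{j,\alpha+1},\ldots,X_{jp}\right),$$

where

$$\tilde a_\alpha(x_0)=\sum_{i=1}^{n}e_{\alpha,2p}^{T}\left(U^{T}WU\right)^{-1}U_iW_{ii}Y_i,$$

$x_0=(x_{01},\ldots,x_{0p})^T$, $e_{\alpha,2p}$ is a unit vector with 1 at its $\alpha$th position, $U$ is an $n\times 2p$ matrix with

$$U_i^{T}=\left(Z_{i1},\ldots,Z_{ip},\frac{X_{i1}-x_{01}}{q_1}Z_{i1},\ldots,\frac{X_{ip}-x_{0p}}{q_p}Z_{ip}\right)$$

as its $i$th row, $W=\mathrm{diag}(W_{11},\ldots,W_{nn})$ with $W_{ii}=\prod_{\alpha=1}^{p}Q_{\alpha,q_\alpha}(X_{i\alpha}-x_{0\alpha})$, $X_{i\alpha}$ and $Z_{i\alpha}$ are the $\alpha$th entries of the $i$th observation $X_i$ and $Z_i$,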
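The initial estimator described here can be sketched numerically. The code below is only an illustration under stated assumptions: it takes the weights $W_{ii}$ to be a product of rescaled kernels $Q(u/q_\alpha)/q_\alpha$, uses a Gaussian kernel for $Q$ (the excerpt does not say which kernel the authors use), and the function names `local_linear_coef` and `averaged_estimator` are hypothetical. It covers only the local linear and averaging steps, not the backfitting step.

```python
import numpy as np

def gaussian_kernel(u):
    # Assumed kernel Q; the excerpt does not specify the authors' choice.
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def local_linear_coef(x0, X, Z, Y, q):
    """Local linear fit at the point x0 (length p).

    Returns the first p entries of (U^T W U)^{-1} U^T W Y, i.e. the
    local estimates of a_1(x0_1), ..., a_p(x0_p).
    """
    p = X.shape[1]
    D = (X - x0) / q                      # scaled differences, shape (n, p)
    U = np.hstack([Z, D * Z])             # n x 2p local design matrix
    w = np.prod(gaussian_kernel(D) / q, axis=1)  # assumed product-kernel W_ii
    UtW = U.T * w                         # U^T W without forming diag(W)
    theta = np.linalg.solve(UtW @ U, UtW @ Y)
    return theta[:p]

def averaged_estimator(alpha, x0_alpha, X, Z, Y, q):
    """Average the alpha-th local estimate over the observed values of the
    other smoothing variables, with the alpha-th coordinate fixed."""
    values = []
    for j in range(X.shape[0]):
        x0 = X[j].copy()
        x0[alpha] = x0_alpha              # replace only the alpha-th coordinate
        values.append(local_linear_coef(x0, X, Z, Y, q)[alpha])
    return float(np.mean(values))
```

Each evaluation of `averaged_estimator` solves n small weighted least-squares problems, so this direct version costs on the order of n² operations per evaluation point; it is meant to mirror the formula, not to be efficient.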

Generalized likelihood ratio test

In this section we define our proposed GLR statistic and develop its asymptotic theory under model (2), based on the efficient estimators of the coefficient functions defined in (4). The Wilks phenomenon and optimality are unveiled.

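Under model (2), we consider the simple null hypothesis testing problem:

$$H_0: a_\alpha(x_\alpha)=a_{\alpha 0}(x_\alpha)\quad\text{versus}\quad H_1: a_\alpha(x_\alpha)\neq a_{\alpha 0}(x_\alpha),\qquad \alpha=1,\ldots,p. \tag{5}$$

If $\{a_{\alpha 0}(x_\alpha)\}_{\alpha=1}^{p}$ are some constants, (5) tests whether the coefficient functions are not varying; if $\{a_{\alpha 0}(x_\alpha)\}_{\alpha=1}^{p}$ are some functions, it tests whether the coefficient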
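As a rough illustration of how a GLR statistic for this kind of problem is computed, the sketch below uses the form λ = (n/2)·log(RSS₀/RSS₁) that is standard for GLR tests with normal errors. This excerpt does not show the exact definition used in the paper, so that form is an assumption, as are the function name `glr_statistic` and its arguments. Here `fitted_null` holds the fitted values under the null functions and `fitted_alt` holds the fitted values from the nonparametric estimators.

```python
import numpy as np

def glr_statistic(Y, fitted_null, fitted_alt):
    """Assumed GLR form: (n / 2) * log(RSS_0 / RSS_1).

    RSS_0 is the residual sum of squares under the null functions,
    RSS_1 the residual sum of squares under the nonparametric fit.
    """
    Y = np.asarray(Y, dtype=float)
    rss0 = np.sum((Y - np.asarray(fitted_null, dtype=float)) ** 2)
    rss1 = np.sum((Y - np.asarray(fitted_alt, dtype=float)) ** 2)
    return 0.5 * Y.size * np.log(rss0 / rss1)
```

Large values of the statistic indicate that the nonparametric fit reduces the residual sum of squares by more than would be expected under the null hypothesis.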

Simulation example

In this section we report the results of our simulation experiments. The purposes of the simulations are twofold. First, the Wilks phenomenon is demonstrated; and second, the power of the new GLR statistic is investigated. The effect of the error distribution on the performance of the test is also investigated. Numerical results show that the new GLR test performs satisfactorily.

We consider the model

$$Y=a_1(X_1)Z_1+a_2(X_2)Z_2+\varepsilon,$$

and test the null hypothesis

$$H_0: a_1(x_1)=\sin(\pi x_1),\quad a_2(x_2)=\cos(\pi x_2),$$

against the
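Data from this model under the null hypothesis can be generated as in the sketch below. The excerpt does not give the distributions of the covariates or the error, so the choices here (uniform X on [0, 1], standard normal Z, normal errors with standard deviation `sigma`) are assumptions made for illustration, and the function names `simulate_null` and `null_fit` are hypothetical.

```python
import numpy as np

def null_fit(X, Z):
    """Fitted values under H0: a1(x) = sin(pi x), a2(x) = cos(pi x)."""
    return np.sin(np.pi * X[:, 0]) * Z[:, 0] + np.cos(np.pi * X[:, 1]) * Z[:, 1]

def simulate_null(n, sigma=0.5, seed=0):
    """Draw n observations from the two-coefficient model under H0."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n, 2))   # assumed smoothing-variable design
    Z = rng.normal(size=(n, 2))              # assumed covariate distribution
    eps = rng.normal(scale=sigma, size=n)    # assumed normal errors
    Y = null_fit(X, Z) + eps
    return X, Z, Y
```

Repeating such draws and recomputing the test statistic each time is one way to check empirically whether its null distribution is stable across error variances, which is what the Wilks phenomenon predicts.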

Conclusion

We have proposed a GLR test for the coefficient functions of a varying-coefficient model. Under the null hypothesis, the test statistic has been shown to follow the chi-square distribution asymptotically with scale constant and degrees of freedom independent of the nuisance parameters, a property known as the Wilks phenomenon. We have also derived an approximation to the power of the test under some regularity conditions. Monte Carlo simulations have been used to investigate the distribution of the test

Acknowledgments

The authors thank a Co-Editor and an anonymous referee for their constructive comments that have led to the improved quality of the paper. This research was supported by a research grant from the Hong Kong Polytechnic University Research Committee.

References

  • B. Brumback et al., Smoothing spline models for the analysis of nested and crossed samples of curves, J. Amer. Statist. Assoc. (1998)
  • C. Brunsdon, Some notes on parametric significance tests for geographically weighted regression, J. Regional Sci. (1999)
  • R. Chen et al., Functional-coefficient autoregressive models, J. Amer. Statist. Assoc. (1993)
  • W.S. Cleveland et al., Local regression models
  • P. De Jong, A central limit theorem for generalized quadratic forms, Probab. Theory Related Fields (1987)
  • J. Fan et al., Nonparametric inferences for additive models, J. Amer. Statist. Assoc. (2005)
  • J. Fan et al., Functional linear models with applications to longitudinal data, J. Roy. Statist. Soc. Ser. B (2000)
  • J. Fan et al., Sieve empirical likelihood ratio tests for nonparametric functions, Ann. Statist. (2004)
There are more references available in the full text version of this article.
