A new test for heteroscedasticity in single-index models

https://doi.org/10.1016/j.cam.2020.112993

Abstract

In this article, we propose a new test for heteroscedasticity in single-index models (SIM) based on the pairwise distances between sample points. The statistic can be formulated as a U-statistic and does not require nonparametric estimation of the conditional variance function. We show that the statistic is asymptotically normal, derive a computationally feasible approximation to its limiting null distribution, and prove that this approximation is valid. The test is easy to implement, and the convergence rate of the statistic does not depend on the dimension of the covariates. Finally, numerical simulations and a real data example illustrate the proposed method.

Introduction

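In this article, we consider the SIM

$$Y = g(X^{T}\theta) + \varepsilon, \tag{1}$$

where $X=(X_1,\ldots,X_p)^{T}\in\mathbb{R}^{p}$ are covariates, $g(\cdot)$ is an unknown smooth link function, and $\varepsilon$ is an independent random error with $E(\varepsilon\mid X)=0$. The parameter vector $\theta=(\theta_1,\ldots,\theta_p)^{T}\in\mathbb{R}^{p}$ is unknown, with $\|\theta\|=1$ and $\theta_1>0$ for identification.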

In regression analysis, we often assume that the error terms have a common variance. Under this assumption, many authors have investigated inference for SIM; see [1], [2], [3], [4], [5], [6] for more details. Without this assumption, more complicated methods are needed to estimate the unknown parameter and the regression function. Therefore, detecting heteroscedasticity in SIM is an important issue. Our objective is to check for variance heterogeneity in (1) by testing the hypotheses

$$H_0:\ \exists\,\sigma^{2}>0,\ E(\varepsilon^{2}\mid X)=\sigma^{2}(X)=\sigma^{2} \quad\text{versus}\quad H_1:\ \forall\,\sigma^{2}>0,\ E(\varepsilon^{2}\mid X)\neq\sigma^{2}. \tag{2}$$

Clearly, the heteroscedasticity test in (2) is equivalent to determining whether $E(\varepsilon^{2}\mid X)$ is equal to $E(\varepsilon^{2})=\sigma^{2}$.

A number of authors have investigated heteroscedasticity tests for regression models, such as [7], [8], [9], [10], [11], [12]. For SIM, however, the literature on testing heteroscedasticity is sparse. [13] proposed a kernel smoothing-based statistic for checking heteroscedasticity in SIM, which suffers from the dimensionality problem because multivariate nonparametric functions are estimated inefficiently. As a result, the significance level cannot be well maintained when the limiting null distribution is applied with moderate sample sizes, and the test in [13] can only detect alternatives that differ from the null hypothesis at the rate $O(n^{-1/2}h^{-1/4})$. Therefore, the test statistic in [13] is less powerful.

In this paper, we develop a new statistic for detecting heteroscedasticity in SIM. The statistic is based on a weighted integral of the residual-marked characteristic function. The weight function plays an important role: we use the density function of a spherical stable law, in which case the weighted integral can be transformed into a simple unconditional expectation. As a result, the statistic is constructed only from pairwise distances between points in the sample, which avoids the dimensionality problem of the kernel smoothing-based statistic in [13]. To the best of our knowledge, this work is the first to apply characteristic functions to check heteroscedasticity in SIM.

The rest of this article is organized as follows. In Section 2, the test statistic is developed and its asymptotic properties are established. In Section 3, we propose a simple bootstrap algorithm to detect heteroscedasticity in SIM. In Section 4, numerical studies are conducted to evaluate the performance of the test. In Section 5, a real dataset is analyzed to illustrate the proposed methodology. Conclusions are given in Section 6. Technical assumptions and proofs are provided in the Appendix.

Section snippets

Construction of test statistic

Let $r=\varepsilon^{2}-\sigma^{2}$ with $\sigma^{2}=E(\varepsilon^{2})$. Under $H_0$, we have $E(r\mid X)=0$. By the uniqueness of the Fourier transform, $H_0$ can be restated equivalently as

$$H_0:\ \phi(t)=E\left[r\,e^{\mathrm{i}t^{T}X}\right]=0,\quad \forall\,t\in\mathbb{R}^{p}.$$

Because $\phi(t)$ is a function of $t$ and not a statistic by itself, we consider

$$D_{\omega}=\int_{\mathbb{R}^{p}}|\phi(t)|^{2}\,\omega(t)\,dt,$$

where $\omega(t)\ge 0$ is a suitable weight function. By the definition of the complex modulus, we have

$$|\phi(t)|^{2}=E\left[\cos\left(t^{T}(X-X')\right)r\,r'\right],$$

where $(X',r')$ is an independent copy of $(X,r)$. The characteristic function of a spherical
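The following is a minimal sketch of how a pairwise-distance statistic of this kind might be computed; it is not the authors' implementation. Since the snippet above is cut off before the weight is specified, the sketch assumes that $\omega$ is the density of a spherical stable law with characteristic function $\exp(-\|x\|^{a})$, $0<a\le 2$, so that $D_{\omega}=E[r\,r'\exp(-\|X-X'\|^{a})]$, and it estimates this expectation by a U-statistic over plug-in residuals. The function name `pairwise_u_statistic` and the parameter `a` are hypothetical.

```python
import numpy as np

def pairwise_u_statistic(X, resid, a=1.0):
    """U-statistic estimate of E[r r' exp(-||X - X'||^a)].

    Assumption: the weight is a spherical stable density whose
    characteristic function is exp(-||x||^a), with 0 < a <= 2.
    X     : (n, p) array of covariates
    resid : (n,) array of fitted residuals from the SIM
    """
    X = np.asarray(X, dtype=float)
    resid = np.asarray(resid, dtype=float)
    n = X.shape[0]
    # r_i = eps_i^2 - sigma^2, with sigma^2 replaced by the sample mean
    r = resid**2 - np.mean(resid**2)
    # Euclidean distances between all pairs of sample points
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    W = np.exp(-dist**a)
    # A U-statistic excludes the i == j terms
    np.fill_diagonal(W, 0.0)
    return float(r @ W @ r) / (n * (n - 1))
```

Only the $n\times n$ matrix of pairwise distances enters the computation, so no smoothing over the $p$-dimensional covariate space is needed. Forming that matrix by broadcasting costs $O(n^{2}p)$ memory, which is fine for sample sizes in the hundreds.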

Estimation of SIM

We first estimate $\theta$ and $g(\cdot)$ using the minimum average variance estimation (MAVE) method; see [5], [6] for more details. The main implementation steps are as follows:

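  • (1) For the given initial estimate $\theta$, calculate $\hat f_{\theta}(X_j^{T}\theta)=n^{-1}\sum_{i=1}^{n}K_h(X_{ij}^{T}\theta)$ and

    $$\begin{pmatrix} a_j^{\theta} \\ b_j^{\theta}h \end{pmatrix}=\left\{\sum_{i=1}^{n}K_h(X_{ij}^{T}\theta)\begin{pmatrix}1\\ X_{ij}^{T}\theta/h\end{pmatrix}\begin{pmatrix}1\\ X_{ij}^{T}\theta/h\end{pmatrix}^{T}\right\}^{-1}\times\sum_{i=1}^{n}K_h(X_{ij}^{T}\theta)\begin{pmatrix}1\\ X_{ij}^{T}\theta/h\end{pmatrix}Y_i,$$

    where $X_{ij}=X_i-X_j$, $K(\cdot)$ is a kernel function, and $a_j^{\theta}$ and $b_j^{\theta}$ are estimators of $g(X_j^{T}\theta)$ and $g'(X_j^{T}\theta)$, respectively.

  • (2) Calculate

    $$\theta=\left\{\sum_{i,j=1}^{n}K_h(X_{ij}^{T}\theta)\,\hat\rho_j^{\theta}\,(b_j^{\theta})^{2}X_{ij}X_{ij}^{T}\big/\hat f_{\theta}(X_j^{T}\theta)\right\}^{-1}\times\sum_{i,j=1}^{n}K_h(X_{ij}^{T}\theta)\,\hat\rho_j^{\theta}\,b_j^{\theta}X_{ij}(Y_i-a_j^{\theta})\big/\hat f_{\theta}(X_j^{T}\theta),$$

    where $\hat\rho_j^{\theta}=\rho_n(\hat f_{\theta}($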
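Step (1) above is a kernel-weighted local linear fit at each index value $X_j^{T}\theta$. The sketch below illustrates that step only, under assumptions that are not in the source: a Gaussian kernel, $K_h(u)=K(u/h)/h$, and a hypothetical function name `local_linear_step`. The update in step (2) is not implemented, because its trimming function is cut off in the snippet.

```python
import numpy as np

def local_linear_step(X, Y, theta, h):
    """Local linear estimates of g and g' at each index X_j^T theta.

    Assumptions: Gaussian kernel K and K_h(u) = K(u / h) / h.
    Returns (a, b, f_hat): estimates of g(X_j^T theta), of
    g'(X_j^T theta), and of the index density at X_j^T theta.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    theta = np.asarray(theta, dtype=float)
    theta = theta / np.linalg.norm(theta)  # enforce ||theta|| = 1
    n = X.shape[0]
    index = X @ theta
    a = np.empty(n)
    b = np.empty(n)
    f_hat = np.empty(n)
    for j in range(n):
        u = (index - index[j]) / h  # X_ij^T theta / h, i = 1..n
        w = np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * h)
        f_hat[j] = w.mean()  # n^{-1} sum_i K_h(X_ij^T theta)
        Z = np.column_stack([np.ones(n), u])  # rows (1, X_ij^T theta / h)
        A = Z.T @ (w[:, None] * Z)
        c = Z.T @ (w * Y)
        sol = np.linalg.solve(A, c)  # (a_j, b_j * h)
        a[j] = sol[0]
        b[j] = sol[1] / h
    return a, b, f_hat
```

When $Y$ is exactly linear in the index, the weighted least squares fit reproduces the line, so `a` returns the responses and `b` the slope; this gives a quick sanity check on the implementation.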

Numerical studies

In this section, we investigate the finite-sample performance of the proposed test statistic through numerical studies. To assess size and power, the following two examples are designed, in which 1000 replications of the experiment are used to calculate empirical sizes and powers at the significance level α=0.05. The sample sizes are n=100, 200 and 400, and the number of bootstrap samples is set to B=500. For comparison, we consider two competitors, i.e., (a) Zheng’s

Delft data

In this section, the Delft dataset is analyzed for illustration. The dataset comprises 308 full-scale experiments performed at the Delft Ship Hydromechanics Laboratory and can be obtained from the website https://archive.ics.uci.edu/ml/machine-learning-databases/00243/. These experiments are derived from a parent form closely related to the ‘Standfast 43’ designed by Frans Maas. The covariates are $X_{i1}$, the longitudinal position of the center of buoyancy; $X_{i2}$, the prismatic coefficient; $X_{i3}$

Conclusions

In this article, a new test is proposed to check for heteroscedasticity in SIM. The test statistic is based on the pairwise distances between sample points. The results show that the statistic is asymptotically normal under the null, local alternative, and fixed alternative hypotheses, with different asymptotic means and variances in each case. The proposed method is easy to implement and the convergence rate of the statistic does not depend on the dimension of the covariates, which

Acknowledgments

We are very grateful to the Associate Editor and reviewers for their helpful comments and suggestions, which greatly improved our work. This research was supported by the Science Foundation of Xuzhou University of Technology of China (No. XKY2018120).

1

Yan-Yong Zhao and Jian-Qiang Zhao are co-first authors of the paper.
