
A bias reducing technique in kernel distribution function estimation


Abstract

In this paper we suggest a bias reducing technique in kernel distribution function estimation. The proposed estimator is a convex combination of three kernel estimators, and its bias is of the fourth power of the bandwidth, whereas the bias of the ordinary kernel distribution function estimator is of the second power. The variance of the proposed estimator remains of the same order as that of the kernel distribution function estimator. Numerical results based on simulation studies confirm these asymptotic properties.
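The construction can be read off the Appendix: with \(\widehat{F}\) the kernel distribution function estimator and \(\widehat{f}\) the kernel density estimator, the combination is \(\tilde{F}(x)=\{\lambda \widehat{F}_1(x)+\widehat{F}(x)+\lambda \widehat{F}_2(x)\}/(2\lambda+1)\), where \(\widehat{F}_1(x)=\widehat{F}(x-lh)+lh\,\widehat{f}(x-lh)\), \(\widehat{F}_2(x)=\widehat{F}(x+lh)-lh\,\widehat{f}(x+lh)\), and \(l=\{\mu_2(2\lambda+1)/(2\lambda)\}^{1/2}\). The following minimal sketch implements this; the Epanechnikov kernel, the function names, and the default \(\lambda=1\) are illustrative assumptions, not choices made in the paper.

```python
# Sketch of the bias-reduced estimator reconstructed from the Appendix:
#   F~(x) = {lam*F1(x) + F(x) + lam*F2(x)} / (2*lam + 1),
#   F1(x) = Fhat(x - l*h) + l*h*fhat(x - l*h),
#   F2(x) = Fhat(x + l*h) - l*h*fhat(x + l*h),
#   l     = sqrt(mu2*(2*lam + 1)/(2*lam)).
# The kernel choice (Epanechnikov) and all names here are illustrative.
import numpy as np

def K(t):
    """Epanechnikov kernel density on [-1, 1]; mu2 = int t^2 K(t) dt = 1/5."""
    return np.where(np.abs(t) <= 1, 0.75 * (1 - t**2), 0.0)

def W(t):
    """Antiderivative of K, with W(-1) = 0 and W(1) = 1."""
    t = np.clip(t, -1.0, 1.0)
    return 0.25 * (2 + 3 * t - t**3)

def F_hat(x, data, h):
    """Ordinary kernel distribution function estimator."""
    return W((x - data) / h).mean()

def f_hat(x, data, h):
    """Kernel density estimator."""
    return K((x - data) / h).mean() / h

def F_tilde(x, data, h, lam=1.0, mu2=0.2):
    """Convex combination of three kernel estimators with O(h^4) bias."""
    l = np.sqrt(mu2 * (2 * lam + 1) / (2 * lam))
    F1 = F_hat(x - l * h, data, h) + l * h * f_hat(x - l * h, data, h)
    F0 = F_hat(x, data, h)
    F2 = F_hat(x + l * h, data, h) - l * h * f_hat(x + l * h, data, h)
    return (lam * F1 + F0 + lam * F2) / (2 * lam + 1)
```

The weights \(\lambda/(2\lambda+1)\), \(1/(2\lambda+1)\), and \(\lambda/(2\lambda+1)\) are positive and sum to one, so \(\tilde{F}\) is indeed a convex combination; the choice of \(l(\lambda)\) is what cancels the \(h^2\) and \(h^3\) bias terms, as derived in the Appendix.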



Acknowledgements

This research was supported by a Korea Science and Engineering Foundation grant (R14-2003-002-01000-0). The authors thank the editor and referees for their helpful comments, which greatly improved the paper.

Author information


Correspondence to Choongrak Kim.

Appendix

Proof of Theorem 1

By a Taylor expansion,

$$\begin{array}{l}E \widehat{F}(x+l h)\\ =\int\limits_{-\infty}^{\infty} W\left(\frac{x+l h-y}{h}\right) f(y)\, \mathrm{d} y\\ =\int\limits_{-\infty}^{x-(1-l) h} f(y) \,\mathrm{d} y+\int\limits_{x-(1-l) h}^{x+(1+l) h} W\left(\frac{x+l h-y}{h}\right) f(y)\, \mathrm{d} y\\ =F(x-(1-l) h)+\int\limits_{-1}^{1} W(t) f(x-(t-l) h)\, h \,\mathrm{d} t\\ =F(x)-(1-l) h f(x)+\frac{1}{2}(1-l)^{2} h^{2} f^{\prime}(x)-\frac{1}{6}(1-l)^{3} h^{3} f^{\prime \prime}(x)\\ \quad+\frac{1}{24}(1-l)^{4} h^{4} f^{\prime \prime \prime}(x)+\int\limits_{-1}^{1} h W(t)\left\{f(x)-(t-l) h f^{\prime}(x)\right.\\ \quad\left.+\frac{1}{2}(t-l)^{2} h^{2} f^{\prime \prime}(x)-\frac{1}{6}(t-l)^{3} h^{3} f^{\prime \prime \prime}(x)+o\left(h^{3}\right)\right\} \mathrm{d} t+o\left(h^{4}\right), \end{array}$$

and by using the following facts:

$$\begin{array}{l}{\int(t-l) W(t) \mathrm{d} t=\left(1-\mu_{2}\right) / 2-l}, \\ {\int(t-l)^{2} W(t) \mathrm{d} t=\frac{1}{3}-\left(1-\mu_{2}\right) l+l^{2}}, \\ {\int(t-l)^{3} W(t) \mathrm{d} t=\left(1-\mu_{4}\right) / 4-l+\frac{3}{2} l^{2}\left(1-\mu_{2}\right)-l^{3}},\end{array}$$
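Each of these facts follows by integration by parts, using \(W' = K\), \(W(-1) = 0\), \(W(1) = 1\), and the symmetry of \(K\) (so that \(\mu_j = \int t^j K(t)\, \mathrm{d} t\) with \(\mu_1 = \mu_3 = 0\)); for instance,

$$\int\limits_{-1}^{1} t\, W(t)\, \mathrm{d} t=\left[\frac{t^{2}}{2} W(t)\right]_{-1}^{1}-\frac{1}{2} \int\limits_{-1}^{1} t^{2} K(t)\, \mathrm{d} t=\frac{1-\mu_{2}}{2} \quad \text{and} \quad \int\limits_{-1}^{1} W(t) \,\mathrm{d} t=\left[t\, W(t)\right]_{-1}^{1}-\int\limits_{-1}^{1} t K(t)\, \mathrm{d} t=1,$$

which give the first fact; the others are similar. Combining the expansion with these facts,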

we have

$$\begin{array}{l}{E \widehat{F}(x+l h)} \\ {=F(x)+l f(x) h+\frac{1}{2}\left(l^{2}+\mu_{2}\right) f^{\prime}(x) h^{2}+\frac{1}{6}\left(l^{3}+3 l \mu_{2}\right) f^{\prime \prime}(x) h^{3}} \\ {\quad+\frac{1}{24}\left(l^{4}+6 l^{2} \mu_{2}+\mu_{4}\right) f^{\prime \prime \prime}(x) h^{4}+o\left(h^{4}\right)}.\end{array}$$

Also, it is easy to show that

$$\begin{aligned} E\widehat{ f}(x+l h)=& f(x)+l f^{\prime}(x) h+\frac{1}{2}\left(l^{2}+\mu_{2}\right) f^{\prime \prime}(x) h^{2}+\frac{1}{6}\left(l^{3}+3 l \mu_{2}\right) f^{\prime \prime \prime}(x) h^{3} \\ &+\frac{1}{24}\left(l^{4}+6 l^{2} \mu_{2}+\mu_{4}\right) f^{(4)}(x) h^{4}+o\left(h^{4}\right). \end{aligned}$$

Hence,

$$\begin{aligned} E[\widehat{F}(x+l h)-lh \widehat{f}(x+l h)]=& F(x)+\frac{1}{2}\left(\mu_{2}-l^{2}\right) f^{\prime}(x) h^{2}-\frac{1}{3} l^{3} f^{\prime \prime}(x) h^{3} \\ &+\frac{1}{24}\left(\mu_{4}-3 l^{4}-6 l^{2} \mu_{2}\right) f^{\prime \prime \prime}(x) h^{4}+o\left(h^{4}\right). \end{aligned}$$

Therefore, \(E \tilde{F}(x)\) can be computed by applying this expansion with \(l = 0\), \(l_1\), and \(l_2\), and the terms in \(h^2\) and \(h^3\) vanish if and only if

$$\lambda_{1} l_{1}^{3}+\lambda_{2} l_{2}^{3}=0$$

and

$$\lambda_{1}\left(\mu_{2}-l_{1}^{2}\right)+\mu_{2}+\lambda_{2}\left(\mu_{2}-l_{2}^{2}\right)=0.$$

By letting \(-l_1 = l_2 = l\), the first equation forces \(\lambda_1 = \lambda_2 \equiv \lambda\) (for \(l \neq 0\)), and the two equations then imply

$$l \equiv l(\lambda)=\sqrt{\frac{2 \lambda+1}{2 \lambda} \mu_{2}}.$$
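Indeed, with \(\lambda_1 = \lambda_2 = \lambda\) and \(-l_1 = l_2 = l\), the first condition holds automatically and the second reduces to

$$2 \lambda\left(\mu_{2}-l^{2}\right)+\mu_{2}=0, \quad \text{that is,} \quad l^{2}=\frac{2 \lambda+1}{2 \lambda}\, \mu_{2}.$$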

Finally, if we substitute \(\lambda_1 = \lambda_2 = \lambda\) and \(-l_1 = l_2 = l = \{\mu_2(2\lambda+1)/(2\lambda)\}^{1/2}\) into \(E \tilde{F}(x)\), then we get the desired result for the bias part. Similar computations, though quite tedious, give the variance part. First, note that

$$\begin{aligned} \operatorname{Var}[\tilde{F}]=\frac{1}{(2 \lambda+1)^{2}} &\left[\lambda^{2} \operatorname{Var}\left(\widehat{F}_{1}\right)+\operatorname{Var}(\widehat{F})+\lambda^{2} \operatorname{Var}\left(\widehat{F}_{2}\right)\right.\\ &\left. +2 \lambda \operatorname{Cov}\left(\widehat{F}_{1}, \widehat{F}\right)+2 \lambda \operatorname{Cov}\left(\widehat{F}, \widehat{F}_{2}\right)+2 \lambda^{2} \operatorname{Cov}\left(\widehat{F}_{1}, \widehat{F}_{2}\right)\right]. \end{aligned}$$

Now we compute each term on the right-hand side of \(\operatorname{Var}[\tilde{F}]\) except \(\operatorname{Var}(\widehat{F})\), which is given in Sect. 2. Let

$$W_{2}=\int\limits_{-1}^{1} W^{2}(t) \,\mathrm{d} t, \quad K_{2}=\int\limits_{-1}^{1} K^{2}(t)\, \mathrm{d} t.$$
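As a side illustration (not part of the proof), these constants are simple to evaluate numerically; a minimal sketch, assuming the Epanechnikov kernel \(K(t)=\frac{3}{4}(1-t^{2})\) on \([-1,1]\) and its antiderivative \(W\):

```python
# Numerical check of K_2 and W_2 for one admissible kernel.
# The Epanechnikov choice is an illustrative assumption; the proof
# only requires a symmetric density K on [-1, 1] with antiderivative W.
from scipy.integrate import quad

def K(t):
    return 0.75 * (1.0 - t * t)              # kernel density

def W(t):
    return 0.25 * (2.0 + 3.0 * t - t ** 3)   # integral of K, W(-1)=0, W(1)=1

K2, _ = quad(lambda t: K(t) ** 2, -1, 1)     # exact value 3/5
W2, _ = quad(lambda t: W(t) ** 2, -1, 1)     # exact value 26/35
print(K2, W2)                                # 0.6  0.742857...
```

For this kernel, \(K_2 = 3/5\) and \(W_2 = 26/35\).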

Then, for the first term, we have

$$\operatorname{Var}\left(\widehat{F}_{1}\right)=\operatorname{Var}[\widehat{F}(x-l h)]+l^{2} h^{2} \operatorname{Var} [\widehat{f}(x-l h) ]+2 lh \operatorname{Cov}[\widehat{F}(x-l h), \widehat{f}(x-l h)].$$

Now,

$$\begin{array}{l}{\operatorname{Var}[\widehat{F}(x-l h)]} \\ {\quad=\frac{1}{n}\left[E\left\{W^{2}\left(\frac{x-l h-y}{h}\right)\right\}-E^{2}\left\{W\left(\frac{x-l h-y}{h}\right)\right\}\right]} \\ {=\frac{1}{n} F(x)\{1-F(x)\}-\frac{h}{n} f(x)\left\{1+l-W_{2}-2 l F(x)\right\}+O\left(h^{2}\right)}\end{array}$$

because

$$E\left\{W^{2}\left(\frac{x-l h-y}{h}\right)\right\}=F(x)-(1+l) h f(x)+h f(x) W_{2}+O\left(h^{2}\right)$$

and

$$E^{2}\left\{W\left(\frac{x-lh-y}{h}\right)\right\}=F^{2}(x)-2 lhf(x) F(x)+O\left(h^{2}\right).$$

Also, it can be easily shown that

$$\operatorname{Var}[\widehat{f}(x-l h)]=\frac{1}{n h} f(x) K_{2}+O\left(h^{2}\right),$$

and

$$\operatorname{Cov}[\widehat{F}(x-l h), \widehat{f}(x-l h)]=\frac{1}{n} f(x)\left\{\frac{1}{2}-F(x)\right\}+O(h).$$
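Here the constant \(\frac{1}{2}\) arises from the identity (using \(W' = K\))

$$\int\limits_{-1}^{1} W(t) K(t)\, \mathrm{d} t=\left[\frac{1}{2} W^{2}(t)\right]_{-1}^{1}=\frac{1}{2}.$$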

Therefore,

$$\operatorname{Var}\left(\widehat{F}_{1}\right)=\frac{1}{n} F(x)\{1-F(x)\}+\frac{h}{n} f(x)\left\{W_{2}+l^{2} K_{2}-1\right\}+O\left(h^{2}\right).$$

Similar computations give

$$\operatorname{Var}\left(\widehat{F}_{2}\right)=\frac{1}{n} F(x)\{1-F(x)\}+\frac{h}{n} f(x)\left\{W_{2}+l^{2} K_{2}-1\right\}+O\left(h^{2}\right)$$

since

$$\begin{array}{l}\operatorname{Var}[\widehat{F}(x+l h)]\\ \quad=\frac{1}{n}\left[E\left\{W^{2}\left(\frac{x+l h-y}{h}\right)\right\}-E^{2}\left\{W\left(\frac{x+l h-y}{h}\right)\right\}\right]\\ \quad=\frac{1}{n} F(x)\{1-F(x)\}+\frac{h}{n} f(x)\left\{l-1+W_{2}-2 l F(x)\right\}+O\left(h^{2}\right).\end{array}$$

Now, we have

$$\begin{array}{l}\operatorname{Cov}\left(\widehat{F}_{1}, \widehat{F}\right)\\ \quad=\frac{1}{n} F(x)\{1-F(x)\}+\frac{h}{n} f(x)\left\{\int W(t-l) W(t)\, \mathrm{d} t+l \int K(t-l) W(t)\, \mathrm{d} t-1-l\right\}+O\left(h^{2}\right),\end{array}$$
$$\begin{array}{l}\operatorname{Cov}\left(\widehat{F}_{2}, \widehat{F}\right)\\ \quad=\frac{1}{n} F(x)\{1-F(x)\}+\frac{h}{n} f(x)\left\{\int W(t+l) W(t) \,\mathrm{d} t-l \int K(t+l) W(t)\, \mathrm{d} t-1\right\}+O\left(h^{2}\right),\end{array}$$

and

$$\begin{array}{l}\operatorname{Cov}\left(\widehat{F}_{1}, \widehat{F}_{2}\right)\\ \quad=\frac{1}{n} F(x)\{1-F(x)\}+\frac{h}{n} f(x)\left\{\int W(t-l) W(t+l)\, \mathrm{d} t+l \int K(t-l) W(t+l)\, \mathrm{d} t\right.\\ \quad\quad\left.-\,l \int W(t-l) K(t+l)\, \mathrm{d} t-l^{2} \int K(t-l) K(t+l)\, \mathrm{d} t-1-l\right\}+O\left(h^{2}\right).\end{array}$$

By adding up all these terms we have the desired result for the variance.
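Collecting the displays above makes the bookkeeping explicit. Writing \(C_1\), \(C_1'\), and \(C_2\) for the braced \(h\)-coefficients in \(\operatorname{Cov}(\widehat{F}_1,\widehat{F})\), \(\operatorname{Cov}(\widehat{F}_2,\widehat{F})\), and \(\operatorname{Cov}(\widehat{F}_1,\widehat{F}_2)\), respectively (a shorthand introduced here only for this summary), the expansion takes the form

$$\operatorname{Var}[\tilde{F}]=\frac{1}{n} F(x)\{1-F(x)\}+\frac{h}{n}\, f(x)\, \frac{2 \lambda^{2}\left(W_{2}+l^{2} K_{2}-1\right)+\left(W_{2}-1\right)+2 \lambda\left(C_{1}+C_{1}^{\prime}\right)+2 \lambda^{2} C_{2}}{(2 \lambda+1)^{2}}+O\left(h^{2}\right).$$

In particular, the leading term is exactly \(\frac{1}{n} F(x)\{1-F(x)\}\), since the weights satisfy \(2\lambda^{2}+1+4\lambda+2\lambda^{2}=(2\lambda+1)^{2}\); only the \(O(h/n)\) term changes, which is the sense in which the variance remains of the same order as that of \(\widehat{F}\).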


About this article

Cite this article

Kim, C., Kim, S., Park, M. et al. A bias reducing technique in kernel distribution function estimation. Computational Statistics 21, 589–601 (2006). https://doi.org/10.1007/s00180-006-0016-x
