
Consistency of a nonparametric conditional mode estimator for random fields


Abstract

Given a stationary multidimensional spatial process \(\left\{ Z_{\mathbf{i}}=\left( X_{\mathbf{i}}, Y_{\mathbf{i}}\right) \in \mathbb R ^d \times \mathbb R ,\ \mathbf{i}\in \mathbb Z ^{N}\right\} \), we investigate a kernel estimate of the spatial conditional mode function of the response variable \(Y_{\mathbf{i}}\) given the explanatory variable \(X_{\mathbf{i}}\). Consistency in \(L^p\) norm and strong convergence of the kernel estimate are obtained when the sample considered is an \(\alpha \)-mixing sequence. An application to real data is given in order to illustrate the behavior of our methodology.


References

  • Allard D, D’Or D, Froidevaux R (2011) An efficient maximum entropy approach for categorical variable prediction. Eur J Soil Sci 62(3):381–393

  • Anselin L, Florax RJGM (1995) New directions in spatial econometrics. Springer, Berlin

  • Biau G, Cadre B (2004) Nonparametric spatial prediction. Stat Inference Stoch Process 7:327–349

  • Carbon M, Hallin M, Tran LT (1996) Kernel density estimation for random fields. Stat Probab Lett 36:115–125

  • Carbon M, Tran LT, Wu B (1997) Kernel density estimation for random fields: the \(L_{1}\) theory. J Nonparametric Stat 6:157–170

  • Carbon M, Francq C, Tran LT (2007) Kernel regression estimation for random fields. J Stat Plan Inference 137(3):778–798

  • Chilès JP, Delfiner P (1999) Geostatistics: modeling spatial uncertainty. Wiley, New York

  • Collomb G, Härdle W, Hassani S (1987) A note on prediction via conditional mode estimation. J Stat Plan Inference 15:227–236

  • Cressie NAC (1991) Statistics for spatial data. Wiley, New York

  • Dabo-Niang S, Yao AF (2007) Kernel regression estimation for continuous spatial processes. Math Methods Stat 16:298–317

  • Dabo-Niang S, Yao AF, Pischedda L, Cuny P, Gilbert F (2009) Spatial mode estimation for functional random fields with application to bioturbation problem. Stoch Environ Res Risk Assess 24(4):487–497

  • Devroye L (1981) On the almost everywhere convergence of nonparametric regression function estimates. Ann Stat 9:1310–1319

  • Doukhan P (1994) Mixing: properties and examples. Lecture Notes in Statistics. Springer, New York

  • Ferraty F, Rabhi A, Vieu P (2005) Conditional quantiles for dependent functional data with application to the climatic El Niño phenomenon. Sankhyā 67:378–399

  • Gannoun A, Saracco J, Yu K (2003) Nonparametric prediction by conditional median and quantiles. J Stat Plan Inference 117(2):207–223

  • Gao J, Lu Z, Tjøstheim D (2008) Moment inequalities for spatial processes. Stat Probab Lett 78:687–697

  • Goovaerts P (1997) Geostatistics for natural resources evaluation. Oxford University Press, Oxford

  • Goovaerts P (1998) Ordinary cokriging revisited. Math Geol 30(1):21–42

  • Guyon X (1995) Random fields on a network: modeling, statistics, and applications. Springer, New York

  • Hall P, Racine JS, Li Q (2004) Cross-validation and the estimation of conditional probability densities. J Am Stat Assoc 99:1015–1026

  • Hall P, Müller HG, Wu PS (2006) Real-time density and mode estimation with application to time-dynamic mode tracking. J Comput Graph Stat 15(1):82–100

  • Hallin M, Lu Z, Tran LT (2004a) Kernel density estimation for spatial processes: the \(L_{1}\) theory. J Multivar Anal 88(1):61–75

  • Hallin M, Lu Z, Tran LT (2004b) Local linear spatial regression. Ann Stat 32(6):2469–2500

  • Hallin M, Lu Z, Yu K (2009) Local linear spatial quantile regression. Bernoulli 15(3):659–686

  • Koenker R, Mizera I (2004) Penalized triograms: total variation regularization for bivariate smoothing. J R Stat Soc Ser B 66:145–164

  • Lahiri SN (2006) Resampling methods for spatial regression models under a class of stochastic designs. Ann Stat 34(4):1774–1813

  • Li Q, Racine JS (2004) Cross-validated local linear nonparametric regression. Stat Sin 14:485–512

  • Louani D, Ould-Saïd E (1999) Asymptotic normality of kernel estimators of the conditional mode under strong mixing hypothesis. J Nonparametric Stat 11:413–442

  • Lu Z, Chen X (2002) Spatial nonparametric regression estimation: non-isotropic case. Acta Math Appl Sin Engl Ser 18(4):641–656

  • Lu Z, Chen X (2004) Spatial kernel regression estimation: weak consistency. Stat Probab Lett 68:125–136

  • Ould Abdi S, Dabo-Niang S, Diop A (2010a) Consistency of a nonparametric conditional quantile estimator for random fields. Math Methods Stat 19:1–21

  • Ould Abdi A, Diop A, Dabo-Niang S (2010b) Estimation non paramétrique du mode conditionnel dans le cas spatial. C R Acad Sci Paris Ser I 348:815–819

  • Ould-Saïd E (1997) A note on ergodic processes prediction via estimation of the conditional mode function. Scand J Stat 24:231–239

  • Quintela del Río A, Vieu P (1997) A nonparametric conditional mode estimate. J Nonparametric Stat 8:253–266

  • Racine JS, Li Q (2004) Nonparametric estimation of regression functions with both categorical and continuous data. J Econom 119:99–130

  • Ripley BD (1981) Spatial statistics. Wiley, New York

  • Salha R, Ahmed HS (2009) On the kernel estimation of the conditional mode. Asian J Math Stat 2(1):1–8

  • Salha R, Ioannides D (2004) Joint asymptotic distribution of the estimated conditional mode at a finite number of distinct points. In: Proceedings of the national statistical conference, April 14–18, Lefkada, Greece, pp 587–594

  • Sayers B, Mansourian BG, Phan Tan T (1977) A pattern-analysis study of a wild-life rabies epizootic. Med Inform 2:11–34

  • Tran LT (1990) Kernel density estimation on random fields. J Multivar Anal 34:37–53

  • Youndjé É (1993) Estimation non paramétrique de la densité conditionnelle par la méthode du noyau. Thèse de doctorat, Université de Rouen

  • Yu K, Jones MC (1998) Local linear quantile regression. J Am Stat Assoc 93:228–237


Acknowledgments

These results were obtained thanks to the support of AIRES-Sud, a programme from the French Ministry of Foreign and European Affairs implemented by the “Institut de Recherche pour le Développement (IRD-DSF)”. The authors acknowledge grants from the “Ministère de la Recherche Scientifique” of Senegal. The authors are indebted to the anonymous referees for their helpful comments and suggestions.


Appendix

In this section, we establish the main results and give the necessary technical lemmas with their proofs.

1.1 Proofs of technical lemmas

The following two technical lemmas are given in Tran (1990) and Carbon et al. (1997), respectively; their proofs are therefore omitted.

Lemma 1

(i) Suppose that (4) holds. Denote by \(\mathcal L _r(\mathcal F )\) the class of \(\mathcal F \)-measurable r.v.’s \(X\) satisfying \(\Vert X\Vert _r=(E|X|^r)^{1/r}<\infty \). Suppose \(X \in \mathcal L _r(\mathcal B (E))\) and \(Y \in \mathcal L _s(\mathcal B (E'))\). Assume also that \(1\le r,\ s,\ t<\infty \) and \(r^{-1}+s^{-1}+t^{-1}=1\). Then

$$\begin{aligned} |EXY-EXEY|\le C\Vert X\Vert _r \Vert Y\Vert _s \{\psi ({\textit{Card}}(E),{\textit{Card}}(E'))\varphi ({\textit{dist}}(E,E'))\}^{1/t}.\nonumber \\ \end{aligned}$$
(11)

(ii) For r.v.’s bounded with probability 1, the right-hand side of (11) can be replaced by \(C\psi (\textit{Card}(E),\textit{Card}(E'))\varphi (\textit{dist}(E,E'))\).

Lemma 2

Let \(S_1,S_2,\ldots ,S_k\) be sets containing \(m\) sites each, with \(\textit{dist}(S_i,S_j)\ge \delta \) for all \(i\ne j\), where \(1\le i,j\le k\). Let \(W_1,W_2,\ldots ,W_k\) be a sequence of real-valued r.v.’s taking values in \([a,b]\) and defined on some probability space \((\Omega ,\mathcal A ,\mathbf{P})\), with \(W_1,W_2,\ldots ,W_k\) measurable with respect to \(\mathcal B (S_1),\mathcal B (S_2),\ldots ,\mathcal B (S_k)\), respectively. Then there exists a sequence of independent r.v.’s \(W_1^*,W_2^*,\ldots ,W_k^*\) such that \(W_i^*\) has the same distribution as \(W_i\) and satisfies

$$\begin{aligned} \sum _{i=1}^k E\left| W_i-W_i^*\right| \le 2k(b-a)\psi \left( (k-1)m,m\right) \varphi (\delta ). \end{aligned}$$

The proof of Theorem 1 is based on the following lemma.

Lemma 3

Under the conditions of Theorem 1, we have, for any compact subset \(\mathcal C \) of \(\mathbb R \),

$$\begin{aligned} \sup _{y\in \mathcal C }\left| \widehat{f^x}(y)-f^x(y)\right| \stackrel{a.co}{\rightarrow }0. \end{aligned}$$

In order to establish Lemma 3, we introduce some notation and state three technical lemmas. For \(y\in \mathbb R \), let

$$\begin{aligned}&V_\mathbf{i}(x,y) = \frac{1}{\widehat{\mathbf{n}}h_K^d h_H}K\left( \frac{x-X_\mathbf{i}}{h_K}\right) H\left( \frac{y-Y_\mathbf{i}}{h_H}\right) ,\\&\varDelta _\mathbf{i}(x,y)=V_\mathbf{i}(x,y)-EV_\mathbf{i}(x,y) \end{aligned}$$

and

$$\begin{aligned} S_\mathbf{n}(x,y)&= \sum _{\mathbf{i}\in \mathcal I _\mathbf{n}}\varDelta _\mathbf{i}(x,y)=f_\mathbf{n}(x,y)-Ef_\mathbf{n}(x,y),\\ I_\mathbf{n}(x,y)&= \sum _{\mathbf{i}\in \mathcal I _\mathbf{n}}E(\varDelta _\mathbf{i}(x,y))^2, R_\mathbf{n}(x,y)= \sum _{\underset{\mathbf{i},\mathbf{j}\in \mathcal I _\mathbf{n}}{\mathbf{i} \ne \mathbf{j}}}E\left| \varDelta _\mathbf{i}(x,y) \varDelta _\mathbf{j}(x,y)\right| . \end{aligned}$$
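These quantities are the building blocks of everything below: \(f_\mathbf{n}(x,y)=\sum _{\mathbf{i}\in \mathcal I _\mathbf{n}}V_\mathbf{i}(x,y)\) estimates the joint density, \(\widehat{f}(x)\) the marginal density of \(X\), \(\widehat{f^x}(y)=f_\mathbf{n}(x,y)/\widehat{f}(x)\) the conditional density, and the mode estimate \(\widehat{\theta }\) maximizes \(\widehat{f^x}\). The following minimal numerical sketch illustrates these objects; all concrete choices in it are hypothetical (Gaussian kernels for \(K\) and \(H\), left unnormalized since a constant factor does not affect the argmax; i.i.d. data standing in for the mixing random field; a grid search standing in for the supremum defining \(\widehat{\theta }\)):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data on a 20 x 20 grid of sites (N = 2), with d = 1.
# For illustration only: the field is generated i.i.d. rather than mixing.
n1, n2 = 20, 20
n_hat = n1 * n2                                  # \widehat{n} = n_1 * n_2
X = rng.normal(size=(n_hat, 1))                  # explanatory field X_i in R^d
Y = X[:, 0] + 0.3 * rng.normal(size=n_hat)       # response field Y_i in R

h_K, h_H = 0.4, 0.3                              # bandwidths h_K and h_H
K = lambda u: np.exp(-0.5 * np.sum(u**2, axis=-1))   # kernel K on R^d
H = lambda v: np.exp(-0.5 * v**2)                    # kernel H on R
d = X.shape[1]

def f_n(x, y):
    """Joint kernel estimate f_n(x, y) = sum_i V_i(x, y)."""
    return np.sum(K((x - X) / h_K) * H((y - Y) / h_H)) / (n_hat * h_K**d * h_H)

def f_hat(x):
    """Marginal kernel estimate of f_X(x)."""
    return np.sum(K((x - X) / h_K)) / (n_hat * h_K**d)

def theta_hat(x, grid):
    """Conditional mode estimate: argmax over a grid of f_n(x, .) / f_hat(x)."""
    cond = np.array([f_n(x, y) for y in grid]) / f_hat(x)
    return grid[np.argmax(cond)]

x0 = np.array([0.5])
print(theta_hat(x0, np.linspace(-3.0, 3.0, 301)))  # close to 0.5 under this model
```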

Lemma 4

Under the conditions of Lemma 3, we have

$$\begin{aligned} \displaystyle I_\mathbf{n}(x,y)+ R_\mathbf{n}(x,y)=O\left( \frac{1}{\widehat{\mathbf{n}}h_K^dh_H}\right) . \end{aligned}$$

Lemma 5

Under the conditions of Lemma 3, we have

$$\begin{aligned} \sup _{y \in \mathcal C }|f_{\mathbf{n}}(x,y)-f_{X,Y}(x,y)|\stackrel{a.co}{\rightarrow } 0. \end{aligned}$$

Lemma 6

If the conditions of Lemma 3 are satisfied, then

$$\begin{aligned} \displaystyle \widehat{f}(x)\stackrel{a.co}{\rightarrow } f_X(x). \end{aligned}$$

Proof of Lemma 4

We have

$$\begin{aligned} \widehat{\mathbf{n}} h_K^dh_H I_{\mathbf{n}}(x,y) = \widehat{\mathbf{n}} h_K^dh_H \sum _{{\mathbf{i}}\in \mathcal I _ {\mathbf{n}}}\left( E V_\mathbf{i}^2(x,y)-E^2V_\mathbf{i}(x,y)\right) . \end{aligned}$$

First, remark that

$$\begin{aligned}&\widehat{\mathbf{n}}h_K^dh_H\sum _{{\mathbf{i}}\in \mathcal I _ {\mathbf{n}}}E V_\mathbf{i}^2(x,y)\\&\quad =h_K^{-d}h_H^{-1}\int \limits _\mathbb{R ^{d+1}}K^2 \left( \frac{x-z}{h_K}\right) H^2\left( \frac{y-v}{h_H}\right) f_{X,Y}(z,v)dzdv\\&\quad =\int \limits _\mathbb{R ^{d+1}}K^2\left( z\right) H^2\left( v\right) f_{X,Y}(x-h_Kz,y-h_Hv)dzdv. \end{aligned}$$

By assumption \(H_0\) and the Lebesgue dominated convergence theorem, this last integral converges to \(f_{X,Y}(x,y)\int _\mathbb{R ^{d+1}}K^2 \left( z\right) H^2\left( v\right) dzdv\). Next, notice that

$$\begin{aligned}&\widehat{\mathbf{n}} h_K^d h_H\sum _{{\mathbf{i}}\in \mathcal I _{\mathbf{n}}}E^2V_{\mathbf{i}}(x,y)\\&\quad =h_K^{-d}h_H^{-1}\left( \int \limits _\mathbb{R ^{d+1}} K\left( \frac{x-z}{h_K}\right) H\left( \frac{y-v}{h_H}\right) f_{X,Y}(z,v)dzdv\right) ^2. \end{aligned}$$

By the usual change of variables, we obtain

$$\begin{aligned}&\widehat{\mathbf{n}} h_K^d h_H\sum _{\mathbf{i}\in \mathcal I _ {\mathbf{n}}}E^2 V_{\mathbf{i}}(x,y)\\&\quad = h_K^dh_H \left( \int \limits _\mathbb{R ^{d+1}}K(z)H(v)f_{X,Y}(x-h_Kz,y-h_Hv)dzdv \right) ^2. \end{aligned}$$

This last term tends to 0 by \(H_0\) and the Lebesgue dominated convergence theorem. Let us now prove that for \(\mathbf{n}\) large enough, there exists \(C\) such that \(\widehat{\mathbf{n}} h_K^dh_H R_{\mathbf{n}}(x,y)< C.\) Let \(\displaystyle S=\{(\mathbf{i},\mathbf{j}):\ {\textit{dist}}(\mathbf{i},\mathbf{j})\le s_{\mathbf{n}}\}\), where \(s_{\mathbf{n}}\) is a real sequence that converges to infinity and will be specified later. We have \( R_{\mathbf{n}}(x,y) = R_{\mathbf{n}}^1(x,y)+ R_{\mathbf{n}}^2(x,y),\) with

$$\begin{aligned} R_{\mathbf{n}}^1(x,y)=\sum _{{\mathbf{i}},{\mathbf{j}}\in S}\left| E\varDelta _{\mathbf{i}}(x,y) \varDelta _{\mathbf{j}}(x,y)\right| \end{aligned}$$

and

$$\begin{aligned} R_{\mathbf{n}}^2(x,y)=\sum _{{\mathbf{i}},{\mathbf{j}}\in S^c}\left| E\varDelta _{\mathbf{i}}(x,y) \varDelta _{\mathbf{j}}(x,y)\right| , \end{aligned}$$

where \(S^c\) stands for the complement of \(S\). Now, by a change of variables, \(H_3\) and the Lebesgue dominated convergence theorem, we get

$$\begin{aligned}&E\left[ \left| H\left( \frac{y-Y_\mathbf{i}}{h_H}\right) \right| \left| H\left( \frac{y-Y_\mathbf{j}}{h_H}\right) \right| |(X_\mathbf{i},X_\mathbf{j})\right] \\&\quad =\int \limits _\mathbb{R ^2} H\left( \frac{y-t}{h_H}\right) H\left( \frac{y-s}{h_H}\right) f^{(X_\mathbf{i},X_\mathbf{j})}(t,s)dtds\\&\quad = h_H^2\int \limits _\mathbb{R ^2}H(t)H(s)f^{(X_\mathbf{i},X_\mathbf{j})}(y-h_Ht,y-h_Hs)dtds\\&\quad = O\left( h_H^2\right) . \end{aligned}$$

Similarly, we have

$$\begin{aligned} E\left[ \left| H\left( \frac{y-Y_\mathbf{i}}{h_H}\right) \right| |X_\mathbf{i}\right] =h_H\int \limits _\mathbb{R }H(t)f^{X_\mathbf{i}}(y-h_Ht)dt=O\left( h_H\right) . \end{aligned}$$

In addition, by (8), we get

$$\begin{aligned} \displaystyle EK_\mathbf{i}K_\mathbf{j}=O\left( h_K^{2d}\right) \quad \hbox {and}\quad \displaystyle EK_\mathbf{i}=O\left( h_K^d\right) . \end{aligned}$$

Let us consider \(R_{\mathbf{n}}^1(x,y)\). We have

$$\begin{aligned} \left| E\varDelta _{\mathbf{i}}(x,y) \varDelta _{\mathbf{j}}(x,y)\right|&= \left| EV_\mathbf{i}(x,y)V_\mathbf{j}(x,y)-EV_\mathbf{i}(x,y)EV_\mathbf{j}(x,y)\right| \\&\le E\left[ E \left| V_\mathbf{i}(x,y)V_\mathbf{j}(x,y)\right| |(X_\mathbf{i},X_\mathbf{j})\right] +\left( E\left[ E |V_\mathbf{i}(x,y)||X_\mathbf{i}\right] \right) ^2\\&\le \widehat{\mathbf{n}}^{\,-2}h_K^{-2d}h_H^{-2} EK_\mathbf{i}K_\mathbf{j} E\left[ \left| H\left( \frac{y-Y_\mathbf{i}}{h_H}\right) \right| \left| H\left( \frac{y-Y_\mathbf{j}}{h_H}\right) \right| \right. \\&\left. |(X_\mathbf{i},X_\mathbf{j})\right] +\widehat{\mathbf{n}}^{-2}h_K^{-2d}h_H^{-2} \left( EK_\mathbf{i}E\left[ \left| H\left( \frac{y-Y_\mathbf{i}}{h_H}\right) \right| |X_\mathbf{i}\right] \right) ^2\\&\le C\widehat{\mathbf{n}}^{-2}. \end{aligned}$$

Then

$$\begin{aligned} \widehat{\mathbf{n}} h_K^dh_H R_{\mathbf{n}}^1 (x,y) \le \widehat{\mathbf{n}}^{-1}h_K^dh_H\sum _{{\mathbf{i}},{\mathbf{j}}\in S}1\le Ch_K^dh_H s_{\mathbf{n}}^N. \end{aligned}$$

Let us now bound \(R_{\mathbf{n}}^2 (x,y)\). Since \(K\) and \(H\) are bounded, applying Lemma 1 (ii) we have

$$\begin{aligned} \left| E\varDelta _{\mathbf{i}}(x,y) \varDelta _{\mathbf{j}}(x,y)\right| \le C\widehat{\mathbf{n}}^{-2}h_K^{-2d}h_H^{-2}\psi (1,1)\varphi (\Vert \mathbf{i}-\mathbf{j}\Vert ). \end{aligned}$$

Then, we obtain that

$$\begin{aligned} \widehat{\mathbf{n}}h_K^dh_H R_{\mathbf{n}}^2(x,y)&\le C\widehat{\mathbf{n}}^{-1}h_K^{-d}h_H^{-1} \sum _{\mathbf{i},\mathbf{j}\in S^c}\psi (1,1)\varphi (\Vert \mathbf{i}-\mathbf{j}\Vert )\\&\le Ch_K^{-d}h_H^{-1}s_{\mathbf{n}}^{-N}\sum _{\Vert \mathbf{i}\Vert >s_{\mathbf{n}}}\Vert \mathbf{i}\Vert ^N\varphi (\Vert \mathbf{i}\Vert )\\&\le Ch_K^{-d}h_H^{-1}s_{\mathbf{n}}^{-N}\sum _{\Vert \mathbf{i}\Vert >s_{\mathbf{n}}}\Vert \mathbf{i}\Vert ^{N-\mu }. \end{aligned}$$

As \(\mu >N+1\), the choice \(s_{\mathbf{n}}=\left( h_K^dh_H\right) ^{-1/N}\) gives the desired result and completes the proof. \(\square \)
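For completeness, here is the computation behind this choice of \(s_{\mathbf{n}}\) (a sketch: since the number of sites \(\mathbf{i}\in \mathbb Z ^N\) with \(\Vert \mathbf{i}\Vert \approx t\) is of order \(t^{N-1}\), the tail sum \(\sum _{\Vert \mathbf{i}\Vert >s_{\mathbf{n}}}\Vert \mathbf{i}\Vert ^{N-\mu }\) is \(O(s_{\mathbf{n}}^{2N-\mu })\), which converges as soon as \(\mu >2N\), a requirement implied by the stronger mixing conditions \(H_6\)–\(H_7\) invoked later):

$$\begin{aligned} \widehat{\mathbf{n}} h_K^dh_H R_{\mathbf{n}}^1(x,y)&\le Ch_K^dh_H s_{\mathbf{n}}^N=C,\\ \widehat{\mathbf{n}} h_K^dh_H R_{\mathbf{n}}^2(x,y)&\le Ch_K^{-d}h_H^{-1}s_{\mathbf{n}}^{-N}\cdot s_{\mathbf{n}}^{2N-\mu } =C\left( h_K^dh_H\right) ^{\frac{\mu -2N}{N}}=O(1). \end{aligned}$$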

Proof of Lemma 5

Remark that

$$\begin{aligned}&\sup _{y\in \mathcal C }|f_{\mathbf{n}}(x,y)-f_{X,Y}(x,y)|\\&\quad \le \sup _{y\in \mathcal C }|f_{\mathbf{n}}(x,y)-E f_{\mathbf{n}}(x,y)|+\sup _{y\in \mathcal C }|E f_{\mathbf{n}}(x,y)-f_{X,Y}(x,y)|. \end{aligned}$$

The asymptotic behavior of the bias term is standard, in the sense that it is not affected by the dependence structure of the data. We have

$$\begin{aligned}&\sup _{y \in \mathcal C }|Ef_{\mathbf{n}}(x,y)-f_{X,Y}(x,y)|\\&\quad =\sup _{y \in \mathcal C }\left| \frac{1}{h_K^d h_H}\int \limits _\mathbb{R ^{d+1}}K\left( \frac{x-u}{h_K}\right) H\left( \frac{y-v}{h_H}\right) f_{X,Y}(u,v)dudv -f_{X,Y}(x,y)\right| \\&\quad =\sup _{y \in \mathcal C }\left| \,\int \limits _\mathbb{R ^{d+1}}K(t)H(s)\left[ f_{X,Y} (x-h_Kt,y-h_Hs)-f_{X,Y}(x,y)\right] dtds\right| \\&\quad \le \int \limits _\mathbb{R ^{d+1}}K(t)H(s)\sup _{y \in \mathcal C }\left| f_{X,Y}(x-h_Kt,y-h_Hs)-f_{X,Y}(x,y)\right| dtds. \end{aligned}$$

This last term goes to zero by \(H_0\) and the Lebesgue dominated convergence theorem. The proof of the almost complete convergence of \( U_{1\mathbf{n}}(x)=\sup _{y\in \mathcal C }|f_{\mathbf{n}}(x,y)-E f_{\mathbf{n}}(x,y)|\) is similar to that of Theorem 3.3 of Carbon et al. (1997) or Lemma 3.2 of Dabo-Niang and Yao (2007). For the sake of completeness, we present it in full. Let us now introduce the spatial block decomposition used by Tran (1990) and Carbon et al. (1997). Without loss of generality, assume that \(n_i=2pq_i\) for \(1\le i\le N\). The random variables \(\varDelta _\mathbf{i}(x,y)\) can be grouped into \(2^Nq_1\ldots q_N\) cubic blocks of side \(p\). Denote

$$\begin{aligned} U(1,\mathbf{n},\mathbf{j})&= \sum _{\underset{k=1,\ldots ,N}{i_k=2j_kp+1}}^{(2j_k+1)p}\varDelta _\mathbf{i}(x,y),\\ U(2,\mathbf{n},\mathbf{j})&= \sum _{\underset{k=1,\ldots ,N-1}{i_k=2j_kp+1}}^{(2j_k+1)p} \sum _{i_N=(2j_N+1)p+1}^{2(j_N+1)p}\varDelta _\mathbf{i}(x,y),\\ U(3,\mathbf{n},\mathbf{j})&= \sum _{\underset{k=1,\ldots ,N-2}{i_k=2j_kp+1}}^{(2j_k+1)p} \sum _{i_{N-1}=(2j_{N-1}+1)p+1}^{2(j_{N-1}+1)p} \sum _{i_N=2j_Np+1}^{(2j_N+1)p}\varDelta _\mathbf{i}(x,y),\\ U(4,\mathbf{n},\mathbf{j})&= \sum _{\underset{k=1,\ldots ,N-2}{i_k=2j_kp+1}}^{(2j_k+1)p} \sum _{i_{N-1}=(2j_{N-1}+1)p+1}^{2(j_{N-1}+1)p} \sum _{i_N=(2j_N+1)p+1}^{2(j_N+1)p}\varDelta _\mathbf{i}(x,y), \end{aligned}$$

and so on. Note that

$$\begin{aligned} U(2^{N-1},\mathbf{n},\mathbf{j})&= \sum _{\underset{k=1,\ldots , N-1}{i_k=(2j_k+1)p+1}}^{2(j_k+1)p} \sum _{i_N=2j_Np+1}^{(2j_N+1)p}\varDelta _\mathbf{i}(x,y). \end{aligned}$$

Finally,

$$\begin{aligned} U(2^N,\mathbf{n},\mathbf{j})&= \sum _{\underset{k=1,\ldots , N}{i_k=(2j_k+1)p+1}}^{2(j_k+1)p} \varDelta _\mathbf{i}(x,y). \end{aligned}$$

For each integer \(1\le i\le 2^N\), define

$$\begin{aligned} T(\mathbf{n},i)=\sum _{\underset{k=1,\ldots ,N}{j_k=0}}^{q_k-1}U(i,\mathbf{n},\mathbf{j}). \end{aligned}$$

Clearly

$$\begin{aligned} S_\mathbf{n}(x,y)=\sum _{i=1}^{2^N}T(\mathbf{n},i). \end{aligned}$$
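To fix ideas, in the simplest case \(N=1\) (so \(n=2pq\) and there are \(2^N=2\) families of blocks), the decomposition reads

$$\begin{aligned} U(1,\mathbf{n},j)=\sum _{i=2jp+1}^{(2j+1)p}\varDelta _i(x,y),\qquad U(2,\mathbf{n},j)=\sum _{i=(2j+1)p+1}^{2(j+1)p}\varDelta _i(x,y), \end{aligned}$$

so that \(S_\mathbf{n}(x,y)=\sum _{j=0}^{q-1}U(1,\mathbf{n},j)+\sum _{j=0}^{q-1}U(2,\mathbf{n},j)\): within each family, consecutive blocks are separated by at least \(p\) sites, which is exactly what allows Lemma 2 to be applied below.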

Observe that, for any \(\varepsilon >0\)

$$\begin{aligned} P\left( |S_\mathbf{n}(x,y)|>\varepsilon \right)&= P\left( \left| \sum _{i=1}^{2^N}T(\mathbf{n},i)\right| >\varepsilon \right) \nonumber \\&\le 2^NP\left( \left| T(\mathbf{n},1)\right| >\varepsilon /2^N\right) . \end{aligned}$$
(12)

We enumerate in an arbitrary way the \(\widehat{q}=q_1\ldots q_N\) terms \(U(1,\mathbf{n},\mathbf{j})\) of the sum \(T(\mathbf{n},1)\) that we call \(W_1,\ldots ,W_{\widehat{q}}\). Note that \(U(1,\mathbf{n},\mathbf{j})\) is measurable with respect to the \(\sigma \)-field generated by \(V_\mathbf{i}(x,y)\), with \(\mathbf{i}\) such that \(2j_kp+1\le i_k\le (2j_k+1)p, k=1,\ldots , N\).

These sets of sites are separated by a distance of at least \(p\), and since \(K\) and \(H\) are bounded, we have, for all \(i=1,\ldots ,\widehat{q}\),

$$\begin{aligned} |W_i|\le C(\widehat{\mathbf{n}} h_K^dh_H)^{-1}p^N\Vert K\Vert _{\infty }\Vert H\Vert _{\infty }. \end{aligned}$$

Lemma 2 ensures that there exist independent random variables \(W_1^*,\ldots ,W_{\widehat{q}}^*\) such that,

$$\begin{aligned} \sum _{i=1}^{\widehat{q}} E|W_i-W_i^*|\le C \widehat{q} (\widehat{\mathbf{n}} h_K^dh_H)^{-1}p^N\Vert K\Vert _{\infty }\Vert H\Vert _{\infty }\psi (\widehat{\mathbf{n}},p^N)\varphi (p). \end{aligned}$$

Markov’s inequality leads to

$$\begin{aligned} P\left( \sum _{i=1}^{\widehat{q}}|W_i-W_i^*|>\varepsilon /2^{N+1}\right) \le C2^{N+1}(\widehat{\mathbf{n}} h_K^dh_H)^{-1}p^N\widehat{q}\psi (\widehat{\mathbf{n}},p^N)\varepsilon ^{-1}\varphi (p).\nonumber \\ \end{aligned}$$
(13)

By Bernstein’s inequality, we have

$$\begin{aligned} P\left( |\sum _{i=1}^{\widehat{q}}W_i^*|\!>\!\varepsilon /2^{N+1}\right) \!\le \! 2\exp \left\{ \frac{-\varepsilon ^2/(2^{N+1})^2}{4\sum _{i=1}^{\widehat{q}}E W_i^{*2}\!+\!2C(\widehat{\mathbf{n}} h_K^dh_H)^{-1}p^N\varepsilon /2^{N+1}}\right\} .\nonumber \\ \end{aligned}$$
(14)

Combining (12), (13) and (14), we get

$$\begin{aligned}&P\left( |S_\mathbf{n}(x,y)|>\varepsilon \right) \\&\quad \le 2^NP\left( \sum _{i=1}^{\widehat{q}}|W_i-W_i^*|> \varepsilon /2^{N+1}\right) +2^NP \left( \left| \sum _{i=1}^{\widehat{q}}W_i^*\right| >\varepsilon /2^{N+1}\right) \\&\quad \le 2^{N+1}\exp \left\{ \frac{-\varepsilon ^2/(2^{N+1})^2}{4\sum _{i=1}^{\widehat{q}}E W_i^{*2}+2C(\widehat{\mathbf{n}} h_K^dh_H)^{-1}p^N \varepsilon /2^{N+1}}\right\} \\&\qquad + C2^{2N+1}\psi (\widehat{\mathbf{n}},p^N)(\widehat{\mathbf{n}} h_K^dh_H)^{-1}p^N\widehat{q}\varepsilon ^{-1}\varphi (p). \end{aligned}$$

Let \(\lambda >0\) and set

$$\begin{aligned} \varepsilon =\varepsilon _\mathbf{n}&= \left( \frac{\log \widehat{\mathbf{n}}}{\widehat{\mathbf{n}} h_K^dh_H}\right) ^{1/2}, \quad p =p_\mathbf{n}= \left( \frac{\widehat{\mathbf{n}} h_K^dh_H}{\log \widehat{\mathbf{n}}}\right) ^{1/2N}. \end{aligned}$$

Since \(W_i^*\) and \(W_i\) have the same distribution, expanding each block sum \(W_i\) and bounding the within-block covariances by \(R_\mathbf{n}\), we have

$$\begin{aligned} \sum _{i=1}^{\widehat{q}}E W_i^{*2}=\sum _{i=1}^{\widehat{q}}E W_i^2 \le I_\mathbf{n}(x,y)+ R_\mathbf{n}(x,y). \end{aligned}$$

Then, by Lemma 4, we get \( \sum \nolimits _{i=1}^{\widehat{q}}E W_i^{*2}=O\left( \frac{1}{\widehat{\mathbf{n}}h_K^dh_H}\right) \). Thus, in case (i) of Theorem 3, a simple computation shows that for sufficiently large \(\mathbf{n}\),

$$\begin{aligned} P\left( |S_{\mathbf{n}}(x,y)|>\lambda \varepsilon _\mathbf{n}\right)&\le 2^{N+1}\exp \left\{ \frac{-\lambda ^2\log \widehat{\mathbf{n}}}{2^{2N+4}C+2^{N+2}C\lambda }\right\} \nonumber \\&+ C2^{N+1}p^N h_K^{-d}h_H^{-1}\lambda ^{-1}\varepsilon _\mathbf{n}^{-1}\varphi (p) \nonumber \\&\le C\widehat{\mathbf{n}}^{-b}+C2^{N+1}p^N h_K^{-d}h_H^{-1}\lambda ^{-1}\varepsilon _\mathbf{n}^{-1}\varphi (p), \end{aligned}$$
(15)

where \(b>0\) depends on \(\lambda \). For case (ii) of Theorem 3, we obtain

$$\begin{aligned} P\left( |S_{\mathbf{n}}(x,y)|>\lambda \varepsilon _\mathbf{n}\right)&\le 2^{N+1}\exp \left\{ \frac{-\lambda ^2\log \widehat{\mathbf{n}}}{2^{2N+4}C+2^{N+2}C\lambda }\right\} \nonumber \\&+C2^{N+1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}} h_K^{-d}h_H^{-1}\lambda ^{-1}\varepsilon _\mathbf{n}^{-1}\varphi (p)\nonumber \\&\le C\widehat{\mathbf{n}}^{-b}+C2^{N+1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}}h_K^{-d}h_H^{-1}\lambda ^{-1}\varepsilon _\mathbf{n}^{-1}\varphi (p). \end{aligned}$$
(16)

Then, (15) and (16) can be condensed into

$$\begin{aligned}&P\left( |f_{\mathbf{n}}(x,y)-E f_{\mathbf{n}}(x,y)|>\lambda \varepsilon _{\mathbf{n}}\right) \\&\quad \le \left\{ \begin{array}{l} C\widehat{\mathbf{n}}^{-b}+C\lambda ^{-1}h_K^{-d}h_H^{-1}\varepsilon _{\mathbf{n}}^{\frac{\mu -2N}{N}} \quad \hbox { under (i)},\\ C\widehat{\mathbf{n}}^{-b}+C\lambda ^{-1}h_K^{-d}h_H^{-1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}} \varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}\quad \hbox {under (ii)}. \end{array}\right. \end{aligned}$$
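The condensation of the mixing terms uses the identity \(p_\mathbf{n}^N=\varepsilon _\mathbf{n}^{-1}\) (immediate from the two definitions above) together with the polynomial bound \(\varphi (t)\le Ct^{-\mu }\); for instance, under (i),

$$\begin{aligned} p^N h_K^{-d}h_H^{-1}\varepsilon _\mathbf{n}^{-1}\varphi (p)\le Ch_K^{-d}h_H^{-1}\varepsilon _\mathbf{n}^{-2}p^{-\mu } =Ch_K^{-d}h_H^{-1}\varepsilon _\mathbf{n}^{-2}\varepsilon _\mathbf{n}^{\mu /N} =Ch_K^{-d}h_H^{-1}\varepsilon _\mathbf{n}^{\frac{\mu -2N}{N}}, \end{aligned}$$

since \(p=\varepsilon _\mathbf{n}^{-1/N}\); case (ii) is handled in the same way, with the extra factor \(\widehat{\mathbf{n}}^{\widetilde{\beta }}\).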

Now, set \(r_{\mathbf{n}}=h_K^{d}h_H^{2}\varepsilon _{\mathbf{n}}\). The compact \(\mathcal C \) can be covered by \(d_{\mathbf{n}}\) intervals \(I_k\) centered at \(y_k\) and of length \(r_{\mathbf{n}}\). We have \(d_{\mathbf{n}}\le C r_{\mathbf{n}}^{-1}\) and

$$\begin{aligned} \sup _{y\in C}|E f_{\mathbf{n}}(x,y)-f_{\mathbf{n}}(x,y)|&\le \max _k\sup _{y\in I_k }\left| f_{\mathbf{n}}(x,y)-f_{\mathbf{n}}(x,y_k)\right| \nonumber \\&+ \max _k\left| f_{\mathbf{n}}(x,y_k)-E f_{\mathbf{n}}(x,y_k)\right| \end{aligned}$$
(17)
$$\begin{aligned}&+ \max _k\sup _{y\in I_k}\left| E f_{\mathbf{n}}(x,y)-E f_{\mathbf{n}}(x,y_k)\right| .\qquad \quad \end{aligned}$$
(18)

Since the density \(H\) is Lipschitz and \(K\) is bounded, we have

$$\begin{aligned} \left| f_{\mathbf{n}}(x,y)-f_{\mathbf{n}}(x,y_k)\right| \le C h_K^{-d}h_H^{-2}|y-y_k| = O(\varepsilon _{\mathbf{n}}), \end{aligned}$$

and

$$\begin{aligned} \left| E f_{\mathbf{n}}(x,y)-E f_{\mathbf{n}}(x,y_k)\right|&= O(\varepsilon _{\mathbf{n}}). \end{aligned}$$

Let us focus on \(\displaystyle U_{2\mathbf{n}}(x)=\max _k\left| f_{\mathbf{n}}(x,y_k)-Ef_{\mathbf{n}}(x,y_k)\right| \) and remark that

$$\begin{aligned} P\left( U_{2\mathbf{n}}(x)>\lambda \varepsilon _{\mathbf{n}}\right) \le d_{\mathbf{n}}\max _k P\left( |f_{\mathbf{n}}(x,y_k)-E f_{\mathbf{n}}(x,y_k)|>\lambda \varepsilon _{\mathbf{n}}\right) \!. \end{aligned}$$

To prove the convergence of \(U_{2\mathbf{n}}(x)\), it suffices to show that for respectively \((i)\) and \((ii)\)

$$\begin{aligned} \left\{ \begin{array}{l} Cd_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\widehat{\mathbf{n}}^{-b}\rightarrow 0\ \ \hbox { and}\ \ d_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\lambda ^{-1}h_K^{-d}h_H^{-1}\varepsilon _{\mathbf{n}}^{\frac{\mu -2N}{N}} \rightarrow 0,\\ Cd_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\widehat{\mathbf{n}}^{-b}\rightarrow 0\ \ \hbox { and}\ \ d_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\lambda ^{-1}h_K^{-d}h_H^{-1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}} \varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}\rightarrow 0. \end{array}\right. \end{aligned}$$

First, observe that condition \(H_6\) or \(H_7\) implies that \(\widehat{\mathbf{n}} h_K^d \rightarrow \infty \) and \(\widehat{\mathbf{n}}h_H \rightarrow \infty \). These limits imply, respectively, that there exists \(C>0\) such that \(\widehat{\mathbf{n}}> Ch_K^{-d}\) (resp. \(\widehat{\mathbf{n}}> Ch_H^{-1}\)) for \(\mathbf{n}\) large enough. Then, \(d_{\mathbf{n}}\le C\widehat{\mathbf{n}}^{5/2}(\log \widehat{\mathbf{n}})^{-1/2}\). We have

$$\begin{aligned} d_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\widehat{\mathbf{n}}^{-b}\le C\widehat{\mathbf{n}}^{7/2-b}(\log \widehat{\mathbf{n}})^{-1/2}u_\mathbf{n}, \end{aligned}$$

which goes to 0 if \(b>7/2\). On the one hand,

$$\begin{aligned} d_{\mathbf{n}}\widehat{\mathbf{n}} u_{\mathbf{n}}\lambda ^{-1}h_K^{-d}h_H^{-1}\varepsilon _{\mathbf{n}}^{\frac{\mu -2N}{N}}&\le C\left[ \widehat{\mathbf{n}} h_K^{\frac{d(\mu +N)}{\mu -5N}}h_H^{\frac{\mu +3N}{\mu -5N}} (\log \widehat{\mathbf{n}})^{\frac{3N-\mu }{\mu -5N}}u_{\mathbf{n}}^{\frac{-2N}{\mu -5N}}\right] ^{\frac{5N-\mu }{2N}}, \end{aligned}$$

which goes to \(0\) by \(H_6\). On the other hand, we have

$$\begin{aligned}&d_{\mathbf{n}}\widehat{\mathbf{n}} u_{\mathbf{n}}\lambda ^{-1}h_K^{-d}h_H^{-1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}} \varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}\\&\quad \le C\left[ \widehat{\mathbf{n}} h_K^{\frac{d(\mu +2N)}{\mu -N(4+2{\widetilde{\beta }})}} h_H^{\frac{\mu +4N}{\mu -N(4+2{\widetilde{\beta }})}} (\log \widehat{\mathbf{n}})^{\frac{2N-\mu }{\mu -N(4+2{\widetilde{\beta }})}}u_{\mathbf{n}}^{\frac{-2N}{\mu -N(4+2{\widetilde{\beta }})}} \right] ^{\frac{N(4+2\widetilde{\beta })-\mu }{2N}}, \end{aligned}$$

which goes to \(0\) by \(H_7\). This completes the proof of Lemma 5. \(\square \)

Proof of Lemma 6

By using the same arguments as in the proof of Lemma 5, we get

$$\begin{aligned} \left| E\widehat{f}(x)-f_X(x)\right| = \left| \int \limits _\mathbb{R ^{d}}K(s)\left[ f_{X}(x-sh_K)-f_{X}(x)\right] ds \right| . \end{aligned}$$

This last term tends to zero by the Lebesgue dominated convergence theorem. Let \(\displaystyle V_\mathbf{i}(x) = \frac{1}{\widehat{\mathbf{n}} h_K^d}K\left( \frac{x-X_\mathbf{i}}{h_K}\right) ,\, \varDelta _\mathbf{i}(x)=V_\mathbf{i}(x)-E V_\mathbf{i}(x)\). Then we have \(\widehat{f}(x)-E\widehat{f}(x)=\sum \nolimits _{\mathbf{i}\in \mathcal I _\mathbf{n}}\varDelta _\mathbf{i}(x)=S_\mathbf{n}(x).\) Let \(\displaystyle I_\mathbf{n}(x) = \sum \nolimits _{\mathbf{i}\in \mathcal I _\mathbf{n}}E(\varDelta _\mathbf{i}(x))^2 \quad \hbox {and} \quad R_\mathbf{n}(x)=\sum \nolimits _{\underset{\mathbf{i},\mathbf{j}\in \mathcal I _\mathbf{n}}{\mathbf{i} \ne \mathbf{j}}}E\left| \varDelta _\mathbf{i}(x)\varDelta _\mathbf{j}(x)\right| .\) Lemma 2.2 of Tran (1990) gives \(\displaystyle I_\mathbf{n}(x)+ R_\mathbf{n}(x)=O\left( \frac{1}{\widehat{\mathbf{n}} h_K^d}\right) \).

Consider \(\displaystyle \varepsilon =\varepsilon _\mathbf{n} = \left( \frac{\log \widehat{\mathbf{n}}}{\widehat{\mathbf{n}} h_K^d}\right) ^{1/2}\) and \(p =p_\mathbf{n}= \left( \frac{\widehat{\mathbf{n}} h_K^d}{\log \widehat{\mathbf{n}}}\right) ^{1/2N}\), and use the same arguments as in the proof of Lemma 5 to get, for sufficiently large \(\mathbf{n}\),

$$\begin{aligned} P\left( |S_{\mathbf{n}}(x)|>\lambda \varepsilon _\mathbf{n}\right) \le \left\{ \begin{array}{c} C\widehat{\mathbf{n}}^{-b}+C2^{N+1}p^N h_K^{-d}\lambda ^{-1}\varepsilon _\mathbf{n}^{-1}\varphi (p) \quad \hbox { under (i)},\\ C\widehat{\mathbf{n}}^{-b}+C2^{N+1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}}h_K^{-d}\lambda ^{-1}\varepsilon _\mathbf{n}^{-1}\varphi (p)\quad \hbox {under (ii)} \end{array}\right. \end{aligned}$$

with \(b>0\). It suffices to show that, for case \((i)\) (resp. \((ii)\)), \( p^Nh_K^{-d}\varepsilon _\mathbf{n}^{-1}\varphi (p)\widehat{\mathbf{n}} u_{\mathbf{n}}\rightarrow 0\) (resp. \(\widehat{\mathbf{n}}^{{\widetilde{\beta }}} h_K^{-d}\varepsilon _\mathbf{n}^{-1}\varphi (p)\widehat{\mathbf{n}} u_{\mathbf{n}}\rightarrow 0\)). A simple computation shows, respectively for \((i)\) and \((ii)\),

$$\begin{aligned} \left\{ \begin{array}{l} p^Nh_K^{-d}\varepsilon _\mathbf{n}^{-1}\varphi (p)\widehat{\mathbf{n}} u_{\mathbf{n}} \le C\left[ \widehat{\mathbf{n}} h_K^{\frac{d\mu }{\mu -4N}}(\log \widehat{\mathbf{n}})^{\frac{2N-\mu }{\mu -4N}}u_{\mathbf{n}}^{\frac{-2N}{\mu -4N}}\right] ^{\frac{4N-\mu }{2N}},\\ \widehat{\mathbf{n}}^{\widetilde{\beta }} h_K^{-d}\varepsilon _\mathbf{n}^{-1}\varphi (p)\widehat{\mathbf{n}} u_{\mathbf{n}} \le C\left[ \widehat{\mathbf{n}} h_K^{\frac{d(N+\mu )}{\mu -N(3+2{\widetilde{\beta }})}}(\log \widehat{\mathbf{n}})^{\frac{N-\mu }{\mu -N(3+2{\widetilde{\beta }})}}u_{\mathbf{n}}^{\frac{-2N}{\mu -N(3+2{\widetilde{\beta }})}}\right] ^{\frac{N(3+2{\widetilde{\beta }})-\mu }{2N}}. \end{array}\right. \end{aligned}$$

These last terms go to \(0\) by \(H_6\) and \(H_7\), respectively. This completes the proof. \(\square \)

Proof of Lemma 3

We have

$$\begin{aligned} \sup _{y \in \mathcal C }|\widehat{f^x}(y)-f^x(y)|&\le \frac{1}{\widehat{f}(x)}\sup _{y \in \mathcal C }|f_{\mathbf{n}}(x,y)-f_{X,Y}(x,y)|\\&+\frac{1}{\widehat{f}(x)}\sup _{y \in \mathcal C }f^x(y)|\widehat{f}(x)-f_X(x)|. \end{aligned}$$
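This inequality follows from the identity \(f_{X,Y}(x,y)=f^x(y)f_X(x)\) and the decomposition

$$\begin{aligned} \widehat{f^x}(y)-f^x(y)=\frac{f_\mathbf{n}(x,y)-f_{X,Y}(x,y)}{\widehat{f}(x)} +\frac{f^x(y)\left( f_X(x)-\widehat{f}(x)\right) }{\widehat{f}(x)}, \end{aligned}$$

obtained by recalling that \(\widehat{f^x}(y)=f_\mathbf{n}(x,y)/\widehat{f}(x)\) and adding and subtracting \(f_{X,Y}(x,y)\) in the numerator.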

Lemmas 6 and 5 give, respectively, the almost complete convergence of \(\widehat{f}(x)\) to \(f_X(x)\) and of \(f_\mathbf{n}(x,y)\) to \(f_{X,Y}(x,y)\). Since, by \(H_0\) and \(H_1\) respectively, \(f_X(x)>0\) and \(f^x(y)\) is bounded on the compact \(\mathcal C \), the proof is finished. \(\square \)

To prove Theorem 2, we need the following three lemmas.

Lemma 7

Under the conditions of Theorem 2, we have, for any compact subset \(\mathcal C \) of \(\mathbb R \),

$$\begin{aligned} \sup _{y\in \mathcal C }\left| \widehat{f^x}^{(2)}(y)-f^{x^{(2)}}(y)\right| \stackrel{a.co}{\rightarrow }0. \end{aligned}$$

Lemma 8

Under conditions of Theorem 2, we have

$$\begin{aligned} \left\| \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right) \right\| _{2r} =O\left( h_K^{b_1}+h_H^{b_2}\right) . \end{aligned}$$

Lemma 9

If conditions of Theorem 2 are satisfied, then

$$\begin{aligned} \left\| \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( H^{(1)}_{\mathbf{i}}(\theta )-E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})\right) \right\| _{2r} =O\left( \left( \frac{1}{\widehat{\mathbf{n}}h_K^d h_H^4}\right) ^{1/2}\right) , \end{aligned}$$

where

$$\begin{aligned}&K_{\mathbf{i}}=K\left( \frac{x-X_{\mathbf{i}}}{h_K}\right) ,\; H_{\mathbf{i}}(y)=h_H^{-1}H\left( \frac{y-Y_{\mathbf{i}}}{h_H}\right) ,\\&W_{\mathbf{ni}}=W_{\mathbf{ni}}(x)=\frac{K_{\mathbf{i}}}{\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_{\mathbf{i}}}. \end{aligned}$$

To prove Lemma 7, we proceed as for Lemma 3: we introduce the following notation and state the two technical lemmas below. Let

$$\begin{aligned} \displaystyle \widetilde{V}_\mathbf{i}(x,y)&= \frac{1}{\widehat{\mathbf{n}}h_K^d h_H^3}K\left( \frac{x-X_\mathbf{i}}{h_K}\right) H^{(2)}\left( \frac{y-Y_\mathbf{i}}{h_H}\right) ,\\ \widetilde{\varDelta }_\mathbf{i}(x,y)&= \widetilde{V}_\mathbf{i}(x,y)-E\widetilde{V}_\mathbf{i}(x,y)\;\quad \hbox {and}\\ \displaystyle \widetilde{S}_\mathbf{n}(x,y)&= \displaystyle \sum _{\mathbf{i}\in \mathcal I _\mathbf{n}}\widetilde{\varDelta }_\mathbf{i}(x,y)=f_\mathbf{n}^{(2)}(x,y)-Ef_\mathbf{n}^{(2)}(x,y)\\ \displaystyle \widetilde{I}_\mathbf{n}(x,y)&= \sum _{\mathbf{i}\in \mathcal I _\mathbf{n}}E(\widetilde{\varDelta }_\mathbf{i}(x,y))^2,\quad \widetilde{R}_\mathbf{n}(x,y)=\sum _{\underset{\mathbf{i},\mathbf{j}\in \mathcal I _\mathbf{n}}{\mathbf{i} \ne \mathbf{j}}}E\left| \widetilde{\varDelta }_\mathbf{i}(x,y) \widetilde{\varDelta }_\mathbf{j}(x,y)\right| . \end{aligned}$$

Lemma 10

If the conditions of Lemma 7 are satisfied, then

$$\begin{aligned} \displaystyle \widetilde{I}_\mathbf{n}(x,y)+\widetilde{R}_\mathbf{n}(x,y)=O\left( \frac{1}{\widehat{\mathbf{n}}h_K^dh_H^5}\right) . \end{aligned}$$

Lemma 11

Under the conditions of Lemma 7, we have

$$\begin{aligned} \displaystyle \sup _{y \in \mathcal C }|f_{\mathbf{n}}^{(2)}(x,y)-f_{X,Y}^{(2)}(x,y)|\rightarrow 0\quad \hbox {a.co}. \end{aligned}$$

Proof of Lemma 10

The proof is similar to that of Lemma 4; analogous calculations give

$$\begin{aligned}&\widehat{\mathbf{n}}h_K^dh_H^5\sum _{{\mathbf{i}}\in \mathcal I _ {\mathbf{n}}}E \widetilde{V}_\mathbf{i}^2(x,y)\\&\quad = h_K^{-d}h_H^{-1}\int \limits _\mathbb{R ^{d+1}} K^2\left( \frac{x-z}{h_K}\right) \left( H^{(2)}\right) ^2 \left( \frac{y-v}{h_H}\right) f_{X,Y}(z,v)dzdv\\&\quad =\int \limits _\mathbb{R ^{d+1}}K^2\left( z\right) \left( H^{(2)}\right) ^2\left( v\right) f_{X,Y}(x-h_Kz,y-h_Hv)dzdv. \end{aligned}$$

By assumptions \(H_0\) and \(H_5\) and the Lebesgue dominated convergence theorem, this last integral converges to \(f_{X,Y}(x,y)\int _\mathbb{R ^{d+1}}K^2\left( z\right) \left( H^{(2)}\right) ^2\left( v\right) dzdv\). Next, notice that

$$\begin{aligned} \widehat{\mathbf{n}} h_K^d h_H^5\sum _{{\mathbf{i}}\in \mathcal I _{\mathbf{n}}}E^2\widetilde{V}_{\mathbf{i}}(x,y) \!=\!h_K^{-d}h_H^{-1}\left( \,\,\int \limits _\mathbb{R ^{d+1}}K \left( \frac{x\!-\!z}{h_K}\right) H^{(2)}\left( \frac{y\!-\!v}{h_H} \right) f_{X,Y}(z,v)dzdv\right) ^2. \end{aligned}$$

By the usual change of variables, we obtain

$$\begin{aligned} \widehat{\mathbf{n}} h_K^d h_H^5\sum _{\mathbf{i}\in \mathcal I _ {\mathbf{n}}}E^2 \widetilde{V}_{\mathbf{i}}(x,y) \!=\! h_K^dh_H\left( \,\,\int \limits _\mathbb{R ^{d+1}}K(z)H^{(2)} (v)f_{X,Y}(x\!-\!h_Kz,y\!-\!h_Hv)dzdv \right) ^2. \end{aligned}$$

This last term tends to \(0\) by \(H_0\) and the Lebesgue dominated convergence theorem.

To complete the proof, it suffices to treat \(\widetilde{R}_\mathbf{n}(x,y)\) by conditioning on \(X_\mathbf{i}\) and \((X_\mathbf{i},X_\mathbf{j})\), as at the end of the proof of Lemma 4. \(\square \)

Proof of Lemma 11

Remark that

$$\begin{aligned} \sup _{y\in \mathcal C }|f_{\mathbf{n}}^{(2)}(x,y)-f_{X,Y}^{(2)}(x,y)|&\le \sup _{y\in \mathcal C }|f_{\mathbf{n}}^{(2)}(x,y)-E f_{\mathbf{n}}^{(2)}(x,y)|\\&+\sup _{y\in \mathcal C }|E f_{\mathbf{n}}^{(2)}(x,y)-f_{X,Y}^{(2)}(x,y)|. \end{aligned}$$

Two successive integrations by parts and a classical change of variables show that the bias term satisfies

$$\begin{aligned}&\sup _{y\in \mathcal C }\left| E f_{\mathbf{n}}^{(2)}(x,y)-f_{X,Y}^{(2)}(x,y)\right| \\&\quad =\sup _{y\in \mathcal C }\left| \frac{1}{h_K^d h_H^3}\int \limits _\mathbb{R ^{d+1}}K\left( \frac{x-u}{h_K}\right) H^{(2)}\left( \frac{y-v}{h_H}\right) f_{X,Y}(u,v)dudv -f_{X,Y}^{(2)}(x,y)\right| \\&\quad \le \int \limits _\mathbb{R ^{d+1}}K(t)H(s)\sup _{y\in \mathcal C }\left| f_{X,Y}^{(2)}(x-h_Kt,y-h_Hs)- f_{X,Y}^{(2)}(x,y)\right| dtds. \end{aligned}$$

This last term goes to zero by \(H_1\) and the Lebesgue dominated convergence theorem. It remains to show the almost complete convergence of \(V_{1\mathbf{n}}(x)=\sup _{y\in \mathcal C }|f_{\mathbf{n}}^{(2)}(x,y)-E f_{\mathbf{n}}^{(2)}(x,y)|\), following the same lines as in the proof of the almost complete convergence of \(U_{1\mathbf{n}}(x).\) More precisely, taking here \(\displaystyle \varepsilon _\mathbf{n} = \left( \frac{\log \widehat{\mathbf{n}}}{\widehat{\mathbf{n}} h_K^dh_H^5}\right) ^{1/2}\) and \( r_\mathbf{n}=h_K^dh_H^4\varepsilon _\mathbf{n}\), we obtain, for \(\lambda >0\), the existence of \(b>0\) such that for \(\mathbf{n}\) large enough,

$$\begin{aligned} P\left( |f_{\mathbf{n}}^{(2)}(x,y)\!-\!E f_{\mathbf{n}}^{(2)}(x,y)|\!>\!\lambda \varepsilon _{\mathbf{n}}\right) \le \left\{ \begin{array}{l} C\widehat{\mathbf{n}}^{-b}\!+\!C\lambda ^{-1}h_K^{-d}h_H^{-3}\varepsilon _{\mathbf{n}}^{\frac{\mu -2N}{N}} \quad \hbox { under (i)},\\ C\widehat{\mathbf{n}}^{-b}\!+\!C\lambda ^{-1}h_K^{-d}h_H^{-3}\widehat{\mathbf{n}}^{\widetilde{\beta }} \varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}\quad \hbox {under (ii)}. \end{array}\right. \end{aligned}$$

Thus, it suffices to show that for respectively \((i)\) and \((ii)\)

$$\begin{aligned} \left\{ \begin{array}{l} Cd_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\widehat{\mathbf{n}}^{-b}\rightarrow 0\ \ \hbox { and}\ \ d_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\lambda ^{-1}h_K^{-d}h_H^{-3}\varepsilon _{\mathbf{n}}^{\frac{\mu -2N}{N}} \rightarrow 0,\\ Cd_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\widehat{\mathbf{n}}^{-b}\rightarrow 0\ \ \hbox { and}\ \ d_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\lambda ^{-1}h_K^{-d}h_H^{-3}\widehat{\mathbf{n}}^{\widetilde{\beta }} \varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}\rightarrow 0. \end{array}\right. \end{aligned}$$

where \(d_{\mathbf{n}}\le C r_{\mathbf{n}}^{-1}\). Then, for \(\mathbf{n}\) large enough, there exists \(C>0\) such that \(d_{\mathbf{n}}\le C\widehat{\mathbf{n}}^{5/2}(\log \widehat{\mathbf{n}})^{-1/2}\), and

$$\begin{aligned} d_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\widehat{\mathbf{n}}^{-b}\le C\widehat{\mathbf{n}}^{7/2-b}(\log \widehat{\mathbf{n}})^{-1/2}u_\mathbf{n}. \end{aligned}$$

This last term goes to 0 if \(b>7/2\). On the one hand, we have

$$\begin{aligned} d_{\mathbf{n}}\widehat{\mathbf{n}} u_{\mathbf{n}}\lambda ^{-1}h_K^{-d}h_H^{-3}\varepsilon _{\mathbf{n}}^{\frac{\mu -2N}{N}}&\le C\left[ \widehat{\mathbf{n}} h_K^{\frac{d(\mu +N)}{\mu -5N}}h_H^{\frac{5\mu -N}{\mu -5N}}(\log \widehat{\mathbf{n}})^{\frac{3N-\mu }{\mu -5N}}u_{\mathbf{n}}^{\frac{-2N}{\mu -5N}}\right] ^{\frac{5N-\mu }{2N}}. \end{aligned}$$

This last term goes to \(0\) by \(H_8\). On the other hand, we have

$$\begin{aligned}&d_{\mathbf{n}}\widehat{\mathbf{n}} u_{\mathbf{n}}\lambda ^{-1}h_K^{-d}h_H^{-3}\widehat{\mathbf{n}}^{\widetilde{\beta }} \varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}\\&\quad \le C\left[ \widehat{\mathbf{n}} h_K^{\frac{d(\mu +2N)}{\mu -N(4+2{\widetilde{\beta }})}} h_H^{\frac{5\mu +4N}{\mu -N(4+2{\widetilde{\beta }})}} (\log \widehat{\mathbf{n}})^{\frac{2N-\mu }{\mu -N(4+2{\widetilde{\beta }})}}u_{\mathbf{n}}^{\frac{-2N}{\mu -N(4+2{\widetilde{\beta }})}}\right] ^{\frac{N(4+2\widetilde{\beta })-\mu }{2N}}, \end{aligned}$$

which goes to \(0\) by \(H_9\). This finishes the proof. \(\square \)

Proof of Lemma 7

Analogously to the proof of Lemma 3, we have

$$\begin{aligned} \sup _{y \in \mathcal C }|\widehat{f}^{x^{(2)}}(y)-f^{x^{(2)}}(y)|&\le \frac{1}{\widehat{f}(x)}\sup _{y \in \mathcal C }|f_{\mathbf{n}}^{(2)}(x,y)-f_{X,Y}^{(2)}(x,y)|\\&+\frac{1}{\widehat{f}(x)}\sup _{y \in \mathcal C }f^{x^{(2)}}(y)|\widehat{f}(x)-f_X(x)|. \end{aligned}$$

Lemmas 6 and 11 give, respectively, the almost complete convergence of \(\widehat{f}(x)\) to \(f_X(x)\) and of \(f_{\mathbf{n}}^{(2)}(x,y)\) to \(f_{X,Y}^{(2)}(x,y)\). By \(H_0\) and \(H_1\) respectively, \(f_X(x)>0\) and \(f^{x^{(2)}}(y)\) is bounded on the compact \(\mathcal C \). This ends the proof. \(\square \)

Proof of Lemma 8

On the one hand, the definition of the \(W_{\mathbf{ni}}\) and assumption \(H_4\) allow us to write \(\displaystyle W_\mathbf{ni}=W_\mathbf{ni}1\!\!1_{\Vert X_\mathbf{i}-x\Vert \le h_K}\), so

$$\begin{aligned}&\left\| \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( E(H^{(1)}_{\mathbf{i}}(\theta ) |X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right) \right\| _{2r}\nonumber \\&\quad = \left\| \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}1\!\!1_{\Vert X_\mathbf{i}-x\Vert \le h_K}\left( E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right) \right\| _{2r}. \end{aligned}$$
(19)

On the other hand, for all \(\mathbf{i}\in \mathcal I _{\mathbf{n}} \), we have

$$\begin{aligned}&\left| E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right| \\&\quad = \left| \int \limits _\mathbb{R }h_H^{-2}H^{(1)}\left( \frac{\theta -z}{h_H}\right) f^{X_\mathbf{i}}(z)dz-f^{x^{(1)}}(\theta )\right| . \end{aligned}$$

Then, using an integration by parts followed by the usual change of variables, we get

$$\begin{aligned}&\left| E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right| \\&\quad = \left| h_H^{-1}\int \limits _\mathbb{R }H\left( \frac{\theta -z}{h_H}\right) f^{X_\mathbf{i}^{(1)}}(z)dz-f^{x^{(1)}}(\theta )\right| \\&\quad =\left| \int \limits _\mathbb{R }H(u)\left[ f^{X_\mathbf{i}^{(1)}}(\theta -h_H u)-f^{x^{(1)}}(\theta )\right] du\right| \\&\quad \le \int \limits _\mathbb{R }H(u)\left| f^{X_\mathbf{i}^{(1)}}(\theta -h_H u)-f^{x^{(1)}}(\theta )\right| du. \end{aligned}$$

Hence

$$\begin{aligned}&1\!\!1_{\Vert X_\mathbf{i}-x\Vert \le h_K}\left| E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right| \\&\quad \le 1\!\!1_{\Vert X_\mathbf{i}-x\Vert \le h_K}\int \limits _\mathbb{R }H(u)\left| f^{X_\mathbf{i}^{(1)}}(\theta -h_H u)-f^{x^{(1)}}(\theta )\right| du. \end{aligned}$$

Then assumption \(H_2\) gives

$$\begin{aligned} 1\!\!1_{\Vert X_\mathbf{i}-x\Vert \le h_K}\left| E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right| \le C\left( h_K^{b_1}+h_H^{b_2}\right) . \end{aligned}$$

Since \(\sum \nolimits _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}=1\) or \(0\), this last inequality in conjunction with (19) finishes the proof. \(\square \)

Before establishing Lemma 9, we introduce some notation and state the following two lemmas. Let \(\xi _\mathbf{i}=K_{\mathbf{i}}\digamma _{\mathbf{i}}\), with \(\digamma _{\mathbf{i}}=H^{(1)}\left( \frac{\theta -Y_\mathbf{i}}{h_H}\right) -E\left( H^{(1)}\left( \frac{\theta -Y_\mathbf{i}}{h_H}\right) |X_\mathbf{i}\right) \). We have

Lemma 12

Under conditions of Lemma 9, we have

$$\begin{aligned} \displaystyle E\left[ \left( \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}\xi _{\mathbf{i}}\right) ^{2r}\right] \le C\left( \widehat{\mathbf{n}}h_K^d\right) ^r. \end{aligned}$$

Lemma 13

If conditions of Lemma 9 are satisfied, then we have

$$\begin{aligned} \left( P\left[ \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le \widehat{\mathbf{n}}u/2\right] \right) ^{1/2r}=O\left( \left( \frac{1}{\widehat{\mathbf{n}}h_K^d}\right) ^{1/2}\right) . \end{aligned}$$

Proof of Lemma 12

The proof closely follows that of Lemma 2.2 of Gao et al. (2008); this is why we use the notation \(\xi _\mathbf{i}\) introduced there. Because of the boundedness of \(\digamma _{\mathbf{i}}\), the moment bounds obtained here take a simpler form than in Gao et al. (2008). To start, note that

$$\begin{aligned} E\left[ \left( \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}\xi _{\mathbf{i}}\right) ^{2r}\right] =\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}E\left[ \xi _{\mathbf{i}}^{2r}\right] +\sum _{s=1}^{2r-1}\sum _{\nu _0+\nu _1+ \ldots +\nu _s=2r}V_s(\nu _0,\nu _1,\ldots ,\nu _s),\qquad \quad \end{aligned}$$
(20)

where \(\sum _{\nu _0+\nu _1+\ldots +\nu _s=2r}\) is the summation over \((\nu _0,\nu _1,\ldots ,\nu _s)\) with positive integer components satisfying \(\nu _0+\nu _1+\cdots +\nu _s=2r\) and

$$\begin{aligned} V_s(\nu _0,\nu _1,\ldots ,\nu _s)=\sum _{\mathbf{i}_0\ne \mathbf{i}_1\ne \ldots \ne \mathbf{i}_s}E\left[ \xi _{\mathbf{i}_0}^{\nu _0} \xi _{\mathbf{i}_1}^{\nu _1}\ldots \xi _{\mathbf{i}_s}^{\nu _s}\right] , \end{aligned}$$

where the summation \(\sum _{\mathbf{i}_0\ne \mathbf{i}_1\ne \ldots \ne \mathbf{i}_s}\) is over indexes \((\mathbf{i}_0,\mathbf{i}_1,\ldots ,\mathbf{i}_s)\) with each index \(\mathbf{i}_j\) taking value in \(\mathcal I _{\mathbf{n}}\) and satisfying \(\mathbf{i}_j\ne \mathbf{i}_l\) for any \(j\ne l, 0\le j,l\le s\).

By stationarity and \(H_4\), we have

$$\begin{aligned} \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}E\left( \xi _{\mathbf{i}}\right) ^{2r} \le \widehat{\mathbf{n}}E\left( K_{\mathbf{i}}\left| \digamma _{\mathbf{i}}\right| \right) ^{2r} \le C\widehat{\mathbf{n}}E\left( K_{\mathbf{i}}\right) ^{2r}\le C\widehat{\mathbf{n}} h_K^d, \end{aligned}$$
(21)

where \(\displaystyle \digamma _{\mathbf{i}}=H^{(1)}\left( \frac{\theta -Y_\mathbf{i}}{h_H}\right) -E\left( H^{(1)}\left( \frac{\theta -Y_\mathbf{i}}{h_H}\right) |X_\mathbf{i}\right) \).

To control the term \(V_s(\nu _0,\nu _1,\ldots ,\nu _s)\), we need to prove, for any positive integers \( \nu _1,\nu _2,\ldots ,\nu _s \), the following results:

i) \(\displaystyle E\left| \xi _{\mathbf{i}_1}^{\nu _1}\xi _{\mathbf{i}_2}^{\nu _2}\ldots \xi _{\mathbf{i}_s}^{\nu _s}\right| \le h_K^{ds}\);

ii) \(\displaystyle V_s(\nu _0,\nu _1,\ldots ,\nu _s)=O\left( \left( \widehat{\mathbf{n}}h_K^d\right) ^{s+1}\right) \), for \(s=1,2,\ldots ,r-1\) and \(r>1\);

iii) \(\displaystyle V_s(\nu _0,\nu _1,\ldots ,\nu _s)=O\left( \left( \widehat{\mathbf{n}}h_K^d\right) ^r\right) \), for \(r\le s\le 2r-1\).

The proofs of i), ii) and iii) are omitted, since the techniques used are similar to those in Gao et al. (2008). Now, remark that ii) and iii) imply that

$$\begin{aligned} \sum _{s=1}^{2r-1}\sum _{\nu _0+\nu _1+\ldots +\nu _s=2r}V_s(\nu _0,\nu _1,\ldots ,\nu _s)\le C\left( \widehat{\mathbf{n}}h_K^d\right) ^r. \end{aligned}$$
(22)

Then, Lemma 12 follows from (20), (21) and (22). \(\square \)

Proof of Lemma 13

Using (23), we have

$$\begin{aligned} P\left[ \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le \widehat{\mathbf{n}}u/2\right]&\le P\left[ \left| \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}\left( K_\mathbf{i}-EK_\mathbf{i}\right) \right| \ge \widehat{\mathbf{n}}u/2\right] \\&\le P\left[ \left| \frac{1}{\widehat{\mathbf{n}}h_K^d}\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}\left( K_\mathbf{i}-EK_\mathbf{i}\right) \right| \ge C\right] . \end{aligned}$$

Then, for large \(\mathbf{n}\),

$$\begin{aligned} P\left[ \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le \widehat{\mathbf{n}}u/2\right] \le P\left[ \left| S_\mathbf{n}(x)\right| >\lambda \varepsilon _{\mathbf{n}}\right] , \end{aligned}$$

where \(S_\mathbf{n}(x)=\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}\varDelta _\mathbf{i}(x)=\widehat{f}(x)-E\widehat{f}(x)\), and where \(\lambda \), \(p=p_\mathbf{n}=\left( \dfrac{\widehat{\mathbf{n}}h_K^d}{\log \widehat{\mathbf{n}}}\right) ^{1/2N}\) and \(\varepsilon _{\mathbf{n}}=\sqrt{\frac{\log \widehat{\mathbf{n}}}{\widehat{\mathbf{n}} h_K^d}}\) are the same as in the proof of Lemma 6. For sufficiently large \(\mathbf{n}\) (see the proof of Lemma 6), there exists \(b>0\) such that, respectively for \((i)\) and \((ii)\),

$$\begin{aligned} P\left( |S_{\mathbf{n}}(x)|>\lambda \varepsilon _{\mathbf{n}}\right)&\le C\widehat{\mathbf{n}}^{-b}+C2^{N+1}p^N h_K^{-d}\lambda ^{-1}\varepsilon _{\mathbf{n}}^{-1}\varphi (p),\\ P\left( |S_{\mathbf{n}}(x)|>\lambda \varepsilon _{\mathbf{n}}\right)&\le C\widehat{\mathbf{n}}^{-b}+C2^{N+1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}}h_K^{-d}\lambda ^{-1}\varepsilon _{\mathbf{n}}^{-1}\varphi (p). \end{aligned}$$

Note that \(\displaystyle \widehat{\mathbf{n}}^{\,r-b}h_K^{dr}\) tends to zero for \(b>r\). For case (i), a simple computation gives

$$\begin{aligned} \left( \widehat{\mathbf{n}}h_K^d\right) ^rC2^{N+1}p^N h_K^{-d}\lambda ^{-1} \varepsilon _{\mathbf{n}}^{-1}\varphi (p) \le \left[ \widehat{\mathbf{n}}h_K^{\frac{d(\mu -2Nr)}{\mu -2N(r+1)}}(\log \widehat{\mathbf{n}})^{\frac{2N-\mu }{\mu -2N(r+1)}}\right] ^\frac{2N(r+1)-\mu }{2N}, \end{aligned}$$

which tends to zero by \(H_8\). Similarly, for the case of (ii), we obtain

$$\begin{aligned} \left( \widehat{\mathbf{n}}h_K^d\right) ^rC2^{N+1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}}h_K^{-d}\lambda ^{-1}\varepsilon _{\mathbf{n}}^{-1}\varphi (p)\le \left[ \widehat{\mathbf{n}}h_K^\frac{d(\mu -2Nr+N)}{\mu -N(2r+2\tilde{\beta }+1)}(\log \widehat{\mathbf{n}})^{\frac{N-\mu }{\mu -N(2r+2\tilde{\beta }+1)}}\right] ^\frac{N(2r+2\tilde{\beta }+1)-\mu }{2N}, \end{aligned}$$

which goes to zero under \(H_9\). Thus, in both mixing cases, we have

$$\begin{aligned} P\left[ \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le \widehat{\mathbf{n}}u/2\right] =O\left( \left( \frac{1}{\widehat{\mathbf{n}}h_K^d}\right) ^r\right) . \end{aligned}$$

This completes the proof. \(\square \)

Proof of Lemma 9

Let us set

$$\begin{aligned} G=\displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( H^{(1)}_{\mathbf{i}}(\theta ) -E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})\right) . \end{aligned}$$

We can write

$$\begin{aligned} G=\frac{ g_\mathbf{n}(x)}{\widehat{f}(x)}, \end{aligned}$$

with

$$\begin{aligned} \displaystyle g_\mathbf{n}(x)=\frac{1}{\widehat{\mathbf{n}}h_K^d}\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_{\mathbf{i}} \left( H_{\mathbf{i}}^{(1)}(\theta )-E(H_{\mathbf{i}}^{(1)}(\theta )|X_{\mathbf{i}})\right) \end{aligned}$$

and

$$\begin{aligned} \displaystyle \widehat{f}(x)=\frac{1}{\widehat{\mathbf{n}}h_K^d}\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_{\mathbf{i}}. \end{aligned}$$

Since, by \(H_5\), \(H^{(1)}\) is bounded, we have

$$\begin{aligned} \forall \mathbf{i} :0\le \left| H_{\mathbf{i}}^{(1)}(\theta )-E(H_{\mathbf{i}}^{(1)}(\theta )|X_{\mathbf{i}})\right| \le C h_H^{-2}. \end{aligned}$$

Thus, \(|G|\le C\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}} h_H^{-2}=C h_H^{-2}\). Then, we have

$$\begin{aligned} |G|&= |G|1\!\!1_{\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}>c}+|G|1\!\!1_{\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le c}\\&\le \frac{|g_\mathbf{n}(x)|}{\widehat{f}(x)}1\!\!1_{\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}>c}+C h_H^{-2}1\!\!1_{\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le c} \end{aligned}$$

where \(c\) is a given real number.

Let us take \(c=\widehat{\mathbf{n}}u/2\), with \(\displaystyle u=EK_{\mathbf{i}}=\int _\mathbb{R ^d}K\left( \frac{x-t}{h_K}\right) f_X(t)dt\). We get by \(H_4\):

$$\begin{aligned} \int C_1\mathbb I _{[0,1]}\left( \left\| \frac{x-t}{h_K}\right\| \right) f_X(t)dt\le u\le \int C_2\mathbb I _{[0,1]}\left( \left\| \frac{x-t}{h_K}\right\| \right) f_X(t)dt.\qquad \qquad \end{aligned}$$

By the usual change of variables \(\displaystyle s=\frac{x-t}{h_K}\), we obtain:

$$\begin{aligned} h_K^d\int C_1\mathbb I _{[0,1]}\left( \Vert s\Vert \right) f_X\left( x-sh_K\right) ds \le u \le h_K^d\int C_2\mathbb I _{[0,1]}\left( \Vert s\Vert \right) f_X\left( x-sh_K\right) ds. \end{aligned}$$

Since, by \(H_0\), \(f_X\) is bounded, there exist two constants \(\delta \) and \(\delta '\) such that

$$\begin{aligned} \delta h_K^d\le u\le \delta 'h_K^d. \end{aligned}$$
(23)

Therefore, if \(\sum \nolimits _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}>\widehat{\mathbf{n}}u/2\), then \(\displaystyle \widehat{f}(x)> u/(2h_K^d)\ge C\) by (23), and

$$\begin{aligned} \displaystyle \frac{|g_\mathbf{n}(x)|}{\widehat{f}(x)}< C|g_\mathbf{n}(x)|. \end{aligned}$$

Thus, we have

$$\begin{aligned} |G| \le C |g_\mathbf{n}(x)|+C h_H^{-2}1\!\!1_{\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le \widehat{\mathbf{n}}u/2}, \end{aligned}$$

so

$$\begin{aligned} \left\| G\right\| _{2r}\le C \left\| g_\mathbf{n}(x)\right\| _{2r}+C h_H^{-2}\left( P\left[ \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le \widehat{\mathbf{n}}u/2\right] \right) ^{1/2r}. \end{aligned}$$
(24)

Let us focus on the first term on the right-hand side of the above inequality. We can write

$$\begin{aligned} \left\| g_\mathbf{n}(x)\right\| _{2r}=\frac{1}{\widehat{\mathbf{n}}h_K^d h_H^2}\left( E\left[ \left( \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}\xi _{\mathbf{i}}\right) ^{2r}\right] \right) ^{1/2r}, \end{aligned}$$

where \(\xi _\mathbf{i}=K_{\mathbf{i}}\digamma _{\mathbf{i}}\), with \(\digamma _{\mathbf{i}}=H^{(1)}\left( \frac{\theta -Y_\mathbf{i}}{h_H}\right) -E\left( H^{(1)}\left( \frac{\theta -Y_\mathbf{i}}{h_H}\right) |X_\mathbf{i}\right) \), as above. Therefore, the proof of Lemma 9 follows from inequality (24) and Lemmas 12 and 13, as the computation below makes explicit. \(\square \)
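Explicitly, Lemma 12 applied to the last display gives

$$\begin{aligned} \left\| g_\mathbf{n}(x)\right\| _{2r}\le \frac{C\left( \widehat{\mathbf{n}}h_K^d\right) ^{1/2}}{\widehat{\mathbf{n}}h_K^d h_H^2} =C\left( \frac{1}{\widehat{\mathbf{n}}h_K^d h_H^4}\right) ^{1/2}, \end{aligned}$$

while Lemma 13 shows that the second term of (24), \(Ch_H^{-2}\left( P\left[ \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le \widehat{\mathbf{n}}u/2\right] \right) ^{1/2r}\), is of the same order \(O\left( (\widehat{\mathbf{n}}h_K^d h_H^4)^{-1/2}\right) \); together these give exactly the rate announced in Lemma 9.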

1.2 Proofs of main results

Proof of Theorem 1

The assumption \(H_1\) and condition (1) ensure that the conditional density \(f^x\) is continuous and strictly increasing on \((\theta -\zeta ,\theta )\). Thus, the inverse function \({f^x}^{-1}\) is also continuous and strictly increasing. In particular, the continuity of \({f^x}^{-1}\) at \( f^x(\theta )\) gives, for any \(\epsilon >0\):

$$\begin{aligned} \exists \eta _1(\epsilon )>0, \forall y\in (\theta -\zeta ,\theta ),\;\left| f^x(y)-f^x(\theta )\right| \le \eta _1(\epsilon )\Rightarrow \left| y-\theta \right| \le \epsilon . \end{aligned}$$

Similarly, we also have

$$\begin{aligned} \exists \eta _2(\epsilon )>0, \forall y\in (\theta ,\theta +\zeta ),\;\left| f^x(y)-f^x(\theta )\right| \le \eta _2(\epsilon )\Rightarrow \left| y-\theta \right| \le \epsilon . \end{aligned}$$

Since, by construction, \(\widehat{\theta }\in (\theta -\zeta ,\theta +\zeta )\), we get

$$\begin{aligned} \exists \eta (\epsilon )>0\;,\;\left| f^x(\widehat{\theta })-f^x(\theta )\right| \le \eta (\epsilon )\Rightarrow \left| \widehat{\theta }-\theta \right| \le \epsilon . \end{aligned}$$

In this way, we finally get

$$\begin{aligned} \exists \eta (\epsilon )>0\;,\;P\left( \left| \widehat{\theta }-\theta \right| >\epsilon \right) \le P\left( \left| f^x(\widehat{\theta })-f^x(\theta )\right| > \eta (\epsilon )\right) . \end{aligned}$$

It follows directly from the definitions of \(\theta \) and \(\widehat{\theta }\) that

$$\begin{aligned} \left| f^x(\widehat{\theta })-f^x(\theta )\right|&= f^x(\theta )-f^x(\widehat{\theta })\\&= \left( f^x(\theta )-\widehat{f^x}(\theta )\right) +\left( \widehat{f^x}(\theta )-f^x(\widehat{\theta })\right) \\&\le \left( f^x(\theta )-\widehat{f^x}(\theta )\right) +\left( \widehat{f^x} (\widehat{\theta })-f^x(\widehat{\theta })\right) \\&\le 2\sup _{y\in (\theta -\zeta ,\theta +\zeta )}\left| \widehat{f^x}(y)-f^x(y)\right| . \end{aligned}$$

Thus, the uniform almost complete convergence of the kernel conditional density estimate over \(\varGamma =(\theta -\zeta ,\theta +\zeta )\) suffices to end the proof; this is established in Lemma 3 above. \(\square \)

Proof of Theorem 2

By a Taylor expansion of \(\widehat{f^x}^{(1)}(\cdot )\) in a neighborhood of \(\theta \), we have

$$\begin{aligned} \widehat{f^x}^{(1)}(\widehat{\theta })-\widehat{f^x}^{(1)} (\theta )=\left( \widehat{\theta }-\theta \right) \widehat{f^x}^{(2)} (\theta ^*) \end{aligned}$$

where \(\theta ^*\) lies in the interval with endpoints \(\widehat{\theta }\) and \(\theta \). Thus

$$\begin{aligned} \theta -\widehat{\theta }&= \frac{\widehat{f^x}^{(1)}(\theta )-f^{x^{(1)}}(\theta )}{\widehat{f^x}^{(2)}(\theta ^*)}\\&= \frac{\widehat{f^x}^{(1)}(\theta )-f^{x^{(1)}}(\theta )}{f^{x^{(2)}}(\theta )}\\&\quad +\frac{\widehat{f^x}^{(1)}(\theta )-f^{x^{(1)}}(\theta )}{f^{x^{(2)}}(\theta )} \left[ \frac{f^{x^{(2)}}(\theta )-\widehat{f^x}^{(2)}(\theta ^*)}{\widehat{f^x}^{(2)}(\theta ^*)}\right] \\&= A+AB, \end{aligned}$$

where the first equality uses the first-order conditions \(\widehat{f^x}^{(1)}(\widehat{\theta })=f^{x^{(1)}}(\theta )=0\).

Theorem 1, Lemma 7 and \(H_1\) imply that \(\widehat{f}^{x^{(2)}}(\theta ^*)\rightarrow f^{x^{(2)}}(\theta )\) a.co. (see, for example, Ferraty et al. 2005). By Minkowski’s inequality, we have

$$\begin{aligned} \left\| \theta -\widehat{\theta }\right\| _{2r}\le \left\| A\right\| _{2r}+\left\| AB\right\| _{2r}. \end{aligned}$$

Then, to study the \(2r\)-mean consistency of \(\widehat{\theta }\), it suffices to focus on the term \(\displaystyle \left\| A\right\| _{2r}=\frac{1}{f^{x^{(2)}}(\theta )} \left\| \widehat{f^x}^{(1)}(\theta )-f^{x^{(1)}}(\theta )\right\| _{2r}\). We recall the notation

$$\begin{aligned}&K_{\mathbf{i}}=K\left( \frac{x-X_{\mathbf{i}}}{h_K}\right) ,\; H_{\mathbf{i}}(y)=h_H^{-1}H\left( \frac{y-Y_{\mathbf{i}}}{h_H}\right) ,\\&W_{\mathbf{ni}}=W_{\mathbf{ni}}(x)=\frac{K_{\mathbf{i}}}{\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_{\mathbf{i}}}. \end{aligned}$$

and notice that, if we adopt the convention \(0/0=0\), then

$$\begin{aligned} \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}= 0 \quad \hbox {or}\,1. \end{aligned}$$

So, we can write:

$$\begin{aligned} \widehat{f^x}^{(1)}(y)=\left( \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}H^{(1)}_{\mathbf{i}}(y)\right) 1\!\!1_{\left[ \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}=1\right] }, \end{aligned}$$

where \(1\!\!1_{[.]}\) is the indicator function. Thus, we have

$$\begin{aligned} \widehat{f^x}^{(1)}(\theta )-f^{x^{(1)}}(\theta )&=\left( \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( H^{(1)}_{\mathbf{i}}(\theta )-E(H^{(1)}_{\mathbf{i}}(\theta ) |X_{\mathbf{i}})\right) \right) 1\!\!1_{\left[ \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}=1\right] }\\&\quad +\left( \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( E(H^{(1)}_{\mathbf{i}}(\theta ) |X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right) \right) 1\!\!1_{\left[ \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}=1\right] }. \end{aligned}$$

Then,

$$\begin{aligned} \left| \widehat{f^x}^{(1)}(\theta )-f^{x^{(1)}}(\theta )\right|&\le \left| \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( H^{(1)}_{\mathbf{i}}(\theta )-E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})\right) \right| \\&\quad +\left| \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right) \right| . \end{aligned}$$

Applying Minkowski’s inequality, we get

$$\begin{aligned} E^{1/2r}\left( \widehat{f}^{x^{(1)}}(\theta )-f^{x^{(1)}} (\theta )\right) ^{2r}&\le E^{1/2r} \left( \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( H^{(1)}_{\mathbf{i}}(\theta )-E(H^{(1)}_{\mathbf{i}}(\theta )| X_{\mathbf{i}})\right) \right) ^{2r}\\&\quad +E^{1/2r}\left( \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right) \right) ^{2r}. \end{aligned}$$

The two terms on the right-hand side of this last inequality are treated in Lemmas 8 and 9 above, respectively. This completes the proof of Theorem 2. \(\square \)

Cite this article

Dabo-Niang, S., Ould-Abdi, S.A., Ould-Abdi, A. et al. Consistency of a nonparametric conditional mode estimator for random fields. Stat Methods Appl 23, 1–39 (2014). https://doi.org/10.1007/s10260-013-0239-2
