Appendix
In this section, we establish the main results and give the necessary technical lemmas with their proofs.
1.1 Proofs of technical lemmas
The following two technical lemmas are given in Tran (1990) and Carbon et al. (1997), respectively; their proofs are therefore omitted.
Lemma 1
(i) Suppose that (4) holds. Denote by \(\mathcal L _r(\mathcal F )\) the class of \(\mathcal F \)-measurable r.v.’s \(X\) satisfying \(\Vert X\Vert _r=(E|X|^r)^{1/r}<\infty \). Suppose \(X \in \mathcal L _r(\mathcal B (E))\) and \(Y \in \mathcal L _s(\mathcal B (E'))\). Assume also that \(1\le r,\ s,\ t<\infty \) and \(r^{-1}+s^{-1}+t^{-1}=1\). Then
$$\begin{aligned} |EXY-EXEY|\le C\Vert X\Vert _r \Vert Y\Vert _s \{\psi ({\textit{Card}}(E),{\textit{Card}}(E'))\varphi ({\textit{dist}}(E,E'))\}^{1/t}. \end{aligned}$$
(11)
(ii) For r.v.’s bounded with probability 1, the right-hand side of (11) can be replaced by \(C\psi (\textit{Card}(E),\textit{Card}(E'))\varphi (\textit{dist}(E,E'))\).
Lemma 2
Let \(S_1,S_2,\ldots ,S_k\) be sets containing \(m\) sites each, with \(\textit{dist}(S_i,S_j)\ge \delta \) for all \(i\ne j\), where \(1\le i,j\le k\). Let \(W_1,W_2,\ldots ,W_k\) be a sequence of real-valued r.v.’s taking values in \([a,b]\) and defined on some probability space \((\Omega ,\,\,\mathcal A ,\mathbf{P})\). Suppose that \(W_1,W_2,\ldots ,W_k\) are measurable with respect to \(\mathcal B (S_1),\mathcal B (S_2),\ldots ,\mathcal B (S_k)\), respectively. Then there exists a sequence of independent r.v.’s \(W_1^*,W_2^*,\ldots ,W_k^*\), independent of \(W_1,W_2,\ldots ,W_k\), such that \(W_i^*\) has the same distribution as \(W_i\) and satisfies
$$\begin{aligned} \sum _{i=1}^k E\left| W_i-W_i^*\right| \le 2k(b-a)\psi \left( (k-1)m,m\right) \varphi (\delta ). \end{aligned}$$
The proof of Theorem 1 is based on the following lemma.
Lemma 3
Under the conditions of Theorem 1, we have, for any compact subset \(\mathcal C \) of \(\mathbb R \),
$$\begin{aligned} \sup _{y\in \mathcal C }\left| \widehat{f^x}(y)-f^x(y)\right| \stackrel{a.co}{\rightarrow }0. \end{aligned}$$
In order to establish Lemma 3, we introduce some notation and state three technical lemmas. For \(y\in \mathbb R \), let
$$\begin{aligned}&V_\mathbf{i}(x,y) = \frac{1}{\widehat{\mathbf{n}}h_K^d h_H}K\left( \frac{x-X_\mathbf{i}}{h_K}\right) H\left( \frac{y-Y_\mathbf{i}}{h_H}\right) ,\\&\varDelta _\mathbf{i}(x,y)=V_\mathbf{i}(x,y)-EV_\mathbf{i}(x,y) \end{aligned}$$
and
$$\begin{aligned} S_\mathbf{n}(x,y)&= \sum _{\mathbf{i}\in \mathcal I _\mathbf{n}}\varDelta _\mathbf{i}(x,y)=f_\mathbf{n}(x,y)-Ef_\mathbf{n}(x,y),\\ I_\mathbf{n}(x,y)&= \sum _{\mathbf{i}\in \mathcal I _\mathbf{n}}E(\varDelta _\mathbf{i}(x,y))^2, R_\mathbf{n}(x,y)= \sum _{\underset{\mathbf{i},\mathbf{j}\in \mathcal I _\mathbf{n}}{\mathbf{i} \ne \mathbf{j}}}E\left| \varDelta _\mathbf{i}(x,y) \varDelta _\mathbf{j}(x,y)\right| . \end{aligned}$$
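For orientation, here is a minimal Python sketch of \(f_\mathbf{n}(x,y)\) and of the ratio estimator \(\widehat{f^x}(y)=f_\mathbf{n}(x,y)/\widehat{f}(x)\) built from it. The Gaussian choices of \(K\) and \(H\) and the flattening of the grid \(\mathcal I _\mathbf{n}\) into a single array are assumptions of the sketch only, not of the results below.

import numpy as np

def joint_density_estimate(x, y, X, Y, h_K, h_H):
    # f_n(x, y): kernel estimate of the joint density f_{X,Y}(x, y).
    # x: (d,) evaluation point, y: scalar; X: (n_hat, d) covariates at the
    # n_hat (flattened) sites of I_n, Y: (n_hat,) responses.
    n_hat, d = X.shape
    K = np.exp(-0.5 * np.sum(((x - X) / h_K) ** 2, axis=1)) / (2 * np.pi) ** (d / 2)
    H = np.exp(-0.5 * ((y - Y) / h_H) ** 2) / np.sqrt(2 * np.pi)
    return K.dot(H) / (n_hat * h_K ** d * h_H)

def conditional_density_estimate(x, y, X, Y, h_K, h_H):
    # \widehat{f^x}(y) = f_n(x, y) / \widehat{f}(x).
    n_hat, d = X.shape
    K = np.exp(-0.5 * np.sum(((x - X) / h_K) ** 2, axis=1)) / (2 * np.pi) ** (d / 2)
    f_hat = K.sum() / (n_hat * h_K ** d)
    return joint_density_estimate(x, y, X, Y, h_K, h_H) / f_hat

In this notation, \(V_\mathbf{i}(x,y)\) is one summand of joint_density_estimate, and \(S_\mathbf{n}(x,y)\) is the sum of the centred summands \(\varDelta _\mathbf{i}(x,y)\).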
Lemma 4
Under the conditions of Lemma 3, we have
$$\begin{aligned} \displaystyle I_\mathbf{n}(x,y)+ R_\mathbf{n}(x,y)=O\left( \frac{1}{\widehat{\mathbf{n}}h_K^dh_H}\right) . \end{aligned}$$
Lemma 5
Under the conditions of Lemma 3, we have
$$\begin{aligned} \sup _{y \in \mathcal C }|f_{\mathbf{n}}(x,y)-f_{X,Y}(x,y)|\stackrel{a.co}{\rightarrow } 0. \end{aligned}$$
Lemma 6
If the conditions of Lemma 3 are satisfied, then
$$\begin{aligned} \displaystyle \widehat{f}(x)\stackrel{a.co}{\rightarrow } f_X(x). \end{aligned}$$
Proof of Lemma 4
We have
$$\begin{aligned} \widehat{\mathbf{n}} h_K^dh_H I_{\mathbf{n}}(x,y) = \widehat{\mathbf{n}} h_K^dh_H \sum _{{\mathbf{i}}\in \mathcal I _ {\mathbf{n}}}\left( E V_\mathbf{i}^2(x,y)-E^2V_\mathbf{i}(x,y)\right) . \end{aligned}$$
First, remark that
$$\begin{aligned}&\widehat{\mathbf{n}}h_K^dh_H\sum _{{\mathbf{i}}\in \mathcal I _ {\mathbf{n}}}E V_\mathbf{i}^2(x,y)\\&\quad =h_K^{-d}h_H^{-1}\int \limits _\mathbb{R ^{d+1}}K^2 \left( \frac{x-z}{h_K}\right) H^2\left( \frac{y-v}{h_H}\right) f_{X,Y}(z,v)dzdv\\&\quad =\int \limits _\mathbb{R ^{d+1}}K^2\left( z\right) H^2\left( v\right) f_{X,Y}(x-h_Kz,y-h_Hv)dzdv. \end{aligned}$$
By assumption \(H_0\) and the Lebesgue dominated convergence theorem, this last integral converges to \(f_{X,Y}(x,y)\int _\mathbb{R ^{d+1}}K^2 \left( z\right) H^2\left( v\right) dzdv\). Next, notice that
$$\begin{aligned}&\widehat{\mathbf{n}} h_K^d h_H\sum _{{\mathbf{i}}\in \mathcal I _{\mathbf{n}}}E^2V_{\mathbf{i}}(x,y)\\&\quad =h_K^{-d}h_H^{-1}\left( \int \limits _\mathbb{R ^{d+1}} K\left( \frac{x-z}{h_K}\right) H\left( \frac{y-v}{h_H}\right) f_{X,Y}(z,v)dzdv\right) ^2. \end{aligned}$$
By the usual change of variables, we obtain
$$\begin{aligned}&\widehat{\mathbf{n}} h_K^d h_H\sum _{\mathbf{i}\in \mathcal I _ {\mathbf{n}}}E^2 V_{\mathbf{i}}(x,y)\\&\quad = h_K^dh_H \left( \int \limits _\mathbb{R ^{d+1}}K(z)H(v)f_{X,Y}(x-h_Kz,y-h_Hv)dzdv \right) ^2. \end{aligned}$$
This last term tends to 0 by \(H_0\) and the Lebesgue dominated convergence theorem. Let us now prove that, for \(\mathbf{n}\) large enough, there exists \(C\) such that \(\widehat{\mathbf{n}} h_K^dh_H R_{\mathbf{n}}(x,y)< C\). Let \(\displaystyle S=\{(\mathbf{i},\mathbf{j}):\ {\textit{dist}}(\mathbf{i},\mathbf{j})\le s_{\mathbf{n}}\}\), where \(s_{\mathbf{n}}\) is a real sequence that tends to infinity and will be specified later. We have \( R_{\mathbf{n}}(x,y) = R_{\mathbf{n}}^1(x,y)+ R_{\mathbf{n}}^2(x,y),\) with
$$\begin{aligned} R_{\mathbf{n}}^1(x,y)=\sum _{{\mathbf{i}},{\mathbf{j}}\in S}\left| E\varDelta _{\mathbf{i}}(x,y) \varDelta _{\mathbf{j}}(x,y)\right| \end{aligned}$$
and
$$\begin{aligned} R_{\mathbf{n}}^2(x,y)=\sum _{{\mathbf{i}},{\mathbf{j}}\in S^c}\left| E\varDelta _{\mathbf{i}}(x,y) \varDelta _{\mathbf{j}}(x,y)\right| , \end{aligned}$$
where \(S^c\) stands for the complement of \(S\). Now, by a change of variables, \(H_3\) and the Lebesgue dominated convergence theorem, we get
$$\begin{aligned}&E\left[ \left| H\left( \frac{y-Y_\mathbf{i}}{h_H}\right) \right| \left| H\left( \frac{y-Y_\mathbf{j}}{h_H}\right) \right| |(X_\mathbf{i},X_\mathbf{j})\right] \\&\quad =\int \limits _\mathbb{R ^2} H\left( \frac{y-t}{h_H}\right) H\left( \frac{y-s}{h_H}\right) f^{(X_\mathbf{i},X_\mathbf{j})}(t,s)dtds\\&\quad = h_H^2\int \limits _\mathbb{R ^2}H(t)H(s)f^{(X_\mathbf{i},X_\mathbf{j})}(y-h_Ht,y-h_Hs)dtds\\&\quad = O\left( h_H^2\right) . \end{aligned}$$
Similarly, we have
$$\begin{aligned} E\left[ \left| H\left( \frac{y-Y_\mathbf{i}}{h_H}\right) \right| |X_\mathbf{i}\right] =h_H\int \limits _\mathbb{R }H(t)f^{X_\mathbf{i}}(y-h_Ht)dt=O\left( h_H\right) . \end{aligned}$$
In addition, by (8), we get
$$\begin{aligned} \displaystyle EK_\mathbf{i}K_\mathbf{j}=O\left( h_K^{2d}\right) \quad \hbox {and}\quad \displaystyle EK_\mathbf{i}=O\left( h_K^d\right) . \end{aligned}$$
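For completeness, the second of these bounds can be checked directly: by the usual change of variables and the boundedness of \(f_X\) (assumption \(H_0\)),

$$\begin{aligned} EK_\mathbf{i}=\int \limits _\mathbb{R ^{d}}K\left( \frac{x-z}{h_K}\right) f_X(z)dz=h_K^d\int \limits _\mathbb{R ^{d}}K(t)f_X(x-h_Kt)dt\le Ch_K^d. \end{aligned}$$

The first bound follows in the same way, using the boundedness of the joint density of the pair \((X_\mathbf{i},X_\mathbf{j})\) underlying (8).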
Let us consider \(R_{\mathbf{n}}^1(x,y)\). We have
$$\begin{aligned} \left| E\varDelta _{\mathbf{i}}(x,y) \varDelta _{\mathbf{j}}(x,y)\right|&= \left| EV_\mathbf{i}(x,y)V_\mathbf{j}(x,y)-EV_\mathbf{i}(x,y)EV_\mathbf{j}(x,y)\right| \\&\le E\left[ E \left| V_\mathbf{i}(x,y)V_\mathbf{j}(x,y)\right| |(X_\mathbf{i},X_\mathbf{j})\right] +\left( E\left[ E |V_\mathbf{i}(x,y)||X_\mathbf{i}\right] \right) ^2\\&\le \widehat{\mathbf{n}}^{\,-2}h_K^{-2d}h_H^{-2} EK_\mathbf{i}K_\mathbf{j} E\left[ \left| H\left( \frac{y-Y_\mathbf{i}}{h_H}\right) \right| \left| H\left( \frac{y-Y_\mathbf{j}}{h_H}\right) \right| \right. \\&\left. |(X_\mathbf{i},X_\mathbf{j})\right] +\widehat{\mathbf{n}}^{-2}h_K^{-2d}h_H^{-2} \left( EK_\mathbf{i}E\left[ \left| H\left( \frac{y-Y_\mathbf{i}}{h_H}\right) \right| |X_\mathbf{i}\right] \right) ^2\\&\le C\widehat{\mathbf{n}}^{-2}. \end{aligned}$$
Then
$$\begin{aligned} \widehat{\mathbf{n}} h_K^dh_H R_{\mathbf{n}}^1 (x,y) \le \widehat{\mathbf{n}}^{-1}h_K^dh_H\sum _{{\mathbf{i}},{\mathbf{j}}\in S}1\le Ch_K^dh_H s_{\mathbf{n}}^N. \end{aligned}$$
Let us now bound \(R_{\mathbf{n}}^2 (x,y)\). Since \(K\) and \(H\) are bounded, applying Lemma 1 (ii) we have
$$\begin{aligned} \left| E\varDelta _{\mathbf{i}}(x,y) \varDelta _{\mathbf{j}}(x,y)\right| \le C\widehat{\mathbf{n}}^{-2}h_K^{-2d}h_H^{-2}\psi (1,1)\varphi (\Vert \mathbf{i}-\mathbf{j}\Vert ). \end{aligned}$$
Then, we obtain that
$$\begin{aligned} \widehat{\mathbf{n}}h_K^dh_H R_{\mathbf{n}}^2(x,y)&\le C\widehat{\mathbf{n}}^{-1}h_K^{-d}h_H^{-1} \sum _{\mathbf{i},\mathbf{j}\in S^c}\psi (1,1)\varphi (\Vert \mathbf{i}-\mathbf{j}\Vert )\\&\le Ch_K^{-d}h_H^{-1}s_{\mathbf{n}}^{-N}\sum _{\Vert \mathbf{i}\Vert >s_{\mathbf{n}}}\Vert \mathbf{i}\Vert ^N\varphi (\Vert \mathbf{i}\Vert )\\&\le Ch_K^{-d}h_H^{-1}s_{\mathbf{n}}^{-N}\sum _{\Vert \mathbf{i}\Vert >s_{\mathbf{n}}}\Vert \mathbf{i}\Vert ^{N-\mu }. \end{aligned}$$
As \(\mu >N+1\), the choice \(s_{\mathbf{n}}=\left( h_K^dh_H\right) ^{-1/N}\) gives the desired result, as the following computation makes explicit.
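Indeed, with this choice,

$$\begin{aligned} h_K^dh_H s_{\mathbf{n}}^N=1\quad \hbox {and}\quad h_K^{-d}h_H^{-1}s_{\mathbf{n}}^{-N}=1, \end{aligned}$$

so \(\widehat{\mathbf{n}} h_K^dh_H R_{\mathbf{n}}^1(x,y)=O(1)\), while \(\widehat{\mathbf{n}} h_K^dh_H R_{\mathbf{n}}^2(x,y)\le C\sum _{\Vert \mathbf{i}\Vert >s_{\mathbf{n}}}\Vert \mathbf{i}\Vert ^{N-\mu }\), the tail of a series that is summable since \(\mu >N+1\), and which therefore vanishes as \(s_{\mathbf{n}}\rightarrow \infty \). This completes the proof. \(\square \)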
Proof of Lemma 5
Remark that
$$\begin{aligned}&\sup _{y\in \mathcal C }|f_{\mathbf{n}}(x,y)-f_{X,Y}(x,y)|\\&\quad \le \sup _{y\in \mathcal C }|f_{\mathbf{n}}(x,y)-E f_{\mathbf{n}}(x,y)|+\sup _{y\in \mathcal C }|E f_{\mathbf{n}}(x,y)-f_{X,Y}(x,y)|. \end{aligned}$$
The asymptotic behavior of the bias term is standard, in the sense that it is not affected by the dependence structure of the data. We have
$$\begin{aligned}&\sup _{y \in \mathcal C }|Ef_{\mathbf{n}}(x,y)-f_{X,Y}(x,y)|\\&\quad =\sup _{y \in \mathcal C }\left| \frac{1}{h_K^d h_H}\int \limits _\mathbb{R ^{d+1}}K\left( \frac{x-u}{h_K}\right) H\left( \frac{y-v}{h_H}\right) f_{X,Y}(u,v)dudv -f_{X,Y}(x,y)\right| \\&\quad =\sup _{y \in \mathcal C }\left| \,\int \limits _\mathbb{R ^{d+1}}K(t)H(s)\left[ f_{X,Y} (x-h_Kt,y-h_Hs)-f_{X,Y}(x,y)\right] dtds\right| \\&\quad \le \int \limits _\mathbb{R ^{d+1}}K(t)H(s)\sup _{y \in \mathcal C }\left| f_{X,Y}(x-h_Kt,y-h_Hs)-f_{X,Y}(x,y)\right| dtds. \end{aligned}$$
This last term goes to zero by \(H_0\) and the Lebesgue dominated convergence theorem. The proof of the almost complete convergence of \( U_{1\mathbf{n}}(x)=\sup _{y\in \mathcal C }|f_{\mathbf{n}}(x,y)-E f_{\mathbf{n}}(x,y)|\) is similar to that of Theorem 3.3 of Carbon et al. (1997) or Lemma 3.2 of Dabo-Niang and Yao (2007). For the sake of completeness, we present it in full. Let us now introduce the spatial block decomposition used by Tran (1990) and Carbon et al. (1997). Without loss of generality, assume that \(n_i=2pq_i\) for \(1\le i\le N\). The random variables \(\varDelta _\mathbf{i}(x,y)\) can be grouped into \(2^Nq_1\ldots q_N\) cubic blocks of side \(p\). Denote
$$\begin{aligned} U(1,\mathbf{n},\mathbf{j})&= \sum _{\underset{k=1,\ldots ,N}{i_k=2j_kp+1}}^{(2j_k+1)p}\varDelta _\mathbf{i}(x,y),\\ U(2,\mathbf{n},\mathbf{j})&= \sum _{\underset{k=1,\ldots ,N-1}{i_k=2j_kp+1}}^{(2j_k+1)p} \sum _{i_N=(2j_N+1)p+1}^{2(j_N+1)p}\varDelta _\mathbf{i}(x,y),\\ U(3,\mathbf{n},\mathbf{j})&= \sum _{\underset{k=1,\ldots ,N-2}{i_k=2j_kp+1}}^{(2j_k+1)p} \sum _{i_{N-1}=(2j_{N-1}+1)p+1}^{2(j_{N-1}+1)p} \sum _{i_N=2j_Np+1}^{(2j_N+1)p}\varDelta _\mathbf{i}(x,y),\\ U(4,\mathbf{n},\mathbf{j})&= \sum _{\underset{k=1,\ldots ,N-2}{i_k=2j_kp+1}}^{(2j_k+1)p} \sum _{i_{N-1}=(2j_{N-1}+1)p+1}^{2(j_{N-1}+1)p} \sum _{i_N=(2j_N+1)p+1}^{2(j_N+1)p}\varDelta _\mathbf{i}(x,y), \end{aligned}$$
and so on. Note that
$$\begin{aligned} U(2^{N-1},\mathbf{n},\mathbf{j})&= \sum _{\underset{k=1,\ldots , N-1}{i_k=(2j_k+1)p+1}}^{2(j_k+1)p} \sum _{i_N=2j_Np+1}^{(2j_N+1)p}\varDelta _\mathbf{i}(x,y). \end{aligned}$$
Finally,
$$\begin{aligned} U(2^N,\mathbf{n},\mathbf{j})&= \sum _{\underset{k=1,\ldots , N}{i_k=(2j_k+1)p+1}}^{2(j_k+1)p} \varDelta _\mathbf{i}(x,y). \end{aligned}$$
For each integer \(1\le i\le 2^N\), define
$$\begin{aligned} T(\mathbf{n},i)=\sum _{\underset{k=1,\ldots ,N}{j_k=0}}^{q_k-1}U(i,\mathbf{n},\mathbf{j}). \end{aligned}$$
Clearly
$$\begin{aligned} S_\mathbf{n}(x,y)=\sum _{i=1}^{2^N}T(\mathbf{n},i). \end{aligned}$$
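For intuition, here is a sketch of this grouping in the illustrative case \(N=2\) (with \(0\)-based site indices; the proof works for general \(N\)). Each block of side \(p\) is labelled by the parities of its block coordinates, which is exactly the classification into the \(2^N\) sums \(T(\mathbf{n},1),\ldots ,T(\mathbf{n},2^N)\); two distinct blocks in the same class differ by at least \(2\) in some block coordinate, so their sites are at distance greater than \(p\), as Lemma 2 requires.

import numpy as np

def block_classes(n1, n2, p):
    # Sites (i1, i2) of an n1 x n2 grid, with n_k = 2 * p * q_k, are grouped
    # into blocks of side p; the parities of the block coordinates give the
    # class label in {0, 1, 2, 3}, one label per sum T(n, i).
    i1, i2 = np.meshgrid(np.arange(n1), np.arange(n2), indexing="ij")
    b1, b2 = i1 // p, i2 // p
    return 2 * (b1 % 2) + (b2 % 2)

classes = block_classes(n1=8, n2=8, p=2)   # here q_1 = q_2 = 2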
Observe that, for any \(\varepsilon >0\)
$$\begin{aligned} P\left( |S_\mathbf{n}(x,y)|>\varepsilon \right)&= P\left( \left| \sum _{i=1}^{2^N}T(\mathbf{n},i)\right| >\varepsilon \right) \nonumber \\&\le 2^NP\left( \left| T(\mathbf{n},1)\right| >\varepsilon /2^N\right) . \end{aligned}$$
(12)
We enumerate in an arbitrary way the \(\widehat{q}=q_1\ldots q_N\) terms \(U(1,\mathbf{n},\mathbf{j})\) of the sum \(T(\mathbf{n},1)\), which we denote by \(W_1,\ldots ,W_{\widehat{q}}\). Note that \(U(1,\mathbf{n},\mathbf{j})\) is measurable with respect to the \(\sigma \)-field generated by \(V_\mathbf{i}(x,y)\), with \(\mathbf{i}\) such that \(2j_kp+1\le i_k\le (2j_k+1)p,\ k=1,\ldots , N\).
These sets of sites are separated by a distance of at least \(p\). Since \(K\) and \(H\) are bounded, we have, for all \(i=1,\ldots ,\widehat{q}\),
$$\begin{aligned} |W_i|\le C(\widehat{\mathbf{n}} h_K^dh_H)^{-1}p^N\Vert K\Vert _{\infty }\Vert H\Vert _{\infty }. \end{aligned}$$
Lemma 2 ensures that there exist independent random variables \(W_1^*,\ldots ,W_{\widehat{q}}^*\) such that,
$$\begin{aligned} \sum _{i=1}^{\widehat{q}} E|W_i-W_i^*|\le C \widehat{q} (\widehat{\mathbf{n}} h_K^dh_H)^{-1}p^N\Vert K\Vert _{\infty }\Vert H\Vert _{\infty }\psi (\widehat{\mathbf{n}},p^N)\varphi (p). \end{aligned}$$
Markov’s inequality leads to
$$\begin{aligned} P\left( \sum _{i=1}^{\widehat{q}}|W_i-W_i^*|>\varepsilon /2^{N+1}\right) \le C2^{N+1}(\widehat{\mathbf{n}} h_K^dh_H)^{-1}p^N\widehat{q}\psi (\widehat{\mathbf{n}},p^N)\varepsilon ^{-1}\varphi (p). \end{aligned}$$
(13)
By Bernstein’s inequality, we have
$$\begin{aligned} P\left( |\sum _{i=1}^{\widehat{q}}W_i^*|\!>\!\varepsilon /2^{N+1}\right) \!\le \! 2\exp \left\{ \frac{-\varepsilon ^2/(2^{N+1})^2}{4\sum _{i=1}^{\widehat{q}}E W_i^{*2}\!+\!2C(\widehat{\mathbf{n}} h_K^dh_H)^{-1}p^N\varepsilon /2^{N+1}}\right\} . \end{aligned}$$
(14)
Combining (12), (13) and (14), we get
$$\begin{aligned}&P\left( |S_\mathbf{n}(x,y)|>\varepsilon \right) \\&\quad \le 2^NP\left( \sum _{i=1}^{\widehat{q}}|W_i-W_i^*|> \varepsilon /2^{N+1}\right) +2^NP \left( \left| \sum _{i=1}^{\widehat{q}}W_i^*\right| >\varepsilon /2^{N+1}\right) \\&\quad \le 2^{N+1}\exp \left\{ \frac{-\varepsilon ^2/(2^{N+1})^2}{4\sum _{i=1}^{\widehat{q}}E W_i^{*2}+2C(\widehat{\mathbf{n}} h_K^dh_H)^{-1}p^N \varepsilon /2^{N+1}}\right\} \\&\qquad + C2^{2N+1}\psi (\widehat{\mathbf{n}},p^N)(\widehat{\mathbf{n}} h_K^dh_H)^{-1}p^N\widehat{q}\varepsilon ^{-1}\varphi (p). \end{aligned}$$
Let \(\lambda >0\) and set
$$\begin{aligned} \varepsilon =\varepsilon _\mathbf{n}&= \left( \frac{\log \widehat{\mathbf{n}}}{\widehat{\mathbf{n}} h_K^dh_H}\right) ^{1/2}, \quad p =p_\mathbf{n}= \left( \frac{\widehat{\mathbf{n}} h_K^dh_H}{\log \widehat{\mathbf{n}}}\right) ^{1/2N}. \end{aligned}$$
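These two choices are tuned to one another: note that

$$\begin{aligned} p^N=\varepsilon _\mathbf{n}^{-1}\quad \hbox {and}\quad (\widehat{\mathbf{n}} h_K^dh_H)^{-1}p^N\varepsilon _\mathbf{n}=\frac{1}{\widehat{\mathbf{n}} h_K^dh_H}, \end{aligned}$$

so, with \(\varepsilon =\lambda \varepsilon _\mathbf{n}\), the second term in the denominator of (14) is of order \((\widehat{\mathbf{n}} h_K^dh_H)^{-1}\), the same order as the variance term \(\sum _{i=1}^{\widehat{q}}E W_i^{*2}\) (by Lemma 4); this matching is what produces the factor \(\log \widehat{\mathbf{n}}\) in the exponent of (15) below.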
Since \(W_i^*\) and \(W_i\) have the same distribution, we have
$$\begin{aligned} \sum _{i=1}^{\widehat{q}}E W_i^{*2}=\sum _{i=1}^{\widehat{q}}E W_i^2 \le I_\mathbf{n}(x,y)+ R_\mathbf{n}(x,y). \end{aligned}$$
Then, by Lemma 4, we get \( \sum \nolimits _{i=1}^{\widehat{q}}E W_i^{*2}=O\left( \frac{1}{\widehat{\mathbf{n}}h_K^dh_H}\right) \). Thus, for case (i) of Theorem 3, a simple computation shows that, for sufficiently large \(\mathbf{n}\),
$$\begin{aligned} P\left( |S_{\mathbf{n}}(x,y)|>\lambda \varepsilon _\mathbf{n}\right)&\le 2^{N+1}\exp \left\{ \frac{-\lambda ^2\log \widehat{\mathbf{n}}}{2^{2N+4}C+2^{N+2}C\lambda }\right\} \nonumber \\&+ C2^{N+1}p^N h_K^{-d}h_H^{-1}\lambda ^{-1}\varepsilon _\mathbf{n}^{-1}\varphi (p) \nonumber \\&\le C\widehat{\mathbf{n}}^{-b}+C2^{N+1}p^N h_K^{-d}h_H^{-1}\lambda ^{-1}\varepsilon _\mathbf{n}^{-1}\varphi (p), \end{aligned}$$
(15)
where \(b>0\) depends on \(\lambda \). For case (ii) of Theorem 3, we obtain
$$\begin{aligned} P\left( |S_{\mathbf{n}}(x,y)|>\lambda \varepsilon _\mathbf{n}\right)&\le 2^{N+1}\exp \left\{ \frac{-\lambda ^2\log \widehat{\mathbf{n}}}{2^{2N+4}C+2^{N+2}C\lambda }\right\} \nonumber \\&+C2^{N+1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}} h_K^{-d}h_H^{-1}\lambda ^{-1}\varepsilon _\mathbf{n}^{-1}\varphi (p)\nonumber \\&\le C\widehat{\mathbf{n}}^{-b}+C2^{N+1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}}h_K^{-d}h_H^{-1}\lambda ^{-1}\varepsilon _\mathbf{n}^{-1}\varphi (p). \end{aligned}$$
(16)
Then, (15) and (16) can be condensed into
$$\begin{aligned}&P\left( |f_{\mathbf{n}}(x,y)-E f_{\mathbf{n}}(x,y)|>\lambda \varepsilon _{\mathbf{n}}\right) \\&\quad \le \left\{ \begin{array}{l} C\widehat{\mathbf{n}}^{-b}+C\lambda ^{-1}h_K^{-d}h_H^{-1}\varepsilon _{\mathbf{n}}^{\frac{\mu -2N}{N}} \quad \hbox { under (i)},\\ C\widehat{\mathbf{n}}^{-b}+C\lambda ^{-1}h_K^{-d}h_H^{-1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}} \varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}\quad \hbox {under (ii)}. \end{array}\right. \end{aligned}$$
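This condensation uses only the polynomial mixing bound \(\varphi (t)\le Ct^{-\mu }\), already used in the proof of Lemma 4, together with \(p^N=\varepsilon _{\mathbf{n}}^{-1}\):

$$\begin{aligned} p^N\varepsilon _{\mathbf{n}}^{-1}\varphi (p)\le C\varepsilon _{\mathbf{n}}^{-1}p^{N-\mu }=C\varepsilon _{\mathbf{n}}^{-1}\varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}=C\varepsilon _{\mathbf{n}}^{\frac{\mu -2N}{N}}\quad \hbox {and}\quad \varepsilon _{\mathbf{n}}^{-1}\varphi (p)\le C\varepsilon _{\mathbf{n}}^{-1}\varepsilon _{\mathbf{n}}^{\frac{\mu }{N}}=C\varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}, \end{aligned}$$

which are the exponents appearing above.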
Now, set \(r_{\mathbf{n}}=h_K^{d}h_H^{2}\varepsilon _{\mathbf{n}}\). The compact \(\mathcal C \) can be covered by \(d_{\mathbf{n}}\) intervals \(I_k\) centered at \(y_k\) and of length \(r_{\mathbf{n}}\). We have \(d_{\mathbf{n}}\le C r_{\mathbf{n}}^{-1}\) and
$$\begin{aligned} \sup _{y\in C}|E f_{\mathbf{n}}(x,y)-f_{\mathbf{n}}(x,y)|&\le \max _k\sup _{y\in I_k }\left| f_{\mathbf{n}}(x,y)-f_{\mathbf{n}}(x,y_k)\right| \nonumber \\&+ \max _k\left| f_{\mathbf{n}}(x,y_k)-E f_{\mathbf{n}}(x,y_k)\right| \end{aligned}$$
(17)
$$\begin{aligned}&+ \max _k\sup _{y\in I_k}\left| E f_{\mathbf{n}}(x,y)-E f_{\mathbf{n}}(x,y_k)\right| .\qquad \quad \end{aligned}$$
(18)
Since the density \(H\) is Lipschitz and \(K\) is bounded, we have
$$\begin{aligned} \left| f_{\mathbf{n}}(x,y)-f_{\mathbf{n}}(x,y_k)\right| \le C h_K^{-d}h_H^{-2}|y-y_k| \le C h_K^{-d}h_H^{-2}r_{\mathbf{n}} = O(\varepsilon _{\mathbf{n}}), \end{aligned}$$
and
$$\begin{aligned} \left| E f_{\mathbf{n}}(x,y)-E f_{\mathbf{n}}(x,y_k)\right|&= O(\varepsilon _{\mathbf{n}}). \end{aligned}$$
Let us focus on \(\displaystyle U_{2\mathbf{n}}(x)=\max _k\left| f_{\mathbf{n}}(x,y_k)-Ef_{\mathbf{n}}(x,y_k)\right| \) and remark that
$$\begin{aligned} P\left( U_{2\mathbf{n}}(x)>\lambda \varepsilon _{\mathbf{n}}\right) \le d_{\mathbf{n}}\max _k P\left( |f_{\mathbf{n}}(x,y_k)-E f_{\mathbf{n}}(x,y_k)|>\lambda \varepsilon _{\mathbf{n}}\right) \!. \end{aligned}$$
To prove the convergence of \(U_{2\mathbf{n}}(x)\), it suffices to show that, for cases \((i)\) and \((ii)\) respectively,
$$\begin{aligned} \left\{ \begin{array}{l} Cd_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\widehat{\mathbf{n}}^{-b}\rightarrow 0\ \ \hbox { and}\ \ d_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\lambda ^{-1}h_K^{-d}h_H^{-1}\varepsilon _{\mathbf{n}}^{\frac{\mu -2N}{N}} \rightarrow 0,\\ Cd_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\widehat{\mathbf{n}}^{-b}\rightarrow 0\ \ \hbox { and}\ \ d_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\lambda ^{-1}h_K^{-d}h_H^{-1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}} \varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}\rightarrow 0. \end{array}\right. \end{aligned}$$
First, observe that condition \(H_6\) or \(H_7\) implies that \(\widehat{\mathbf{n}} h_K^d \rightarrow \infty \) and \(\widehat{\mathbf{n}}h_H \rightarrow \infty \). These limits imply, respectively, that there exists \(C>0\) such that \(\widehat{\mathbf{n}}> Ch_K^{-d}\) (resp. \(\widehat{\mathbf{n}}> Ch_H^{-1}\)) for \(\mathbf{n}\) large enough. Then, \(d_{\mathbf{n}}\le C\widehat{\mathbf{n}}^{5/2}(\log \widehat{\mathbf{n}})^{-1/2}\). We have
$$\begin{aligned} d_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\widehat{\mathbf{n}}^{-b}\le C\widehat{\mathbf{n}}^{7/2-b}(\log \widehat{\mathbf{n}})^{-1/2}u_\mathbf{n}, \end{aligned}$$
which goes to 0 if \(b>7/2\). On the one hand,
$$\begin{aligned} d_{\mathbf{n}}\widehat{\mathbf{n}} u_{\mathbf{n}}\lambda ^{-1}h_K^{-d}h_H^{-1}\varepsilon _{\mathbf{n}}^{\frac{\mu -2N}{N}}&\le C\left[ \widehat{\mathbf{n}} h_K^{\frac{d(\mu +N)}{\mu -5N}}h_H^{\frac{\mu +3N}{\mu -5N}} (\log \widehat{\mathbf{n}})^{\frac{3N-\mu }{\mu -5N}}u_{\mathbf{n}}^{\frac{-2N}{\mu -5N}}\right] ^{\frac{5N-\mu }{2N}}, \end{aligned}$$
this goes to \(0\) by \(H_6\). On the other hand, we have
$$\begin{aligned}&d_{\mathbf{n}}\widehat{\mathbf{n}} u_{\mathbf{n}}\lambda ^{-1}h_K^{-d}h_H^{-1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}} \varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}\\&\quad \le C\left[ \widehat{\mathbf{n}} h_K^{\frac{d(\mu +2N)}{\mu -N(4+2{\widetilde{\beta }})}} h_H^{\frac{\mu +4N}{\mu -N(4+2{\widetilde{\beta }})}} (\log \widehat{\mathbf{n}})^{\frac{2N-\mu }{\mu -N(4+2{\widetilde{\beta }})}}u_{\mathbf{n}}^{\frac{-2N}{\mu -N(4+2{\widetilde{\beta }})}} \right] ^{\frac{N(4+2\widetilde{\beta })-\mu }{2N}}, \end{aligned}$$
which goes to \(0\) by \(H_7\). This completes the proof of Lemma 5. \(\square \)
Proof of Lemma 6
By using the same arguments as in the proof of Lemma 5, we get
$$\begin{aligned} \left| E\widehat{f}(x)-f_X(x)\right| = \left| \int \limits _\mathbb{R ^{d}}K(s)\left[ f_{X}(x-sh_K)-f_{X}(x)\right] ds \right| . \end{aligned}$$
This last term tends to zero by the Lebesgue dominated convergence theorem. Let \(\displaystyle V_\mathbf{i}(x) = \frac{1}{\widehat{\mathbf{n}} h_K^d}K\left( \frac{x-X_\mathbf{i}}{h_K}\right) ,\, \varDelta _\mathbf{i}(x)=V_\mathbf{i}(x)-E V_\mathbf{i}(x)\). Then we have: \(\widehat{f}(x)-E\widehat{f}(x)=\sum \nolimits _{\mathbf{i}\in \mathcal I _\mathbf{n}}\varDelta _\mathbf{i}(x)=S_\mathbf{n}(x).\) Let \(\displaystyle I_\mathbf{n}(x) = \sum \nolimits _{\mathbf{i}\in \mathcal I _\mathbf{n}}E(\varDelta _\mathbf{i}(x))^2 \quad \hbox {and} \quad R_\mathbf{n}(x)=\sum \nolimits _{\underset{\mathbf{i},\mathbf{j}\in \mathcal I _\mathbf{n}}{\mathbf{i} \ne \mathbf{j}}}E\left| \varDelta _\mathbf{i}(x)\varDelta _\mathbf{j}(x)\right| .\) Lemma 2.2 of Tran (1990) gives that \(\displaystyle I_\mathbf{n}(x)+ R_\mathbf{n}(x)=O\left( \frac{1}{\widehat{\mathbf{n}} h_K^d}\right) \).
Consider \(\displaystyle \varepsilon =\varepsilon _\mathbf{n} = \left( \frac{\log \widehat{\mathbf{n}}}{\widehat{\mathbf{n}} h_K^d}\right) ^{1/2},\; p =p_\mathbf{n}= \left( \frac{\widehat{\mathbf{n}} h_K^d}{\log \widehat{\mathbf{n}}}\right) ^{1/2N}\) and use the same arguments as in the proof of Lemma 5 to get, for sufficiently large \(\mathbf{n}\),
$$\begin{aligned} P\left( |S_{\mathbf{n}}(x)|>\lambda \varepsilon _\mathbf{n}\right) \le \left\{ \begin{array}{c} C\widehat{\mathbf{n}}^{-b}+C2^{N+1}p^N h_K^{-d}\lambda ^{-1}\varepsilon _\mathbf{n}^{-1}\varphi (p) \quad \hbox { under (i)},\\ C\widehat{\mathbf{n}}^{-b}+C2^{N+1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}}h_K^{-d}\lambda ^{-1}\varepsilon _\mathbf{n}^{-1}\varphi (p)\quad \hbox {under (ii)} \end{array}\right. \end{aligned}$$
with \(b>0\). It suffices to show that for the case \((i)\) (resp. \((ii)\)) \( p^Nh_K^{-d}\varepsilon _\mathbf{n}^{-1}\varphi (p)\widehat{\mathbf{n}} u_{\mathbf{n}}\rightarrow 0\) (resp. \(\widehat{\mathbf{n}}^{{\widetilde{\beta }}} h_K^{-d}\varepsilon _\mathbf{n}^{-1}\varphi (p)\widehat{\mathbf{n}} u_{\mathbf{n}}\rightarrow 0)\). A simple computation shows respectively for \((i)\) and \((ii)\)
$$\begin{aligned} \left\{ \begin{array}{l} p^Nh_K^{-d}\varepsilon _\mathbf{n}^{-1}\varphi (p)\widehat{\mathbf{n}} u_{\mathbf{n}} \le C\left[ \widehat{\mathbf{n}} h_K^{\frac{d\mu }{\mu -4N}}(\log \widehat{\mathbf{n}})^{\frac{2N-\mu }{\mu -4N}}u_{\mathbf{n}}^{\frac{-2N}{\mu -4N}}\right] ^{\frac{4N-\mu }{2N}},\\ \widehat{\mathbf{n}}^{\widetilde{\beta }} h_K^{-d}\varepsilon _\mathbf{n}^{-1}\varphi (p)\widehat{\mathbf{n}} u_{\mathbf{n}} \le C\left[ \widehat{\mathbf{n}} h_K^{\frac{d(N+\mu )}{\mu -N(3+2{\widetilde{\beta }})}}(\log \widehat{\mathbf{n}})^{\frac{N-\mu }{\mu -N(3+2{\widetilde{\beta }})}}u_{\mathbf{n}}^{\frac{-2N}{\mu -N(3+2{\widetilde{\beta }})}}\right] ^{\frac{N(3+2{\widetilde{\beta }})-\mu }{2N}}. \end{array}\right. \end{aligned}$$
These go to \(0\) by \(H_6\) and \(H_7\) respectively. This completes the proof. \(\square \)
Proof of Lemma 3
We have
$$\begin{aligned} \sup _{y \in \mathcal C }|\widehat{f^x}(y)-f^x(y)|&\le \frac{1}{\widehat{f}(x)}\sup _{y \in \mathcal C }|f_{\mathbf{n}}(x,y)-f_{X,Y}(x,y)|\\&+\frac{1}{\widehat{f}(x)}\sup _{y \in \mathcal C }f^x(y)|\widehat{f}(x)-f_X(x)|. \end{aligned}$$
Lemmas 6 and 5 give respectively the almost complete convergence of \(\widehat{f}(x)\) to \(f_X(x)\) and of \(f_\mathbf{n}(x,y)\) to \(f_{X,Y}(x,y)\). Since \(f_X(x)>0\) by \(H_0\) and \(f^x(y)\) is bounded on the compact \(\mathcal C \) by \(H_1\), the proof is finished. \(\square \)
To prove Theorem 2, we need the following three lemmas.
Lemma 7
Under the conditions of Theorem 2, we have, for any compact subset \(\mathcal C \) of \(\mathbb R \),
$$\begin{aligned} \sup _{y\in \mathcal C }\left| \widehat{f^x}^{(2)}(y)-f^{x^{(2)}}(y)\right| \stackrel{a.co}{\rightarrow }0. \end{aligned}$$
Lemma 8
Under the conditions of Theorem 2, we have
$$\begin{aligned} \left\| \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right) \right\| _{2r} =O\left( h_K^{b_1}+h_H^{b_2}\right) . \end{aligned}$$
Lemma 9
If the conditions of Theorem 2 are satisfied, then
$$\begin{aligned} \left\| \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( H^{(1)}_{\mathbf{i}}(\theta )-E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})\right) \right\| _{2r} =O\left( \left( \frac{1}{\widehat{\mathbf{n}}h_K^d h_H^4}\right) ^{1/2}\right) , \end{aligned}$$
where
$$\begin{aligned}&K_{\mathbf{i}}=K\left( \frac{x-X_{\mathbf{i}}}{h_K}\right) ,\; H_{\mathbf{i}}(y)=h_H^{-1}H\left( \frac{y-Y_{\mathbf{i}}}{h_H}\right) ,\\&W_{\mathbf{ni}}=W_{\mathbf{ni}}(x)=\frac{K_{\mathbf{i}}}{\sum _{\mathbf{j}\in \mathcal I _{\mathbf{n}}}K_{\mathbf{j}}}. \end{aligned}$$
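In the same illustrative Python setting as before (Gaussian kernels and a flattened site array, assumptions of the sketch only), the weights \(W_{\mathbf{ni}}\) and the estimator \(\widehat{f^x}^{(1)}(\theta )=\sum _{\mathbf{i}}W_{\mathbf{ni}}H^{(1)}_{\mathbf{i}}(\theta )\) can be written as:

import numpy as np

def fx_derivative_estimate(x, theta, X, Y, h_K, h_H):
    # \widehat{f^x}^{(1)}(theta) = sum_i W_ni * H^{(1)}_i(theta), where
    # H^{(1)}_i(y) = h_H^{-2} * H'((y - Y_i) / h_H); H is Gaussian here.
    K = np.exp(-0.5 * np.sum(((x - X) / h_K) ** 2, axis=1))
    s = K.sum()
    W = K / s if s > 0 else np.zeros_like(K)        # convention 0/0 = 0
    u = (theta - Y) / h_H
    H1 = -u * np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)   # H'(u)
    return np.sum(W * H1) / h_H ** 2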
To prove Lemma 7, we proceed as for Lemma 3: we introduce the following notation and state the two technical lemmas below. Let
$$\begin{aligned} \displaystyle \widetilde{V}_\mathbf{i}(x,y)&= \frac{1}{\widehat{\mathbf{n}}h_K^d h_H^3}K\left( \frac{x-X_\mathbf{i}}{h_K}\right) H^{(2)}\left( \frac{y-Y_\mathbf{i}}{h_H}\right) ,\\ \widetilde{\varDelta }_\mathbf{i}(x,y)&= \widetilde{V}_\mathbf{i}(x,y)-E\widetilde{V}_\mathbf{i}(x,y)\;\quad \hbox {and}\\ \displaystyle \widetilde{S}_\mathbf{n}(x,y)&= \displaystyle \sum _{\mathbf{i}\in \mathcal I _\mathbf{n}}\widetilde{\varDelta }_\mathbf{i}(x,y)=f_\mathbf{n}^{(2)}(x,y)-Ef_\mathbf{n}^{(2)}(x,y)\\ \displaystyle \widetilde{I}_\mathbf{n}(x,y)&= \sum _{\mathbf{i}\in \mathcal I _\mathbf{n}}E(\widetilde{\varDelta }_\mathbf{i}(x,y))^2,\quad \widetilde{R}_\mathbf{n}(x,y)=\sum _{\underset{\mathbf{i},\mathbf{j}\in \mathcal I _\mathbf{n}}{\mathbf{i} \ne \mathbf{j}}}E\left| \widetilde{\varDelta }_\mathbf{i}(x,y) \widetilde{\varDelta }_\mathbf{j}(x,y)\right| . \end{aligned}$$
Lemma 10
If the conditions of Lemma 7 are satisfied, then
$$\begin{aligned} \displaystyle \widetilde{I}_\mathbf{n}(x,y)+\widetilde{R}_\mathbf{n}(x,y)=O\left( \frac{1}{\widehat{\mathbf{n}}h_K^dh_H^5}\right) . \end{aligned}$$
Lemma 11
Under the conditions of Lemma 7, we have
$$\begin{aligned} \displaystyle \sup _{y \in \mathcal C }|f_{\mathbf{n}}^{(2)}(x,y)-f_{X,Y}^{(2)}(x,y)|\rightarrow 0\quad \hbox {a.co}. \end{aligned}$$
Proof of Lemma 10
The proof is similar to that of Lemma 4; the same calculations give
$$\begin{aligned}&\widehat{\mathbf{n}}h_K^dh_H^5\sum _{{\mathbf{i}}\in \mathcal I _ {\mathbf{n}}}E \widetilde{V}_\mathbf{i}^2(x,y)\\&\quad = h_K^{-d}h_H^{-1}\int \limits _\mathbb{R ^{d+1}} K^2\left( \frac{x-z}{h_K}\right) \left( H^{(2)}\right) ^2 \left( \frac{y-v}{h_H}\right) f_{X,Y}(z,v)dzdv\\&\quad =\int \limits _\mathbb{R ^{d+1}}K^2\left( z\right) \left( H^{(2)}\right) ^2\left( v\right) f_{X,Y}(x-h_Kz,y-h_Hv)dzdv. \end{aligned}$$
By assumptions \(H_0\) and \(H_5\) and the Lebesgue dominated convergence theorem, this last integral converges to \(f_{X,Y}(x,y)\int _\mathbb{R ^{d+1}}K^2\left( z\right) \left( H^{(2)}\right) ^2\left( v\right) dzdv\). Next, notice that
$$\begin{aligned} \widehat{\mathbf{n}} h_K^d h_H^5\sum _{{\mathbf{i}}\in \mathcal I _{\mathbf{n}}}E^2\widetilde{V}_{\mathbf{i}}(x,y) \!=\!h_K^{-d}h_H^{-1}\left( \,\,\int \limits _\mathbb{R ^{d+1}}K \left( \frac{x\!-\!z}{h_K}\right) H^{(2)}\left( \frac{y\!-\!v}{h_H} \right) f_{X,Y}(z,v)dzdv\right) ^2. \end{aligned}$$
By the usual change of variables, we obtain
$$\begin{aligned} \widehat{\mathbf{n}} h_K^d h_H^5\sum _{\mathbf{i}\in \mathcal I _ {\mathbf{n}}}E^2 \widetilde{V}_{\mathbf{i}}(x,y) \!=\! h_K^dh_H\left( \,\,\int \limits _\mathbb{R ^{d+1}}K(z)H^{(2)} (v)f_{X,Y}(x\!-\!h_Kz,y\!-\!h_Hv)dzdv \right) ^2. \end{aligned}$$
This last term tends to \(0\) by \(H_0\) and the Lebesgue dominated convergence theorem.
To complete the proof, it suffices to treat \(\widetilde{R}_\mathbf{n}(x,y)\) by conditioning on \(X_\mathbf{i}\) and \((X_\mathbf{i},X_\mathbf{j})\), as at the end of the proof of Lemma 4. \(\square \)
Proof of Lemma 11
Remark that
$$\begin{aligned} \sup _{y\in \mathcal C }|f_{\mathbf{n}}^{(2)}(x,y)-f_{X,Y}^{(2)}(x,y)|&\le \sup _{y\in \mathcal C }|f_{\mathbf{n}}^{(2)}(x,y)-E f_{\mathbf{n}}^{(2)}(x,y)|\\&+\sup _{y\in \mathcal C }|E f_{\mathbf{n}}^{(2)}(x,y)-f_{X,Y}^{(2)}(x,y)|. \end{aligned}$$
Using two successive integrations by parts and a classical change of variables, we see that the bias term satisfies
$$\begin{aligned}&\sup _{y\in \mathcal C }\left| E f_{\mathbf{n}}^{(2)}(x,y)-f_{X,Y}^{(2)}(x,y)\right| \\&\quad =\sup _{y\in \mathcal C }\left| \frac{1}{h_K^d h_H^3}\int \limits _\mathbb{R ^{d+1}}K\left( \frac{x-u}{h_K}\right) H^{(2)}\left( \frac{y-v}{h_H}\right) f_{X,Y}(u,v)dudv -f_{X,Y}^{(2)}(x,y)\right| \\&\quad \le \int \limits _\mathbb{R ^{d+1}}K(t)H(s)\sup _{y\in \mathcal C }\left| f_{X,Y}^{(2)}(x-h_Kt,y-h_Hs)- f_{X,Y}^{(2)}(x,y)\right| dtds. \end{aligned}$$
This last term goes to zero by \(H_1\) and the Lebesgue dominated convergence theorem. Thus, it remains to show the almost complete convergence of \(V_{1\mathbf{n}}(x)=\sup _{y\in \mathcal C }|f_{\mathbf{n}}^{(2)}(x,y)-E f_{\mathbf{n}}^{(2)}(x,y)|\) by following the same lines as in the proof of the almost complete convergence of \(U_{1\mathbf{n}}(x).\) More precisely, setting here \(\displaystyle \varepsilon _\mathbf{n} = \left( \frac{\log \widehat{\mathbf{n}}}{\widehat{\mathbf{n}} h_K^dh_H^5}\right) ^{1/2}\) and \( r_\mathbf{n}=h_K^dh_H^4\varepsilon _\mathbf{n}\), we obtain, for \(\lambda >0\), the existence of \(b>0\) such that for \(\mathbf{n}\) large enough,
$$\begin{aligned} P\left( |f_{\mathbf{n}}^{(2)}(x,y)\!-\!E f_{\mathbf{n}}^{(2)}(x,y)|\!>\!\lambda \varepsilon _{\mathbf{n}}\right) \le \left\{ \begin{array}{l} C\widehat{\mathbf{n}}^{-b}\!+\!C\lambda ^{-1}h_K^{-d}h_H^{-3}\varepsilon _{\mathbf{n}}^{\frac{\mu -2N}{N}} \quad \hbox { under (i)},\\ C\widehat{\mathbf{n}}^{-b}\!+\!C\lambda ^{-1}h_K^{-d}h_H^{-3}\widehat{\mathbf{n}}^{\widetilde{\beta }} \varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}\quad \hbox {under (ii)}. \end{array}\right. \end{aligned}$$
Thus, it suffices to show that, for cases \((i)\) and \((ii)\) respectively,
$$\begin{aligned} \left\{ \begin{array}{l} Cd_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\widehat{\mathbf{n}}^{-b}\rightarrow 0\ \ \hbox { and}\ \ d_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\lambda ^{-1}h_K^{-d}h_H^{-3}\varepsilon _{\mathbf{n}}^{\frac{\mu -2N}{N}} \rightarrow 0,\\ Cd_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\widehat{\mathbf{n}}^{-b}\rightarrow 0\ \ \hbox { and}\ \ d_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\lambda ^{-1}h_K^{-d}h_H^{-3}\widehat{\mathbf{n}}^{\widetilde{\beta }} \varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}\rightarrow 0. \end{array}\right. \end{aligned}$$
where \(d_{\mathbf{n}}\le C r_{\mathbf{n}}^{-1}\). Then, for \(\mathbf{n}\) large enough, there exists \(C>0\) such that \(d_{\mathbf{n}}\le C\widehat{\mathbf{n}}^{5/2}(\log \widehat{\mathbf{n}})^{-1/2}\), and
$$\begin{aligned} d_{\mathbf{n}}\widehat{\mathbf{n}}u_\mathbf{n}\widehat{\mathbf{n}}^{-b}\le C\widehat{\mathbf{n}}^{7/2-b}(\log \widehat{\mathbf{n}})^{-1/2}u_\mathbf{n}. \end{aligned}$$
This goes to 0 if \(b>7/2\). On the one hand, we have
$$\begin{aligned} d_{\mathbf{n}}\widehat{\mathbf{n}} u_{\mathbf{n}}\lambda ^{-1}h_K^{-d}h_H^{-3}\varepsilon _{\mathbf{n}}^{\frac{\mu -2N}{N}}&\le C\left[ \widehat{\mathbf{n}} h_K^{\frac{d(\mu +N)}{\mu -5N}}h_H^{\frac{5\mu -N}{\mu -5N}}(\log \widehat{\mathbf{n}})^{\frac{3N-\mu }{\mu -5N}}u_{\mathbf{n}}^{\frac{-2N}{\mu -5N}}\right] ^{\frac{5N-\mu }{2N}}. \end{aligned}$$
This goes to \(0\) by \(H_8\). On the other hand, we have
$$\begin{aligned}&d_{\mathbf{n}}\widehat{\mathbf{n}} u_{\mathbf{n}}\lambda ^{-1}h_K^{-d}h_H^{-3}\widehat{\mathbf{n}}^{\widetilde{\beta }} \varepsilon _{\mathbf{n}}^{\frac{\mu -N}{N}}\\&\quad \le C\left[ \widehat{\mathbf{n}} h_K^{\frac{d(\mu +2N)}{\mu -N(4+2{\widetilde{\beta }})}} h_H^{\frac{5\mu +4N}{\mu -N(4+2{\widetilde{\beta }})}} (\log \widehat{\mathbf{n}})^{\frac{2N-\mu }{\mu -N(4+2{\widetilde{\beta }})}}u_{\mathbf{n}}^{\frac{-2N}{\mu -N(4+2{\widetilde{\beta }})}}\right] ^{\frac{N(4+2\widetilde{\beta })-\mu }{2N}}, \end{aligned}$$
which goes to \(0\) by \(H_9\). This finishes the proof. \(\square \)
Proof of Lemma 7
Analogously to the proof of Lemma 3, we have
$$\begin{aligned} \sup _{y \in \mathcal C }|\widehat{f}^{x^{(2)}}(y)-f^{x^{(2)}}(y)|&\le \frac{1}{\widehat{f}(x)}\sup _{y \in \mathcal C }|f_{\mathbf{n}}^{(2)}(x,y)-f_{X,Y}^{(2)}(x,y)|\\&+\frac{1}{\widehat{f}(x)}\sup _{y \in \mathcal C }f^{x^{(2)}}(y)|\widehat{f}(x)-f_X(x)|. \end{aligned}$$
Lemmas 6 and 11 give respectively the almost complete convergence of \(\widehat{f}(x)\) to \(f_X(x)\) and of \(f_{\mathbf{n}}^{(2)}(x,y)\) to \(f_{X,Y}^{(2)}(x,y)\). By \(H_0\) and \(H_1\) respectively, \(f_X(x)>0\) and \(f^{x^{(2)}}(y)\) is bounded on the compact \(\mathcal C \). This ends the proof. \(\square \)
Proof of Lemma 8
On the one hand, the definition of the \(W_\mathbf{ni}\) and assumption \(H_4\) allow us to write \(\displaystyle W_\mathbf{ni}=W_\mathbf{ni}1\!\!1_{\Vert X_\mathbf{i}-x\Vert \le h_K}\), so
$$\begin{aligned}&\left\| \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( E(H^{(1)}_{\mathbf{i}}(\theta ) |X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right) \right\| _{2r}\nonumber \\&\quad = \left\| \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}1\!\!1_{\Vert X_\mathbf{i}-x\Vert \le h_K}\left( E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right) \right\| _{2r}. \end{aligned}$$
(19)
On the other hand, for all \(\mathbf{i}\in \mathcal I _{\mathbf{n}} \), we have
$$\begin{aligned}&\left| E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right| \\&\quad = \left| \int \limits _\mathbb{R }h_H^{-2}H^{(1)}\left( \frac{\theta -z}{h_H}\right) f^{X_\mathbf{i}}(z)dz-f^{x^{(1)}}(\theta )\right| . \end{aligned}$$
Then, by using an integration by parts followed by the usual change of variables, we get
$$\begin{aligned}&\left| E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right| \\&\quad = \left| h_H^{-1}\int \limits _\mathbb{R }H\left( \frac{\theta -z}{h_H}\right) f^{X_\mathbf{i}^{(1)}}(z)dz-f^{x^{(1)}}(\theta )\right| \\&\quad =\left| \int \limits _\mathbb{R }H(u)\left[ f^{X_\mathbf{i}^{(1)}}(\theta -h_H u)-f^{x^{(1)}}(\theta )\right] du\right| \\&\quad \le \int \limits _\mathbb{R }H(u)\left| f^{X_\mathbf{i}^{(1)}}(\theta -h_H u)-f^{x^{(1)}}(\theta )\right| du. \end{aligned}$$
Hence
$$\begin{aligned}&1\!\!1_{\Vert X_\mathbf{i}-x\Vert \le h_K}\left| E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right| \\&\quad \le 1\!\!1_{\Vert X_\mathbf{i}-x\Vert \le h_K}\int \limits _\mathbb{R }H(u)\left| f^{X_\mathbf{i}^{(1)}}(\theta -h_H u)-f^{x^{(1)}}(\theta )\right| du. \end{aligned}$$
Then assumption \(H_2\) gives
$$\begin{aligned} 1\!\!1_{\Vert X_\mathbf{i}-x\Vert \le h_K}\left| E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right| \le C\left( h_K^{b_1}+h_H^{b_2}\right) . \end{aligned}$$
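This is the natural outcome if, as the exponents suggest, \(H_2\) is a Hölder-type condition of the form \(\left| f^{z^{(1)}}(v)-f^{x^{(1)}}(\theta )\right| \le C\left( \Vert z-x\Vert ^{b_1}+|v-\theta |^{b_2}\right) \): on the event \(\{\Vert X_\mathbf{i}-x\Vert \le h_K\}\), the integrand above is then bounded by \(C\left( h_K^{b_1}+h_H^{b_2}|u|^{b_2}\right) \), and integrating against \(H(u)\), with \(\int _\mathbb{R }H(u)|u|^{b_2}du<\infty \) (which holds, for instance, for compactly supported \(H\)), gives the stated rate.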
Since \(\sum \nolimits _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}=1\) or \(0\), this bound, in conjunction with (19), finishes the proof. \(\square \)
Before establishing Lemma 9, we introduce some notation and state the following two lemmas. Let \(\xi _\mathbf{i}=K_{\mathbf{i}}\digamma _{\mathbf{i}}\), with \(\digamma _{\mathbf{i}}=H^{(1)}\left( \frac{\theta -Y_\mathbf{i}}{h_H}\right) -E\left( H^{(1)}\left( \frac{\theta -Y_\mathbf{i}}{h_H}\right) |X_\mathbf{i}\right) \).
Lemma 12
Under the conditions of Lemma 9, we have
$$\begin{aligned} \displaystyle E\left[ \left( \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}\xi _{\mathbf{i}}\right) ^{2r}\right] \le C\left( \widehat{\mathbf{n}}h_K^d\right) ^r. \end{aligned}$$
Lemma 13
If conditions of Lemma 9 are satisfied, then we have
$$\begin{aligned} \left( P\left[ \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le \widehat{\mathbf{n}}u/2\right] \right) ^{1/2r}=O\left( \left( \frac{1}{\widehat{\mathbf{n}}h_K^d}\right) ^{1/2}\right) , \end{aligned}$$
where \(u=EK_{\mathbf{i}}\) (see the proof of Lemma 9 below).
Proof of Lemma 12
The proof closely follows that of Lemma 2.2 of Gao et al. (2008); that is why we use the notation \(\xi _\mathbf{i}\) introduced there. Because \(\digamma _{\mathbf{i}}\) is bounded, the moment bounds obtained take a simpler form than in Gao et al. (2008). To start, note that
$$\begin{aligned} E\left[ \left( \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}\xi _{\mathbf{i}}\right) ^{2r}\right] =\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}E\left[ \xi _{\mathbf{i}}^{2r}\right] +\sum _{s=1}^{2r-1}\sum _{\nu _0+\nu _1+ \ldots +\nu _s=2r}V_s(\nu _0,\nu _1,\ldots ,\nu _s),\qquad \quad \end{aligned}$$
(20)
where \(\sum _{\nu _0+\nu _1+\ldots +\nu _s=2r}\) is the summation over \((\nu _0,\nu _1,\ldots ,\nu _s)\) with positive integer components satisfying \(\nu _0+\nu _1+\cdots +\nu _s=2r\) and
$$\begin{aligned} V_s(\nu _0,\nu _1,\ldots ,\nu _s)=\sum _{\mathbf{i}_0\ne \mathbf{i}_1\ne \ldots \ne \mathbf{i}_s}E\left[ \xi _{\mathbf{i}_0}^{\nu _0} \xi _{\mathbf{i}_1}^{\nu _1}\ldots \xi _{\mathbf{i}_s}^{\nu _s}\right] , \end{aligned}$$
where the summation \(\sum _{\mathbf{i}_0\ne \mathbf{i}_1\ne \ldots \ne \mathbf{i}_s}\) is over indexes \((\mathbf{i}_0,\mathbf{i}_1,\ldots ,\mathbf{i}_s)\) with each index \(\mathbf{i}_j\) taking value in \(\mathcal I _{\mathbf{n}}\) and satisfying \(\mathbf{i}_j\ne \mathbf{i}_l\) for any \(j\ne l, 0\le j,l\le s\).
By stationarity and \(H_4\), we have
$$\begin{aligned} \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}E\left( \xi _{\mathbf{i}}\right) ^{2r} \le \widehat{\mathbf{n}}E\left( K_{\mathbf{i}}\left| \digamma _{\mathbf{i}}\right| \right) ^{2r} \le C\widehat{\mathbf{n}}E\left( K_{\mathbf{i}}\right) ^{2r}\le C\widehat{\mathbf{n}} h_K^d, \end{aligned}$$
(21)
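In this chain, the second inequality uses the boundedness of \(\digamma _{\mathbf{i}}\), which follows from \(H_5\) since \(\left| \digamma _{\mathbf{i}}\right| \le 2\Vert H^{(1)}\Vert _\infty \), and the last one uses the boundedness of \(K\) together with \(EK_{\mathbf{i}}=O(h_K^d)\) (see (23)).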
To control the term \(V_s(\nu _0,\nu _1,\ldots ,\nu _s)\), we need to prove, for any positive integers \( \nu _1,\nu _2,\ldots ,\nu _s \), the following results:
(i) \(\displaystyle E\left| \xi _{\mathbf{i}_1}^{\nu _1}\xi _{\mathbf{i}_2}^{\nu _2}\ldots \xi _{\mathbf{i}_s}^{\nu _s}\right| \le h_K^{ds}\);
(ii) \(\displaystyle V_s(\nu _0,\nu _1,\ldots ,\nu _s)=O\left( \left( \widehat{\mathbf{n}}h_K^d\right) ^{s+1}\right) \), for \(s=1,2,\ldots ,r-1\) and \(r>1\);
(iii) \(\displaystyle V_s(\nu _0,\nu _1,\ldots ,\nu _s)=O\left( \left( \widehat{\mathbf{n}}h_K^d\right) ^r\right) \), for \(r\le s\le 2r-1\).
The proofs of (i), (ii) and (iii) are omitted because the techniques used are similar to those in Gao et al. (2008). Now, remark that \((ii)\) and \((iii)\) imply that
$$\begin{aligned} \sum _{s=1}^{2r-1}\sum _{\nu _0+\nu _1+\ldots +\nu _s=2r}V_s(\nu _0,\nu _1,\ldots ,\nu _s)\le C\left( \widehat{\mathbf{n}}h_K^d\right) ^r. \end{aligned}$$
(22)
Then, Lemma 12 follows from (20), (21) and (22). \(\square \)
Proof of Lemma 13
Using (23), we have
$$\begin{aligned} P\left[ \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le \widehat{\mathbf{n}}u/2\right]&\le P\left[ \left| \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}\left( K_\mathbf{i}-EK_\mathbf{i}\right) \right| \ge \widehat{\mathbf{n}}u/2\right] \\&\le P\left[ \left| \frac{1}{\widehat{\mathbf{n}}h_K^d}\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}\left( K_\mathbf{i}-EK_\mathbf{i}\right) \right| \ge C\right] . \end{aligned}$$
Then, for large \(\mathbf{n}\),
$$\begin{aligned} P\left[ \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le \widehat{\mathbf{n}}u/2\right] \le P\left[ \left| S_\mathbf{n}(x)\right| >\lambda \varepsilon _{\mathbf{n}}\right] , \end{aligned}$$
where \(S_\mathbf{n}(x)=\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}\varDelta _\mathbf{i}(x)=\widehat{f}(x)-E\widehat{f}(x)\), and where \(\lambda \), \(p=p_\mathbf{n}=\left( \dfrac{\widehat{\mathbf{n}}h_K^d}{\log \widehat{\mathbf{n}}}\right) ^{1/2N}\) and \(\varepsilon _{\mathbf{n}}=\sqrt{\frac{\log \widehat{\mathbf{n}}}{\widehat{\mathbf{n}} h_K^d}}\) are the same as in the proof of Lemma 6. For sufficiently large \(\mathbf{n}\) (see the proof of Lemma 6), there exists \(b>0\) such that, for cases \((i)\) and \((ii)\) respectively,
$$\begin{aligned} P\left( |S_{\mathbf{n}}(x)|>\lambda \varepsilon _{\mathbf{n}}\right)&\le C\widehat{\mathbf{n}}^{-b}+C2^{N+1}p^N h_K^{-d}\lambda ^{-1}\varepsilon _{\mathbf{n}}^{-1}\varphi (p),\\ P\left( |S_{\mathbf{n}}(x)|>\lambda \varepsilon _{\mathbf{n}}\right)&\le C\widehat{\mathbf{n}}^{-b}+C2^{N+1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}}h_K^{-d}\lambda ^{-1}\varepsilon _{\mathbf{n}}^{-1}\varphi (p). \end{aligned}$$
Note that \(\displaystyle \widehat{\mathbf{n}}^{(r-b)}h_K^{dr}\) tends to zero, for \(b>r\). For the case of (i), a simple computation gives
$$\begin{aligned} \left( \widehat{\mathbf{n}}h_K^d\right) ^rC2^{N+1}p^N h_K^{-d}\lambda ^{-1} \varepsilon _{\mathbf{n}}^{-1}\varphi (p) \le \left[ \widehat{\mathbf{n}}h_K^{\frac{d(\mu -2Nr)}{\mu -2N(r+1)}}(\log \widehat{\mathbf{n}})^{\frac{2N-\mu }{\mu -2N(r+1)}}\right] ^\frac{2N(r+1)-\mu }{2N}, \end{aligned}$$
which tends to zero by \(H_8\). Similarly, for the case of (ii), we obtain
$$\begin{aligned} \left( \widehat{\mathbf{n}}h_K^d\right) ^rC2^{N+1}\widehat{\mathbf{n}}^{{\widetilde{\beta }}}h_K^{-d}\lambda ^{-1}\varepsilon _{\mathbf{n}}^{-1}\varphi (p)\le \left[ \widehat{\mathbf{n}}h_K^\frac{d(\mu -2Nr+N)}{\mu -N(2r+2\tilde{\beta }+1)}(\log \widehat{\mathbf{n}})^{\frac{N-\mu }{\mu -N(2r+2\tilde{\beta }+1)}}\right] ^\frac{N(2r+2\tilde{\beta }+1)-\mu }{2N}, \end{aligned}$$
which goes to zero under \(H_9\). Thus, in the two cases of mixing, we have
$$\begin{aligned} P\left[ \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le \widehat{\mathbf{n}}u/2\right] =O\left( \left( \frac{1}{\widehat{\mathbf{n}}h_K^d}\right) ^r\right) . \end{aligned}$$
This completes the proof. \(\square \)
Proof of Lemma 9
Let us set
$$\begin{aligned} G=\displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( H^{(1)}_{\mathbf{i}}(\theta ) -E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})\right) . \end{aligned}$$
We can write
$$\begin{aligned} G=\frac{ g_\mathbf{n}(x)}{\widehat{f}(x)}, \end{aligned}$$
with
$$\begin{aligned} \displaystyle g_\mathbf{n}(x)=\frac{1}{\widehat{\mathbf{n}}h_K^d}\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_{\mathbf{i}} \left( H_{\mathbf{i}}^{(1)}(\theta )-E(H_{\mathbf{i}}^{(1)}(\theta )|X_{\mathbf{i}})\right) \end{aligned}$$
and
$$\begin{aligned} \displaystyle \widehat{f}(x)=\frac{1}{\widehat{\mathbf{n}}h_K^d}\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_{\mathbf{i}}. \end{aligned}$$
Since, by \(H_5\), \(H^{(1)}\) is bounded, we have
$$\begin{aligned} \forall \mathbf{i} :0\le \left| H_{\mathbf{i}}^{(1)}(\theta )-E(H_{\mathbf{i}}^{(1)}(\theta )|X_{\mathbf{i}})\right| \le C h_H^{-2}. \end{aligned}$$
Thus, \(|G|\le C\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}} h_H^{-2}=C h_H^{-2}\). Then, we have
$$\begin{aligned} |G|&= |G|1\!\!1_{\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}>c}+|G|1\!\!1_{\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le c}\\&\le \frac{|g_\mathbf{n}(x)|}{\widehat{f}(x)}1\!\!1_{\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}>c}+C h_H^{-2}1\!\!1_{\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le c} \end{aligned}$$
where \(c\) is a given real number.
Let us take \(c=\widehat{\mathbf{n}}u/2\), with \(\displaystyle u=EK_{\mathbf{i}}=\int _\mathbb{R ^d}K\left( \frac{x-t}{h_K}\right) f_X(t)dt\). We get by \(H_4\):
$$\begin{aligned} \int C_1\mathbb I _{[0,1]}\left( \left\| \frac{x-t}{h_K}\right\| \right) f_X(t)dt\le u\le \int C_2\mathbb I _{[0,1]}\left( \left\| \frac{x-t}{h_K}\right\| \right) f_X(t)dt.\qquad \qquad \end{aligned}$$
By the usual change of variables \(\displaystyle s=\frac{x-t}{h_K}\), we obtain:
$$\begin{aligned} h_K^d\int C_1\mathbb I _{[0,1]}\left( \Vert s\Vert \right) f_X\left( x-sh_K\right) ds \le u \le h_K^d\int C_2\mathbb I _{[0,1]}\left( \Vert s\Vert \right) f_X\left( x-sh_K\right) ds. \end{aligned}$$
Since, by \(H_0\), \(f_X\) is bounded, there exist two constants \(\delta \) and \(\delta '\) such that
$$\begin{aligned} \delta h_K^d\le u\le \delta 'h_K^d. \end{aligned}$$
(23)
Therefore, if \(\sum \nolimits _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}>\widehat{\mathbf{n}}u/2\), then \(\displaystyle \widehat{f}(x)> u/(2h_K^d)\ge C\) and
$$\begin{aligned} \displaystyle \frac{|g_\mathbf{n}(x)|}{\widehat{f}(x)}< C|g_\mathbf{n}(x)|. \end{aligned}$$
Thus, we have
$$\begin{aligned} |G| \le C |g_\mathbf{n}(x)|+C h_H^{-2}1\!\!1_{\sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le \widehat{\mathbf{n}}u/2}, \end{aligned}$$
so
$$\begin{aligned} \left\| G\right\| _{2r}\le C \left\| g_\mathbf{n}(x)\right\| _{2r}+C h_H^{-2}\left( P\left[ \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}K_\mathbf{i}\le \widehat{\mathbf{n}}u/2\right] \right) ^{1/2r}. \end{aligned}$$
(24)
Let us focus on the first term on the right-hand side of the above inequality. We can write
$$\begin{aligned} \left\| g_\mathbf{n}(x)\right\| _{2r}=\frac{1}{\widehat{\mathbf{n}}h_K^d h_H^2}\left( E\left[ \left( \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}\xi _{\mathbf{i}}\right) ^{2r}\right] \right) ^{1/2r}, \end{aligned}$$
where \(\xi _\mathbf{i}=K_{\mathbf{i}}\digamma _{\mathbf{i}}\), with \(\digamma _{\mathbf{i}}=H^{(1)}\left( \frac{\theta -Y_\mathbf{i}}{h_H}\right) -E\left( H^{(1)}\left( \frac{\theta -Y_\mathbf{i}}{h_H}\right) |X_\mathbf{i}\right) \). Therefore, the proof of Lemma 9 follows from inequality (24) and Lemmas 12 and 13. \(\square \)
1.2 Proofs of main results
Proof of Theorem 1
The assumption \(H_1\) and condition (1) ensure that the conditional density \(f^x\) is continuous and strictly increasing on \((\theta -\zeta ,\theta )\). Thus, the inverse function \({f^x}^{-1}\) is also continuous and strictly increasing. In particular, the continuity of \({f^x}^{-1}\) at \( f^x(\theta )\) gives, for any \(\epsilon >0\):
$$\begin{aligned} \exists \eta _1(\epsilon )>0, \forall y\in (\theta -\zeta ,\theta ),\;\left| f^x(y)-f^x(\theta )\right| \le \eta _1(\epsilon )\Rightarrow \left| y-\theta \right| \le \epsilon . \end{aligned}$$
Similarly, we also have
$$\begin{aligned} \exists \eta _2(\epsilon )>0, \forall y\in (\theta ,\theta +\zeta ),\;\left| f^x(y)-f^x(\theta )\right| \le \eta _2(\epsilon )\Rightarrow \left| y-\theta \right| \le \epsilon . \end{aligned}$$
Since, by construction, \(\widehat{\theta }\in (\theta -\zeta ,\theta +\zeta )\), we get
$$\begin{aligned} \exists \eta (\epsilon )>0\;,\;\left| f^x(\widehat{\theta })-f^x(\theta )\right| \le \eta (\epsilon )\Rightarrow \left| \widehat{\theta }-\theta \right| \le \epsilon . \end{aligned}$$
Finally, we get
$$\begin{aligned} \exists \eta (\epsilon )>0\;,\;P\left( \left| \widehat{\theta }-\theta \right| >\epsilon \right) \le P\left( \left| f^x(\widehat{\theta })-f^x(\theta )\right| > \eta (\epsilon )\right) . \end{aligned}$$
It follows directly from the definitions of \(\theta \) and \(\widehat{\theta }\) that
$$\begin{aligned} \left| f^x(\widehat{\theta })-f^x(\theta )\right|&= f^x(\theta )-f^x(\widehat{\theta })\\&= \left( f^x(\theta )-\widehat{f^x}(\theta )\right) +\left( \widehat{f^x}(\theta )-f^x(\widehat{\theta })\right) \\&\le \left( f^x(\theta )-\widehat{f^x}(\theta )\right) +\left( \widehat{f^x} (\widehat{\theta })-f^x(\widehat{\theta })\right) \\&\le 2\sup _{y\in (\theta -\zeta ,\theta +\zeta )}\left| \widehat{f^x}(y)-f^x(y)\right| . \end{aligned}$$
Thus, the uniform almost complete convergence of the kernel conditional density estimate over the compact \(\varGamma =(\theta -\zeta ,\theta +\zeta )\) suffices to end the proof; this is provided by Lemma 3 above. \(\square \)
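To make the construction of \(\widehat{\theta }\) concrete, here is how it looks in the Python sketch started earlier (the finite grid over \(\varGamma \) is an assumption of the sketch; it reuses conditional_density_estimate from above):

import numpy as np

def mode_estimate(x, X, Y, h_K, h_H, theta_lo, theta_hi, n_grid=512):
    # \widehat{\theta} maximizes \widehat{f^x}(y) over (theta - zeta, theta + zeta);
    # the argmax is approximated on a finite grid of that interval.
    grid = np.linspace(theta_lo, theta_hi, n_grid)
    vals = [conditional_density_estimate(x, yy, X, Y, h_K, h_H) for yy in grid]
    return grid[int(np.argmax(vals))]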
Proof of Theorem 2
By a Taylor expansion of \(\widehat{f^x}^{(1)}(\cdot )\) in a neighborhood of \(\theta \), we have
$$\begin{aligned} \widehat{f^x}^{(1)}(\widehat{\theta })-\widehat{f^x}^{(1)} (\theta )=\left( \widehat{\theta }-\theta \right) \widehat{f^x}^{(2)} (\theta ^*) \end{aligned}$$
where \(\theta ^*\) lies between \(\widehat{\theta }\) and \(\theta \). Thus
$$\begin{aligned} \theta -\widehat{\theta }&= \frac{\widehat{f^x}^{(1)}(\theta )-f^{x^{(1)}}(\theta )}{\widehat{f^x}^{(2)}(\theta ^*)}\\&= \frac{\widehat{f^x}^{(1)}(\theta )-f^{x^{(1)}}(\theta )}{f^{x^{(2)}}(\theta )}\\&\quad +\frac{\widehat{f^x}^{(1)}(\theta )-f^{x^{(1)}}(\theta )}{f^{x^{(2)}}(\theta )} \left[ \frac{f^{x^{(2)}}(\theta )-\widehat{f^x}^{(2)}(\theta ^*)}{\widehat{f^x}^{(2)}(\theta ^*)}\right] \\&= A+AB \end{aligned}$$
Theorem 1, Lemma 7 and \(H_1\) imply that \(\widehat{f}^{x^{(2)}}(\theta ^*)\rightarrow f^{x^{(2)}}(\theta )\) a.co. (see for example Ferraty et al. 2005). By Minkowski’s inequality, we have
$$\begin{aligned} \left\| \theta -\widehat{\theta }\right\| _{2r}\le \left\| A\right\| _{2r}+\left\| AB\right\| _{2r}. \end{aligned}$$
Then, to study the \(2r\)-mean consistency of \(\widehat{\theta }\), it suffices to focus on the term \(\displaystyle \left\| A\right\| _{2r}=\frac{1}{f^{x^{(2)}}(\theta )} \left\| \widehat{f^x}^{(1)}(\theta )-f^{x^{(1)}}(\theta )\right\| _{2r}\). We recall the notation
$$\begin{aligned}&K_{\mathbf{i}}=K\left( \frac{x-X_{\mathbf{i}}}{h_K}\right) ,\; H_{\mathbf{i}}(y)=h_H^{-1}H\left( \frac{y-Y_{\mathbf{i}}}{h_H}\right) ,\\&W_{\mathbf{ni}}=W_{\mathbf{ni}}(x)=\frac{K_{\mathbf{i}}}{\sum _{\mathbf{j}\in \mathcal I _{\mathbf{n}}}K_{\mathbf{j}}} \end{aligned}$$
and notice that, if we adopt the convention \(0/0=0\), then
$$\begin{aligned} \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}= 0 \quad \hbox {or}\,1. \end{aligned}$$
So, we can write:
$$\begin{aligned} \widehat{f^x}^{(1)}(y)=\left( \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}H^{(1)}_{\mathbf{i}}(y)\right) 1\!\!1_{\left[ \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}=1\right] }, \end{aligned}$$
where \(1\!\!1_{[.]}\) is the indicator function. Thus, we have
$$\begin{aligned}&\widehat{f^x}^{(1)}(\theta )-f^{x^{(1)}}(\theta ) =\left( \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( H^{(1)}_{\mathbf{i}}(\theta )-E(H^{(1)}_{\mathbf{i}}(\theta ) |X_{\mathbf{i}})\right) \right) 1\!\!1_{\left[ \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}=1\right] }\\&\quad \qquad \qquad \qquad \qquad \qquad \quad +\left( \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( E(H^{(1)}_{\mathbf{i}}(\theta ) |X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right) \right) 1\!\!1_{\left[ \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}=1\right] }. \end{aligned}$$
Then,
$$\begin{aligned} \left| \widehat{f^x}^{(1)}(\theta )-f^{x^{(1)}}(\theta )\right|&\le \left| \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( H^{(1)}_{\mathbf{i}}(\theta )-E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})\right) \right| \\&+\left| \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right) \right| . \end{aligned}$$
Applying Minkowski’s inequality, we get
$$\begin{aligned}&E^{1/2r}\left( \widehat{f}^{x^{(1)}}(\theta )-f^{x^{(1)}} (\theta )\right) ^{2r} \le E^{1/2r} \left( \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( H^{(1)}_{\mathbf{i}}(\theta )-E(H^{(1)}_{\mathbf{i}}(\theta )| X_{\mathbf{i}})\right) \right) ^{2r}\\&\quad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad +E^{1/2r}\left( \displaystyle \sum _{\mathbf{i}\in \mathcal I _{\mathbf{n}}}W_{\mathbf{ni}}\left( E(H^{(1)}_{\mathbf{i}}(\theta )|X_{\mathbf{i}})-f^{x^{(1)}}(\theta )\right) \right) ^{2r}\!\!. \end{aligned}$$
The two terms on the right-hand side of this last inequality are treated in Lemmas 9 and 8 respectively, above. This completes the proof of Theorem 2. \(\square \)