Abstract
The multiscale local polynomial transform, developed in this paper, combines the benefits of local polynomial smoothing with sparse multiscale decompositions. The contribution of the paper is twofold. First, it focuses on the bandwidths used throughout the transform. These bandwidths operate as user-controlled scales in a multiscale analysis, which is of particular interest in the case of nonequispaced data. The paper presents both a likelihood-based optimal bandwidth selection and a fast, heuristic approach. The second contribution is the combination of local polynomial smoothing with orthogonal prefilters, similar to Daubechies' wavelet filters, but defined on irregularly spaced covariate values.







Acknowledgments
Research support by the IAP research network Grant Nr. P7/06 of the Belgian government (Belgian Science Policy) is gratefully acknowledged.
Appendices
Appendix 1: Construction of prefilters using elementary orthogonal matrices
This section develops the details of the construction of prefilters using the form in (25). We assume that appropriate choices of \(\gamma _j\) and \(\widetilde{{\mathbf {J}}}_j\) allow us to define a real matrix \(\widetilde{{\mathbf {S}}}_j\) in (24). In particular, as in Lemma 3, the subsampling operation \(\widetilde{{\mathbf {J}}}_j\) leaves out not only the odd indices from \({\varvec{s}}_{j+1}\), but also \({\widetilde{p}}-(2n_j-n_{j+1})\) even indices near the boundary. Near the boundary, the filtering is replaced by simple subsampling, meaning for instance that \(s_{j,0} = s_{j+1,0}\), thus allowing us to focus on the interior. Obviously, alternative solutions may filter near the boundary, at the price of slightly less variance reduction in the interior.
The initial step of the iteration creates the matrix \( \widetilde{{\mathbf {U}}}_j^{[0]} = \widetilde{{\mathbf {S}}}_j \widetilde{{\mathbf {Q}}}_j^{[0]} - \widetilde{{\mathbf {J}}}_j\widetilde{{\mathbf {V}}}_j^T. \) As the primary goal of this step is to map the \(n_j \times n_j\) matrix \(\widetilde{{\mathbf {S}}}_j\) onto the \(n_j \times (n_{j+1}-{\widetilde{p}})\) matrix \( \widetilde{{\mathbf {S}}}_j^{[0]} = \widetilde{{\mathbf {S}}}_j \widetilde{{\mathbf {Q}}}_j^{[0]}, \) the matrix \(\widetilde{{\mathbf {Q}}}_j^{[0]}\) is chosen to be a possibly row-permuted submatrix of the \((n_{j+1}-{\widetilde{p}}) \times (n_{j+1}-{\widetilde{p}})\) identity matrix. The resulting matrix \(\widetilde{{\mathbf {S}}}_j^{[0]}\) contains the columns of \(\widetilde{{\mathbf {S}}}_j\), completed with zero columns. Since \(\Vert {\mathbf {A}}-{\mathbf {B}} \Vert _F^2 = \Vert {\mathbf {A}} \Vert _F^2+\Vert {\mathbf {B}} \Vert _F^2-2\,\mathrm {Tr}({\mathbf {A}}^T{\mathbf {B}})\), minimising the norm of \(\widetilde{{\mathbf {U}}}_j^{[0]}\) amounts to maximising the trace term, which can be done by looking at the maximum values in each row of \( \widetilde{{\mathbf {S}}}_j^T \left( \widetilde{{\mathbf {J}}}_j\widetilde{{\mathbf {V}}}_j^T\right) . \)
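A minimal sketch of this initial step, with illustrative names (S for \(\widetilde{{\mathbf {S}}}_j\), T for \(\widetilde{{\mathbf {J}}}_j\widetilde{{\mathbf {V}}}_j^T\)); it solves the trace maximisation exactly as a linear assignment problem, of which the row-wise maxima above are a greedy approximation:

import numpy as np
from scipy.optimize import linear_sum_assignment

def initial_placement(S, T):
    # S: (n, n) matrix S~_j; T: (n, m) target J~_j V~_j^T with m >= n.
    # Returns Q0, a row-permuted submatrix of the m x m identity, such
    # that S @ Q0 holds the columns of S, completed with zero columns,
    # and ||S @ Q0 - T||_F is minimal over all such placements.
    n, m = T.shape
    M = S.T @ T                          # row i: inner products of column i of S with T
    rows, cols = linear_sum_assignment(M, maximize=True)
    Q0 = np.zeros((n, m))
    Q0[rows, cols] = 1.0                 # column cols[i] of S @ Q0 equals column rows[i] of S
    return Q0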
The next matrices \(\widetilde{{\mathbf {Q}}}_j^{[i]}\) consist of products of elementary Givens rotations or reflections. More precisely, \( \widetilde{{\mathbf {Q}}}_j^{[i]} = \prod _{\ell } \widetilde{{\mathbf {Q}}}_j^{[i,\ell ]}, \)
where the elements of \(\widetilde{{\mathbf {Q}}}_j^{[i,\ell ]}\) coincide with those of \({\mathbf {I}}_j\), except in the \(2 \times 2\) submatrix at rows \((r_1,r_2)\) and columns \((r_1,r_2)\), where the couple \((r_1,r_2)\) depends on \(\ell \). This submatrix is equal to \( \begin{pmatrix} \cos \alpha _\ell & -(-1)^{b_\ell }\sin \alpha _\ell \\ \sin \alpha _\ell & (-1)^{b_\ell }\cos \alpha _\ell \end{pmatrix}, \)
where the binary \(b_\ell \) and the real \(\alpha _\ell \in [0,2\pi ]\) are chosen to minimize the Frobenius norm of the outcome \(\widetilde{{\mathbf {U}}}_j^{[i,\ell ]} = \widetilde{{\mathbf {S}}}_j^{[i,\ell -1]}\widetilde{{\mathbf {Q}}}_j^{[i,\ell ]} -\widetilde{{\mathbf {J}}}_j\widetilde{{\mathbf {V}}}_j^T\). Since right multiplication with \(\widetilde{{\mathbf {Q}}}_j^{[i,\ell ]}\) affects columns \(r_1\) and \(r_2\) only, the optimal values of \(b_\ell \) and \(\alpha _\ell \) are easy to find. A slight modification consists in weighting the elements of the matrix in the calculation of the Frobenius norm. More precisely, by putting zero weights on the elements near the diagonal of \(\widetilde{{\mathbf {U}}}_j^{[i,\ell ]}\), this matrix and also \(\widetilde{{\mathbf {F}}}_j^{[i,\ell ]}\) are pushed towards diagonal dominance.
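For a given column pair \((r_1,r_2)\), finding the optimal elementary factor is a two-column orthogonal Procrustes problem. A minimal sketch of one such update, replacing the explicit search over \((b_\ell ,\alpha _\ell )\) by the equivalent closed-form \(2 \times 2\) SVD (unweighted Frobenius norm only; the weighted variant would require a one-dimensional search over \(\alpha _\ell \)):

import numpy as np

def best_elementary_factor(S, T, r1, r2):
    # Finds the 2x2 orthogonal matrix R (a rotation if det R = 1, a
    # reflection if det R = -1) acting on columns r1, r2 of S that
    # minimises ||S Q - T||_F; the other columns are unaffected.
    A = S[:, [r1, r2]]
    B = T[:, [r1, r2]]
    U, _, Vt = np.linalg.svd(A.T @ B)    # 2x2 SVD
    R = U @ Vt                           # best orthogonal factor (Procrustes)
    S_new = S.copy()
    S_new[:, [r1, r2]] = A @ R
    return S_new, R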
Appendix 2: Proof of Lemma 2
First, the polynomial reproducing condition (17) represents \(n_j{\widetilde{p}}\) linear equations, independently of the number of zero elements in \(\widetilde{{\mathbf {F}}}_j\). Second, the diagonal of (16) adds \(n_j\) non-linear, non-homogeneous equations. The homogeneous equations corresponding to the off-diagonal elements express orthogonality between rows of \(\widetilde{{\mathbf {F}}}_j\). These equations are inactive if the nonzeros of the rows under consideration do not overlap. Ignoring boundary effects, this amounts to \(\left\lceil (r-l)/2\right\rceil -1\) active equations on each row of (16). Third, from the proof of Lemma 1, it follows that the optimal variance reduction in (16) implies that all columns of \(\widetilde{{\mathbf {F}}}_j\) add up to the same value, more precisely \( \widetilde{{\mathbf {F}}}_j^T{\varvec{1}}_j = {\varvec{1}}_{j+1} (n_j/n_{j+1}). \) These \(n_{j+1}\) equations for the column sums are linearly independent of the row-sum equations in (17), except for the last one. In total, we have \(n_j{\widetilde{p}}+n_j+n_j(\left\lceil (r-l)/2\right\rceil -1)+n_{j+1}\) conditions, which amounts to at least \({\widetilde{p}}+2+\left\lceil (r-l)/2\right\rceil \) at each row. Therefore, the number of free parameters at each row, \(r-l\), must satisfy \((r-l) \ge {\widetilde{p}}+2+\left\lceil (r-l)/2\right\rceil \), which leads to the stated result. \(\square \)
In the case of equidistant covariate values, the prefilter \(\widetilde{{\mathbf {F}}}_j\) has only \(2{\widetilde{p}}\) nonzeros on each row. This can be understood from the fact that for equidistant covariates, \(\widetilde{{\mathbf {F}}}_j = \widetilde{{\mathbf {J}}}_j\overline{{\mathbf {F}}}_j\), where \(\overline{{\mathbf {F}}}_j\) is a Toeplitz matrix. The proof of Lemma 1 reveals that in the general case \(n_{j+1}-1\) linear equations follow from minimising the output variance, namely that all column sums \(\widetilde{{\mathbf {F}}}_j^T{\varvec{1}}_j\) have the same value. For \(\widetilde{{\mathbf {F}}}_j = \widetilde{{\mathbf {J}}}_j\overline{{\mathbf {F}}}_j\), with \(\overline{{\mathbf {F}}}_j\) Toeplitz, all even column sums are automatically equal to each other, and the same holds for the odd column sums, thereby reducing the \(n_{j+1}-1\) linear equations to a single one. This reduces the number of nonzeros on each row by two. Moreover, the shorter rows overlap less, which yields a further reduction by two elements.
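A small numerical sketch of this Toeplitz argument, using a hypothetical filter on an equidistant grid; the interior even column sums collapse to a single value, and likewise for the odd ones:

import numpy as np
from scipy.linalg import toeplitz

n1 = 16                                       # n_{j+1}
h = np.array([0.1, 0.25, 0.3, 0.25, 0.1])     # hypothetical filter taps
c = np.zeros(n1); c[:h.size] = h              # first column of Fbar_j
r = np.zeros(n1); r[0] = h[0]                 # first row of Fbar_j
Fbar = toeplitz(c, r)                         # banded Toeplitz matrix
F = Fbar[::2, :]                              # F~_j = J~_j Fbar_j: keep even rows
sums = F.sum(axis=0)
print(np.unique(np.round(sums[4:12:2], 12)))  # interior even columns: one value
print(np.unique(np.round(sums[5:12:2], 12)))  # interior odd columns: one value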
Appendix 3: Proof of Lemma 3
In this proof, \(\widetilde{{\mathbf {J}}}_j\) denotes a partitioning operation, which can be more general than the even-odd subsampling.
Let \(\widetilde{{\mathbf {X}}}_{j+1}^{(2)}\) be the orthonormalised columns of \({\mathbf {X}}_{j+1}^{(2)}\), i.e., \((\widetilde{{\mathbf {X}}}_{j+1}^{(2)})_{i,1} = n_{j+1}^{-1/2}\) and \((\widetilde{{\mathbf {X}}}_{j+1}^{(2)})_{i,2} = n_{j+1}^{-1/2} (x_{j+1,i} - \overline{{\varvec{x}}}_{j+1})/ (\overline{{\varvec{x}}^2}_{j+1}-\overline{{\varvec{x}}}_{j+1}^2)^{1/2}\). Then for any vector \({\varvec{c}}\) of length \(n_j\), we have
As \([\widetilde{{\mathbf {X}}}_{j+1}^{(2)} \widetilde{{\mathbf {V}}}_j^T]\) constitutes an orthogonal matrix, we have \( \Vert \widetilde{{\mathbf {X}}}_{j+1}^{(2)T}\widetilde{{\mathbf {J}}}_j^T{\varvec{c}} \Vert _2^2 + \Vert \widetilde{{\mathbf {V}}}_j\widetilde{{\mathbf {J}}}_j^T{\varvec{c}} \Vert _2^2 = \Vert \widetilde{{\mathbf {J}}}_j^T{\varvec{c}} \Vert _2^2 = \Vert {\varvec{c}} \Vert _2^2. \) From this it follows that
This holds for any vector \({\varvec{c}}\) if \( \Vert \widetilde{{\mathbf {X}}}_{j+1}^{(2)T}\widetilde{{\mathbf {J}}}_j^T \Vert _2^2 = \Vert \widetilde{{\mathbf {J}}}_j\widetilde{{\mathbf {X}}}_{j+1}^{(2)} \Vert _2^2 \le \gamma _j. \) By writing \(\rho ({\mathbf {A}})\) for the spectral radius of matrix \({\mathbf {A}}\), the squared matrix norm is
with \(\xi _j\) and \(\zeta _j\) as defined in (26) and (27). The squared matrix norm is equal to the minimum value of \(\gamma _j\) proposed in (28). \(\square \)
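A minimal sketch of this computation, assuming even-index subsampling and following the definition of \(\widetilde{{\mathbf {X}}}_{j+1}^{(2)}\) above; it evaluates the smallest admissible \(\gamma _j\) directly as the squared spectral norm:

import numpy as np

def min_gamma(x, keep):
    # Squared spectral norm of J~_j X~_{j+1}^{(2)}, with X~ the
    # orthonormalised constant and linear columns of the design, and
    # keep the row indices retained by the subsampling J~_j.
    n = x.size
    X = np.empty((n, 2))
    X[:, 0] = n ** -0.5
    X[:, 1] = n ** -0.5 * (x - x.mean()) / np.sqrt(np.mean(x**2) - x.mean()**2)
    JX = X[keep, :]                             # subsampled rows
    return np.linalg.eigvalsh(JX.T @ JX).max()  # largest eigenvalue of (JX)^T (JX)

x = np.sort(np.random.rand(64))                 # nonequispaced covariates
print(min_gamma(x, np.arange(0, 64, 2)))        # even-odd subsampling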
Appendix 4: Kullback–Leibler divergence of the approximate mixture model in Sect. 4.3
The error term in (37) consists of two terms \( R(p_j,a_j,\tau _j,\sigma _{j,k}) = R_0(p_j,a_j,\tau _j,\sigma _{j,k},\lambda _j) + R_1(p_j,a_j,\tau _j,\sigma _{j,k},\lambda _j). \) These terms are defined as
Since
we can rewrite
where the following definition of \(R_{00}\) uses the shorthand notation \(\epsilon = \sqrt{\sigma ^2+\tau ^2}\),

In this expression, we can bound the second factor of the second term by
provided that \(P(|{\widetilde{D}}|\le d|M=1) \le P(|{\widetilde{D}}|\le d|M=0)\) for any positive d, which is true for \(a\epsilon \) sufficiently close to zero. In the first factor of the second term, we use that \(g(u;a,\sigma )/\phi _\epsilon (u)\) is symmetric on \([-\lambda \epsilon ,\lambda \epsilon ]\) and non-decreasing on \([0,\lambda \epsilon ]\),

This factor is bounded below by \(-p/(1-p)\) and above by \({\mathcal {O}}(\lambda ^2)\), while the values of a, \(\sigma \), and \(\epsilon \) have no impact on these bounds.
In the first term we have, for p small,

This is further rewritten and bounded as

Altogether, we have, for \(\lambda \rightarrow \infty \) and \(p \rightarrow 0\),
For \(R_1(p_j,a_j,\tau _j,\sigma _{j,k},\lambda _j)\), we use that \( {\widetilde{p}}_\lambda = P({\widetilde{M}}=1|{\varvec{X}}) = p \, 2\left[ 1-G(\lambda \epsilon ;a,\sigma )\right] + (1-p) \, 2\left[ 1-{\varPhi }(\lambda )\right] . \) For \(\lambda \epsilon \rightarrow \infty \) and for \(a \rightarrow 0\), so that \(a\lambda \epsilon \rightarrow 0\), it can be verified that \(2\left[ 1-G(\lambda \epsilon ;a,\sigma )\right] \rightarrow 1\), and thus \({\widetilde{p}}_\lambda /p \rightarrow 1\). The condition \(a\lambda \epsilon \rightarrow 0\) allows \(\lambda \rightarrow \infty \), but not too fast; otherwise, the threshold would take away too many large coefficients. Furthermore, as

this term is bounded from below and from above by
where \(r_1(d) = g(d;a,\sigma )/{\widetilde{f}}_1(d;a,\lambda \epsilon )\) and \(r_0(d) = \phi (d/\epsilon )/[\epsilon {\widetilde{f}}_1(d;a,\lambda \epsilon )]\). The function \(r_1(d)\) reaches a global minimum at \(d=0\), from where it increases monotonically from \(r_1(0) = 2[1-{\varPhi }(a\sigma )]\exp (\sigma ^2a^2/2)\) towards \(r_1(\pm \infty ) = \exp (\sigma ^2a^2/2)\). On \(]-\infty ,-\lambda \epsilon ]\cup [\lambda \epsilon ,\infty [\), and for \(\lambda > a\epsilon \), the function \(r_0(d)\) decreases monotonically and rapidly from \(r_0(\lambda \epsilon ) = \exp (-\lambda ^2/2)/(\sqrt{\pi /2}\,a\epsilon )\) to 0. The bounds for \(R_1(p,a,\tau ,\sigma ,\lambda )\) are then
from which we conclude that \(R_1(p,a,\tau ,\sigma ,\lambda ) \sim {\widetilde{p}}_\lambda \sim p\), when \(a \rightarrow 0\) and \(\lambda \rightarrow \infty \), so that \(\exp (-\lambda ^2/2)/a \rightarrow 0\). This means that \(\lambda \) should grow sufficiently fast in order to prevent false positives from dominating the error term \(R_1\). \(\square \)
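A small numerical sketch of the behaviour of \({\widetilde{p}}_\lambda /p\) in this regime, taking a plain Laplace density with rate a as a stand-in for \(g(u;a,\sigma )\) (an assumption for illustration only, for which \(2[1-G(\lambda \epsilon ;a)] = \exp (-a\lambda \epsilon )\)):

import numpy as np
from scipy.stats import norm

p, eps = 0.01, 1.0
for lam in [2.0, 4.0, 8.0, 16.0]:
    a = lam ** -2                  # so that a*lam*eps -> 0 and exp(-lam**2/2)/a -> 0
    p_lam = p * np.exp(-a * lam * eps) + (1 - p) * 2 * (1 - norm.cdf(lam))
    print(lam, p_lam / p)          # the ratio tends to 1 as lam grows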