Doubly robust estimation and robust empirical likelihood in generalized linear models with missing responses

Xue, Liugen

doi:10.1007/s11222-023-10347-8

Original Paper
Published: 14 November 2023

Volume 34, article number 39, (2024)

Statistics and Computing

Liugen Xue ORCID: orcid.org/0000-0001-7625-649X¹


Abstract

In this paper, we study doubly robust estimation and robust empirical likelihood for the regression parameter in generalized linear models with missing responses. A doubly robust estimating equation is proposed to estimate the regression parameter, and the resulting estimator is consistent and asymptotically normal, regardless of whether the assumed model contains the true model. A robust empirical log-likelihood ratio statistic for the regression parameter is constructed, and it is shown that the statistic converges weakly to the standard $\chi ^2$ distribution. This result can be used directly to construct confidence regions for the regression parameter. A method for selecting the tuning parameters of the $\psi $-function is also given. Simulation studies demonstrate the robustness of the estimator of the regression parameter and evaluate the performance of the robust empirical likelihood method. A real data example shows that the proposed method is feasible.


References

Arnold, S.F.: The Theory of Linear Models and Multivariate Analysis. Wiley, New York (1981)
Bang, H., Robins, J.M.: Doubly robust estimation in missing data and causal inference models. Biometrics 61, 962–973 (2005)
Cantoni, E., Ronchetti, E.: Robust inference for generalized linear models. J. Am. Stat. Assoc. 96, 1022–1030 (2001)
Kim, J., Basak, J.M., Holtzman, D.M.: The role of apolipoprotein E in Alzheimer’s disease. Neuron 63, 287–303 (2009)
Little, R.J.A., Rubin, D.B.: Statistical Analysis with Missing Data. John Wiley & Sons, New York (1987)
Liu, C.C., Kanekiyo, T., Xu, H.X., Bu, G.J.: Apolipoprotein E and Alzheimer disease: risk, mechanisms, and therapy. Nat. Rev. Neurol. 9, 106–118 (2013)
Lo, S.N., Ronchetti, E.: Robust and accurate inference for generalized linear models. J. Multivar. Anal. 100, 2126–2136 (2009)
Moustaki, I., Victoria-Feser, M.P.: Bounded-influence robust estimation in generalized linear latent variable models. J. Am. Stat. Assoc. 101, 644–653 (2006)
Noh, M., Lee, Y.: Robust modeling for inference from generalized linear model classes. J. Am. Stat. Assoc. 102, 1059–1072 (2007)
Orsini, N., Bellocco, R., Sjolander, A.: Doubly robust estimation in generalized linear models. Stata J. 13, 185–205 (2013)
Owen, A.B.: Empirical likelihood ratio confidence regions. Ann. Stat. 18, 90–120 (1990)
Qin, J., Lawless, J.: Empirical likelihood and general estimating equations (in likelihood and related topics). Ann. Stat. 22, 300–325 (1994)
Robins, J.M., Rotnitzky, A.: Discussion on “Celebrating the new millennium” by Bickel, P.J. and Kwon, J. Stat. Sin. 11, 920–926 (2001)
Varatharajah, Y., Ramanan, V.K., Iyer, R., Vemuri, P.: Predicting short-term MCI-to-AD progression using imaging, CSF, genetic factors, cognitive resilience, and demographics. Sci. Rep. 9(1), 2235 (2019)
Welsh, A.H.: On M-processes and M-estimation. Ann. Stat. 17, 337–361 (1989). [Correction 18, 1500 (1990)]
Xue, D., Xue, L.G., Cheng, W.H.: Empirical likelihood for generalized linear models with missing responses. J. Stat. Plan. Inference 141, 2007–2020 (2011)

Acknowledgements

The author gratefully acknowledges the Associate Editor and one anonymous referee for their helpful comments, which improved the presentation of the manuscript. The research was supported by the National Natural Science Foundation of China (11971001). The data used were obtained from the database of the Alzheimer’s Disease Neuroimaging Initiative and were provided by Dr. Chunling Li of the Hong Kong Polytechnic University.

Author information

Authors and Affiliations

School of Mathematics and Statistics, Henan University, Kaifeng, People’s Republic of China
Liugen Xue

Authors

Liugen Xue

Corresponding author

Correspondence to Liugen Xue.


Appendix. Proofs of Theorems

In this appendix, we prove Theorems 1 and 2. The following Lemmas 1–4 are used in the proofs of these theorems; the proofs of the lemmas are given in the supplementary material.

Lemma 1

Suppose that conditions (C1)–(C7) hold. Then

$$\begin{aligned} \sup _{\beta \in {{{\mathcal {B}}}}_n}\Vert {\widehat{Q}}(\beta ) - Q_n(\beta )\Vert = o_P(n^{-1/2}), \end{aligned}$$

where ${{{\mathcal {B}}}}_n=\{\beta |\,\Vert \beta -\beta _0\Vert \le cn^{-1/2}\}$ for a constant $c>0$, ${\widehat{Q}}(\beta )$ is defined by (2.6),

$$\begin{aligned} Q_n(\beta ) &= \frac{1}{n}\sum _{i=1}^n\frac{\delta _i}{\pi (X_i)}\frac{w(\beta ^TX_i)g'(\beta ^TX_i)}{V^{1/2}(g(\beta ^TX_i))}\\ &\quad \times \psi (r_i(\beta ))\{X_i-m(\beta ^TX_i)\}, \end{aligned}$$

and $r_i(\beta )$ is defined by (2.1).

Lemma 2

Suppose that conditions (C1)–(C7) hold. Then

$$\begin{aligned} \sqrt{n}{\widehat{Q}}(\beta _0) &{\mathop {\longrightarrow }\limits ^{D}}N(0,B(\beta _0)), \\ \sqrt{n}Q_n(\beta _0) &{\mathop {\longrightarrow }\limits ^{D}}N(0,B(\beta _0)), \end{aligned}$$

where ${\widehat{Q}}(\beta _0)$, $Q_n(\beta _0)$ and $B(\beta _0)$ are defined in (2.6), Lemma 1 and Theorem 1, respectively.

Lemma 3

Suppose that conditions (C1)–(C7) hold. Then

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^n{\hat{\eta }}_i(\beta _0){\hat{\eta }}_i^T(\beta _0){\mathop {\longrightarrow }\limits ^{P}}B(\beta _0), \end{aligned}$$

where ${\hat{\eta }}_i(\beta _0)$ and $B(\beta _0)$ are defined in (2.9) and Theorem 1, respectively.

Lemma 4

Suppose that conditions (C1)–(C7) hold. Then

$$\begin{aligned} \max _{1\le i\le n}\Vert {\hat{\eta }}_i(\beta _0)\Vert = o_P(n^{1/2}), \end{aligned}$$

where ${\hat{\eta }}_i(\beta _0)$ is defined in (2.9).

We now return to the proofs of Theorems 1 and 2.

Proof of Theorem 1

We now prove the asymptotic normality of ${{\hat{\beta }}}$. The proof is divided into two steps: step (I) establishes the existence of the estimator ${{\hat{\beta }}}$, and step (II) proves its asymptotic normality.

(I) The existence of the estimator of $\beta _0$. We prove the following fact: Under conditions (C1)–(C8) and with probability one there exists an estimator of $\beta _0$ solving the estimating equation (2.6) in ${{{\mathcal {B}}}}_n^*$, where ${{{\mathcal {B}}}}_n^*=\big \{\beta |\, \Vert \beta -\beta _0\Vert =Mn^{-1/2}\big \}$ for some positive constant M. From Lemma 2 we obtain that uniformly for $\beta \in {{{\mathcal {B}}}}_n^*$,

$$\begin{aligned} {\widehat{Q}}(\beta ) = Q_n(\beta _0) -A(\beta _0)(\beta -\beta _0) + o_P(n^{-1/2}), \end{aligned}$$

(A.1)

where $A(\beta _0)$ is defined in Theorem 1. Therefore, we have, uniformly for $\beta \in {{{\mathcal {B}}}}_n^*$,

$$\begin{aligned} n(\beta -\beta _0)^T{\widehat{Q}}(\beta ) &= n(\beta -\beta _0)^TQ_n(\beta _0) - n(\beta -\beta _0)^TA(\beta _0)\\ &\quad \times (\beta -\beta _0) + o_P(1). \end{aligned}$$

The right-hand side is dominated by the quadratic term, which is of order $M^2$: since $\sqrt{n}\Vert \beta -\beta _0\Vert =M$, we have $|n(\beta -\beta _0)^TQ_n(\beta _0)|=MO_P(1)$, whereas $n(\beta -\beta _0)^TA(\beta _0)(\beta -\beta _0)\sim M^2$. So, for any given $\eta >0$, if $M$ is chosen large enough, it follows that, uniformly for $\beta \in {{{\mathcal {B}}}}_n^*$, $n(\beta -\beta _0)^T{\widehat{Q}}(\beta )<0$ on an event with probability $1-\eta $. Since $\eta $ is arbitrary, the existence of the estimator of $\beta _0$ in ${{{\mathcal {B}}}}_n^*$ can be proved as in the proof of Theorem 5.1 of Welsh (1989). The details are omitted.
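The domination argument in step (I) can be made explicit with a short bound. This is only a sketch: it assumes that the symmetric part of $A(\beta _0)$ is positive definite with smallest eigenvalue $c_A>0$, which is what the claim of order $M^2$ implicitly requires, and the symbols $c_A$ and $K$ are introduced here purely for illustration. By the Cauchy–Schwarz inequality and $\sqrt{n}\Vert \beta -\beta _0\Vert =M$, uniformly for $\beta \in {{{\mathcal {B}}}}_n^*$,

$$\begin{aligned} n(\beta -\beta _0)^T{\widehat{Q}}(\beta ) \le M\,\Vert \sqrt{n}Q_n(\beta _0)\Vert - c_AM^2 + o_P(1). \end{aligned}$$

Because $\sqrt{n}Q_n(\beta _0)=O_P(1)$ by Lemma 2, for any $\eta >0$ there is a constant $K$ with $P(\Vert \sqrt{n}Q_n(\beta _0)\Vert \le K)\ge 1-\eta /2$ for all large $n$. Choosing $M>K/c_A$ gives $MK-c_AM^2<0$, so the left-hand side is negative on an event of probability at least $1-\eta $ once the $o_P(1)$ term is absorbed.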

(II) The asymptotic normality. From step (I) we find that ${{\hat{\beta }}}$ is a solution in ${{{\mathcal {B}}}}_n^*$ to the equation ${\widehat{Q}}(\beta )=0$, namely ${\widehat{Q}}({{\hat{\beta }}})=0$, where ${\widehat{Q}}(\beta )$ is defined in (2.6). From (A.1) we have

$$\begin{aligned} 0=Q_n(\beta _0) - A(\beta _0)({{\hat{\beta }}}-\beta _0) + o_P(n^{-1/2}), \end{aligned}$$

and hence

$$\begin{aligned} \sqrt{n}({{\hat{\beta }}}-\beta _0) = A^{-1}(\beta _0)\sqrt{n}Q_n(\beta _0) + o_P(1). \end{aligned}$$

(A.2)

Theorem 1 follows from (A.2), Lemma 2 and Slutsky’s theorem. $\square $
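The final step can be spelled out as follows. The statement of Theorem 1 is not reproduced in this appendix, so the display below is a sketch of what (A.2) and Lemma 2 imply, assuming $A(\beta _0)$ is nonsingular, rather than a quotation of the theorem. A fixed linear transformation of an asymptotically normal vector is again asymptotically normal, so

$$\begin{aligned} \sqrt{n}({{\hat{\beta }}}-\beta _0) = A^{-1}(\beta _0)\sqrt{n}Q_n(\beta _0) + o_P(1) {\mathop {\longrightarrow }\limits ^{D}} N\big (0,\,A^{-1}(\beta _0)B(\beta _0)\{A^{-1}(\beta _0)\}^T\big ). \end{aligned}$$

The $o_P(1)$ remainder does not affect the limit by Slutsky’s theorem.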

Proof of Theorem 2

By the Lagrange multiplier method, ${\hat{l}}(\beta _0)$ can be represented as

$$\begin{aligned} {\hat{l}}(\beta _0)=2\sum _{i=1}^n\log \big (1+\lambda ^T{\hat{\eta }}_i(\beta _0)\big ), \end{aligned}$$

(A.3)

where $\lambda =\lambda (\beta _0)$ is a $d\times 1$ vector given as the solution to

$$\begin{aligned} h(\lambda )=\frac{1}{n}\sum _{i=1}^n\frac{{\hat{\eta }}_i(\beta _0)}{1+\lambda ^T{\hat{\eta }}_i(\beta _0)}=0. \end{aligned}$$
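The representation (A.3) and the equation $h(\lambda )=0$ can be illustrated numerically. The following is a minimal sketch, not the author's implementation: it solves $h(\lambda )=0$ by Newton's method with step halving and then evaluates (A.3). The function name `el_log_ratio`, the tolerances, and the simulated array `eta` (a stand-in for the rows ${\hat{\eta }}_i(\beta _0)$, which are defined in (2.9) and not reproduced here) are all assumptions made for illustration.

```python
import numpy as np


def el_log_ratio(eta, max_iter=50, tol=1e-10):
    """Solve h(lam) = 0 by Newton's method and return (l, lam),
    where l = 2 * sum_i log(1 + lam^T eta_i) as in (A.3).

    eta : (n, d) array whose i-th row plays the role of eta_hat_i(beta_0).
    """
    n, d = eta.shape
    lam = np.zeros(d)
    for _ in range(max_iter):
        denom = 1.0 + eta @ lam                    # 1 + lam^T eta_i, shape (n,)
        h = (eta / denom[:, None]).mean(axis=0)    # h(lam)
        if np.linalg.norm(h) < tol:
            break
        # Jacobian of h: -(1/n) * sum_i eta_i eta_i^T / (1 + lam^T eta_i)^2
        jac = -(eta / denom[:, None] ** 2).T @ eta / n
        step = np.linalg.solve(jac, -h)            # Newton direction
        # Halve the step until every 1 + lam^T eta_i stays positive,
        # so that the logarithms in (A.3) are well defined.
        t = 1.0
        while np.any(1.0 + eta @ (lam + t * step) <= 0.0):
            t *= 0.5
        lam = lam + t * step
    return 2.0 * np.sum(np.log1p(eta @ lam)), lam


# Hypothetical data: i.i.d. mean-zero vectors standing in for eta_hat_i(beta_0).
rng = np.random.default_rng(0)
eta = rng.standard_normal((200, 2))
l_hat, lam_hat = el_log_ratio(eta)

# Quadratic form n * mean^T S^{-1} mean, for comparison with l_hat.
mean = eta.mean(axis=0)
S = eta.T @ eta / len(eta)
quad = len(eta) * mean @ np.linalg.solve(S, mean)
print(l_hat, quad)
```

For mean-zero data of this kind the two printed values are expected to be close, which is the finite-sample counterpart of the quadratic approximation derived in the remainder of the proof.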

By Lemmas 2–4, and using the same arguments as are used in the proof of (2.8) in Owen (1990), we can show that

$$\begin{aligned} \lambda =O_P(n^{-1/2}). \end{aligned}$$

(A.4)

Applying a Taylor expansion to (A.3), and invoking Lemmas 2–4 and (A.4), we get

$$\begin{aligned} {\hat{l}}(\beta _0) = 2 \sum _{i=1}^n\Big [\lambda ^T{\hat{\eta }}_i(\beta _0) - \big \{\lambda ^T{\hat{\eta }}_i(\beta _0)\big \}^2/2\Big ] +o_P(1). \end{aligned}$$

(A.5)

Note that $h(\lambda )=0$. It follows that

$$\begin{aligned} 0 &= \sum _{i=1}^n\frac{{\hat{\eta }}_i(\beta _0)}{1+\lambda ^T{\hat{\eta }}_i(\beta _0)} \\ &= \sum _{i=1}^n {\hat{\eta }}_i(\beta _0) - \sum _{i=1}^n{\hat{\eta }}_i(\beta _0){\hat{\eta }}_i^T(\beta _0)\lambda \\ &\quad + \sum _{i=1}^n\frac{{\hat{\eta }}_i(\beta _0)\big \{\lambda ^T{\hat{\eta }}_i(\beta _0)\big \}^2}{1+\lambda ^T{\hat{\eta }}_i(\beta _0)}. \end{aligned}$$

This together with Lemmas 2–4 and (A.4) proves that

$$\begin{aligned} \sum _{i=1}^n \{\lambda ^T{\hat{\eta }}_i(\beta _0)\}^2 = \sum _{i=1}^n\lambda ^T{\hat{\eta }}_i(\beta _0)+o_P(1) \end{aligned}$$

(A.6)

and

$$\begin{aligned} \lambda =\bigg (\sum _{i=1}^n{\hat{\eta }}_i(\beta _0){\hat{\eta }}_i^T(\beta _0) \bigg )^{-1} \sum _{i=1}^n {\hat{\eta }}_i(\beta _0) + o_P\big (n^{-1/2}\big ). \end{aligned}$$

(A.7)

Therefore, from (A.5)–(A.7) we have

$$\begin{aligned} {\hat{l}}(\beta _0) &= \{\sqrt{n}{\widehat{Q}}^{T}(\beta _0)\}\bigg (\frac{1}{n}\sum _{i=1}^n{\hat{\eta }}_i(\beta _0){\hat{\eta }}_i^T(\beta _0) \bigg )^{-1} \\ &\quad \times \{\sqrt{n}{\widehat{Q}}(\beta _0)\} +o_P(1), \end{aligned}$$

(A.8)

where ${\widehat{Q}}(\beta _0)$ is defined in (2.6). Combining (A.8) with Lemmas 2 and 3 and Slutsky’s theorem proves Theorem 2. $\square $
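The passage from (A.5)–(A.7) to (A.8) can be written out step by step. This is a sketch under the assumption that ${\widehat{Q}}(\beta _0)=n^{-1}\sum _{i=1}^n{\hat{\eta }}_i(\beta _0)$, which is what (A.8) implicitly uses; the definitions (2.6) and (2.9) are not reproduced in this appendix. Writing ${\hat{\eta }}_i$ for ${\hat{\eta }}_i(\beta _0)$,

$$\begin{aligned} {\hat{l}}(\beta _0) &= 2\sum _{i=1}^n\lambda ^T{\hat{\eta }}_i - \sum _{i=1}^n\{\lambda ^T{\hat{\eta }}_i\}^2 + o_P(1) \\ &= \sum _{i=1}^n\lambda ^T{\hat{\eta }}_i + o_P(1) \\ &= \bigg (\sum _{i=1}^n{\hat{\eta }}_i\bigg )^T\bigg (\sum _{i=1}^n{\hat{\eta }}_i{\hat{\eta }}_i^T\bigg )^{-1}\bigg (\sum _{i=1}^n{\hat{\eta }}_i\bigg ) + o_P(1) \\ &= \{\sqrt{n}{\widehat{Q}}^{T}(\beta _0)\}\bigg (\frac{1}{n}\sum _{i=1}^n{\hat{\eta }}_i{\hat{\eta }}_i^T\bigg )^{-1}\{\sqrt{n}{\widehat{Q}}(\beta _0)\} + o_P(1). \end{aligned}$$

The first line is (A.5), the second uses (A.6), and the third substitutes (A.7). In the third step the $o_P(n^{-1/2})$ remainder of (A.7) is multiplied by $\sum _{i=1}^n{\hat{\eta }}_i=O_P(n^{1/2})$, which holds by Lemma 2 under the stated assumption, so it contributes only $o_P(1)$.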


About this article

Cite this article

Xue, L. Doubly robust estimation and robust empirical likelihood in generalized linear models with missing responses. Stat Comput 34, 39 (2024). https://doi.org/10.1007/s11222-023-10347-8


Received: 24 May 2022
Accepted: 14 October 2023
Published: 14 November 2023
DOI: https://doi.org/10.1007/s11222-023-10347-8
